karu wrote:
jimho wrote:
Regarding NVLink: it is much more solid than in the last version. I think it is big progress; I'm very happy to see it, and thanks Otoy.
There are 4 RTX 2080 Tis in my workstation. Although both pairs are physically linked by bridges, only one pair is recognized and configured as SLI, since there is a limitation in Nvidia's driver.
I tried quite a few times with a big scene. Each time the linked pair works, but the other two normal GPUs (which are not shown as linked) fail. Only when I uncheck them in the "use priority" box do I get this:
[Attachments: NVlink_test3.jpg, RTX Pool.JPG]
This means the stability of combining linked and unlinked GPUs is still fragile.
I hope the NVLinked pair can work together with other GPUs soon.
Jim
Hi Jim,
When rendering with multiple GPUs, the entire scene data needs to be able to fit on each GPU (or pair of GPUs, if you're using NVLink). So even though you're using NVLink with two of your GPUs, the other two GPUs each still need to hold all the scene data in order to render. (This excludes any data that goes to out-of-core memory, which can be accessed by all GPUs.)
Since GeForce cards no longer support multiple NVLink pairs, this means you will only see a benefit (in terms of preventing render failures) from NVLink if the number of 2080 Tis you're using for rendering is exactly two (because the point at which we start using memory pooling is exactly the point where the non-pooling cards will run out of VRAM and fail). For scenes where you can successfully render without NVLink but using out-of-core memory, you may see a speed benefit by enabling NVLink because those GPUs will be able to use peer memory to replace some of the slower out-of-core access that the other GPUs will have to do.
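To put rough numbers on that, here is a toy sketch of the fit check that effectively plays out per device (my own illustration with made-up sizes, not Octane's actual allocation code):
[code]
# Toy model of the per-device fit check described above.
# Sizes in GB are hypothetical; this is not Octane's actual allocation logic.

SCENE_SIZE = 14.0  # scene data that must be resident on each rendering device

devices = [
    {"name": "2080 Ti #1 + #2 (NVLink)", "vram": 11.0, "pooled": True},
    {"name": "2080 Ti #3",               "vram": 11.0, "pooled": False},
    {"name": "2080 Ti #4",               "vram": 11.0, "pooled": False},
]

for dev in devices:
    # An NVLinked pair pools memory, so it behaves roughly like one
    # device with up to twice the VRAM of a single card.
    capacity = dev["vram"] * (2 if dev["pooled"] else 1)
    status = "OK" if SCENE_SIZE <= capacity else "FAIL (out of VRAM)"
    print(f"{dev['name']}: needs {SCENE_SIZE} GB of {capacity} GB -> {status}")
[/code]
With a hypothetical 14 GB scene, the pooled pair fits (roughly 22 GB combined) while each standalone 11 GB card fails, which matches what you're seeing.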
Unfortunately I don't think there's anything we can do to address the failures, but I hope this explains what you're seeing.
Hi Karu,
Thanks for your response and the explanation. Generally it makes sense to me; just one small question.
According to your explanation of how the multiple GPUs work, when OOC is on they should keep working (slowly, but without failing); that is my understanding.
However, the current situation is:
Even when OOC is on, the non-NVLinked GPUs do not always work. As mentioned, they may only work when I uncheck the "use priority" box for them, and even then they still seem unstable.
My question is: can OOC and NVLink work together stably?
In theory the NVLinked pair is just like one big GPU (with more memory), so the situation might be similar to mixing GPUs of different VRAM sizes.
Consider when we mix GPUs with different VRAM sizes; the situation may be as below:
1) Before OOC: when the scene exceeds the smaller GPU's VRAM, it fails (obviously, that is why we need OOC).
2) With OOC:
2.1) The small GPU still fails while the big one keeps running (just like before); OOC only starts working when the big GPU exceeds its VRAM.
2.2) The small GPU could start using OOC while the big one keeps running from its internal VRAM.
2.2.1) Further to 2.2, when the big GPU also runs out of VRAM, a second mark point can be set to let the big GPU know its OOC starts there; only past this point does the big GPU go OOC.
This means a different mark point is set for each differently sized GPU so they can all utilize system memory; they can share the data without conflicting with each other (see the rough sketch below).
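Roughly, what I imagine is something like this (just pseudocode to illustrate the idea; the names and sizes are made up, and I'm sure the real implementation is far more involved):
[code]
# Rough sketch of the per-device "mark point" idea from 2.2/2.2.1 above.
# Hypothetical sizes; only meant to illustrate the intent, not real Octane code.

SCENE_SIZE = 30.0  # GB of scene data

devices = [
    {"name": "big GPU (NVLink pair)", "vram": 22.0},
    {"name": "small GPU",             "vram": 11.0},
]

for dev in devices:
    # Each device gets its own mark point at its own VRAM capacity:
    # data below the mark stays resident, data past it goes out-of-core.
    mark = min(dev["vram"], SCENE_SIZE)
    ooc = SCENE_SIZE - mark
    print(f"{dev['name']}: {mark} GB resident, {ooc} GB out-of-core "
          f"(mark point at {mark} GB)")
[/code]
So the small GPU would go out-of-core past 11 GB while the big one only past 22 GB, and neither would fail.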
Currently what we see is probably cases 1 and 2.1.
Ideally, could we go with the 2.2 scenario? If 2.2 could work for mixed-size GPUs, the NVLinked pair could act just like the big GPU, and then OOC and NVLink could both work...
Or is there a better possibility that just keeps the GPUs from failing...
I am not a programmer; the above thoughts are just for your reference.
Many thanks,
Jim