Best Practices For Building A Multiple GPU System

Tue Mar 21, 2017 8:04 am

Tutor wrote:
smicha wrote:You are the man, Tutor. I am sorry if I repeat myself

Is my drawing correct so?
Yes - Your drawing is correct. Your risers are the same size as mine, and many of my 90+ risers of that type came from the very same source that you referenced last week.

I'll get more of them soon. For now I need to wait for my custom designed watercooling parts to be produced. 2-3 weeks and I'll show you what I got

Tue Mar 21, 2017 3:09 pm

milanm wrote:Compilation and upload times should not be underestimated. Sometimes it's better to have more machines and licenses. Running two instances of Octane with different GPUs assigned to each on the same machine can result in 2X faster compilation (depending on the CPU ofc.). That also works with C4D for me.

Regards
Milan

Gold! Gold!
Gold-Gold-Gold-Gold-Gooooooooooooooaaaaaaaaaaaaaaald!

Thanks a lot for the explanation of how the SA works vs plugin. I understand what you are saying, and I never thought to do any of that with the subdivision, so these are all very great tips indeed

My rig can bang out frames pretty fast, but any non-GPU element involved is easily the thorn in the process, so I appreciate this all greatly. Nice!

Wed Mar 29, 2017 2:56 pm

Tutor, a philosophical question for you....
Which would you rather have, if 'speed' of production would be equal and cost would remain same:
(1) a revolutionary technology appears created by someone else that makes it completely unnecessary to have rigs (GPUs, cooling, hardware, etc), and you would easily have access to the new technology to such a degree that you could still accomplish all your work, or
(2) you achieve a learned personal breakthrough where you solve your own rig's questions and discover a new potential / ability to run everything (GPU number, power, speed) just as you always wanted, using your same gear

Thu Mar 30, 2017 2:16 am

Notiusweb wrote:Tutor, a philosophical question for you....
Which would you rather have, if 'speed' of production would be equal and cost would remain same:
(1) a revolutionary technology appears created by someone else that makes it completely unnecessary to have rigs (GPUs, cooling, hardware, etc), and you would easily have access to the new technology to such a degree that you could still accomplish all your work, or
(2) you achieve a learned personal breakthrough where you solve your own rig's questions and discover a new potential / ability to run everything (GPU number, power, speed) just as you always wanted, using your same gear

Since you posit that speed and cost of production would remain the same, that makes no. 2 especially more desirable if the time to breakthrough wasn't excessive, given that I'm driven by the need to always learn new things and to apply that knowledge. Moreover, I develop sentimental/familiarity attachments to my gear and I enjoy tweaking my gear to keep it working and to make it faster and more featureful than it was upon my acquisition of it - no matter how old it is - like my self-upgraded, self-repaired and still used Video Toaster 040s, Atari 040s and classic Macs.

Thu Mar 30, 2017 8:06 am

teknofreek wrote:Thanks for the reply Tutor. I've just received the amfeltec switch, so I'll see how that works..
Unfortunately I need to run a Wacom tablet and 3D Connexion spacenav, so maybe they're also taking some IO ( don't really know though).
I might try to remove the MSI 680 lightnings from the main MB. They are not branded ti, but maybe are hogging resources as most of this problem came about when I replaced the 660ti's with those.
Anyway thanks again.

Hi,
Just a little note on riser (USB3 1X) or other splitter and clusters. I have the 3 types :
1) 1X PCIe USB3
2) 4x splitter (4 GPU) (4X PCIe)
3) Dual Cluster amfeltec (2 x 4 GPU) (4X PCIe)

The first works on all 3 of my rigs.
With the second, I have a big problem on my Asus P9X79 Deluxe with Octane V3: it freezes.
The third has given me no problems yet.

About the speed loss since Octane V3 (more communication with the host CPU and RAM), here are some tests from one of my scenes:

Test with a 980Ti:
- in a 16X PCIe slot: 7min16s
- with cluster (only the 980Ti): 9min35s
- with splitter (only the 980Ti): 10min49s
- with USB3 riser connected to 1X PCIe: 8min45s

The scene is 4K, rendered with the PT kernel (parallel samples: 16, max tile: 32, adaptive) in V3.0.6 test4.
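To make the timings above easier to compare, here is a minimal Python sketch that converts them to seconds and expresses each one as a slowdown relative to the 16X slot. The dictionary labels and function names are only illustrative, and the numbers are the four times from this single scene, so the result should not be read as a general benchmark.

```python
# Render times from the post above: (minutes, seconds) for one 980Ti.
# The labels are illustrative names, not Octane terminology.
RENDER_TIMES = {
    "x16 slot": (7, 16),
    "amfeltec cluster": (9, 35),
    "x4 splitter": (10, 49),
    "USB3 riser on x1": (8, 45),
}


def to_seconds(minutes, seconds):
    """Convert a (minutes, seconds) pair to total seconds."""
    return minutes * 60 + seconds


def slowdown_percent(times, baseline):
    """Return how much slower each entry is than the baseline, in percent."""
    base = to_seconds(*times[baseline])
    return {
        name: round((to_seconds(*t) / base - 1) * 100, 1)
        for name, t in times.items()
    }


if __name__ == "__main__":
    for name, pct in slowdown_percent(RENDER_TIMES, "x16 slot").items():
        print(f"{name}: +{pct}%")
```

For this one scene that works out to roughly 20% slower on the 1X USB3 riser, 32% slower on the cluster and 49% slower on the splitter, all compared with the 16X slot.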

Rendering over the network (slave) is better, but I have not tested it yet with the same scene and the 980Ti.

Sun Apr 02, 2017 11:44 pm

itou31 wrote: 2) 4x splitter (4 GPU) (4X PCIe)

With the second, I have a big problem on my Asus P9X79 Deluxe with Octane V3: it freezes.

Hey itou31, have to ask, is this with tonemapping turned off on the splitter cards? Is there a mobo-attached card that can run the tonemapping, or can you spare a USB riser and have one card attached to the Asus P9X79 on a riser (probably won't crash upon using tonemapping...)

Mon Apr 03, 2017 4:57 pm

This was a good explanation from the Lightwave & Houdini plugin developer:

Re: PCI-E Speed?
Postby juanjgon » Sun May 01, 2016 4:17 am
Octane 3 has changed the sample integration and frame buffer architecture, which is now handled in CPU RAM. This makes it possible to render large-resolution images and a lot of passes without filling the GPU VRAM, but the problem is that there is a lot more traffic on the PCIe buses. The PCIe 1x buses are not fast enough, so a 10% to 30% performance hit has been reported when working with GPUs on low-speed PCIe buses.

-Juanjo
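As a rough illustration of why link width matters once the frame buffer lives in CPU RAM, here is a small Python sketch comparing theoretical PCIe transfer times. Everything about the workload is an assumption: a 3840x2160 RGBA float32 buffer copied once in full, with hypothetical helper names. How much data Octane actually moves per sample is not stated in this thread. The per-lane figures are the usual theoretical maxima after encoding overhead, which real links do not fully reach.

```python
# Theoretical usable bandwidth per PCIe lane in MB/s, after encoding overhead.
PER_LANE_MB_S = {1: 250.0, 2: 500.0, 3: 984.6}


def link_bandwidth_mb_s(generation, lanes):
    """Theoretical one-direction bandwidth of a PCIe link in MB/s."""
    return PER_LANE_MB_S[generation] * lanes


def frame_buffer_mb(width, height, channels=4, bytes_per_channel=4):
    """Size of one uncompressed frame buffer in MB (assumed RGBA float32)."""
    return width * height * channels * bytes_per_channel / 1e6


def transfer_ms(size_mb, generation, lanes):
    """Time in milliseconds to move size_mb across the link once."""
    return size_mb / link_bandwidth_mb_s(generation, lanes) * 1000


if __name__ == "__main__":
    size = frame_buffer_mb(3840, 2160)  # assumed meaning of "4K"
    for lanes in (1, 4, 16):
        ms = transfer_ms(size, generation=3, lanes=lanes)
        print(f"PCIe 3.0 x{lanes}: {ms:.1f} ms per full-buffer copy")
```

The only firm conclusion from a sketch like this is the ratio: at the same PCIe generation, a 1x link needs 16 times as long as a 16x link for the same copy. Whether that shows up as a 10% or a 30% hit depends on how often the renderer has to make such transfers.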

Mon Apr 03, 2017 5:41 pm

itou31 wrote: Hi,
Just a little note on riser (USB3 1X) or other splitter and clusters. I have the 3 types :
1) 1X PCIe USB3
2) 4x splitter (4 GPU) (4X PCIe)
3) Dual Cluster amfeltec (2 x 4 GPU) (4X PCIe)

The first works on all 3 of my rigs.
With the second, I have a big problem on my Asus P9X79 Deluxe with Octane V3: it freezes.
The third has given me no problems yet.

About the speed loss since Octane V3 (more communication with the host CPU and RAM), here are some tests from one of my scenes:

Test with a 980Ti:
- in a 16X PCIe slot: 7min16s
- with cluster (only the 980Ti): 9min35s
- with splitter (only the 980Ti): 10min49s
- with USB3 riser connected to 1X PCIe: 8min45s

The scene is 4K, rendered with the PT kernel (parallel samples: 16, max tile: 32, adaptive) in V3.0.6 test4.

Rendering over the network (slave) is better, but I have not tested it yet with the same scene and the 980Ti.

Thanks for the information.

Mon Apr 03, 2017 7:17 pm

Notiusweb wrote:
itou31 wrote: 2) 4x splitter (4 GPU) (4X PCIe)

With the second, I have a big problem on my Asus P9X79 Deluxe with Octane V3: it freezes.
Hey itou31, have to ask, is this with tonemapping turned off on the splitter cards? Is there a mobo-attached card that can run the tonemapping, or can you spare a USB riser and have one card attached to the Asus P9X79 on a riser (probably won't crash upon using tonemapping...)

I don't remember that episode, but I haven't tested the splitter again on my P9X79 with the latest V3.0.6, as that board works well with the cluster and the splitter is now on my MSI socket 1155 board.
And with V3 and above (I think), it's not worth going with 1X PCIe, as you lose 10% to 30%, as I have experienced myself.
And with some scenes, I noticed that adding a 7th GPU (Titan) improves render time only a little, and adding an 8th GPU (also a Titan) doesn't decrease render time at all.
So networking seems better, but it needs one more license.

Tue Apr 04, 2017 1:49 pm

itou31 wrote:
Notiusweb wrote:
itou31 wrote: 2) 4x splitter (4 GPU) (4X PCIe)

With the second, I have a big problem on my Asus P9X79 Deluxe with Octane V3: it freezes.
Hey itou31, have to ask, is this with tonemapping turned off on the splitter cards? Is there a mobo-attached card that can run the tonemapping, or can you spare a USB riser and have one card attached to the Asus P9X79 on a riser (probably won't crash upon using tonemapping...)
I don't remember that episode, but I haven't tested the splitter again on my P9X79 with the latest V3.0.6, as that board works well with the cluster and the splitter is now on my MSI socket 1155 board.
And with V3 and above (I think), it's not worth going with 1X PCIe, as you lose 10% to 30%, as I have experienced myself.
And with some scenes, I noticed that adding a 7th GPU (Titan) improves render time only a little, and adding an 8th GPU (also a Titan) doesn't decrease render time at all.
So networking seems better, but it needs one more license.

Hi itou31, with your 4x splitter (4 GPU) (4X PCIe), how were you able to isolate the speed of the cards if they cause a freeze? Is it that the freeze happens sometimes and not others?

One thing I noticed is that with 'adaptive sampling' turned on, the 16x cards benefit the most. It also stresses the traffic on the 1x links: I looked at DPC latency while running adaptive sampling, and it does cause more delayed calls in the Nvidia driver.