10 GPUs - 53 sec, 20 GPUs - 43???

coilbook
Licensed Customer
Posts: 3032
Joined: Mon Mar 24, 2014 2:27 pm

Hi, we have two slaves, each holding 10 GPUs.
One has 2080 Tis, the other Titan X Pascals.

10 x 2080 Tis take 53 seconds per frame, but 20 GPUs take 43. Shouldn't it be around 30? I am giving the system almost twice the power.
Lewis
Licensed Customer
Posts: 1102
Joined: Tue Feb 05, 2013 6:30 pm
Location: Croatia
Contact:

You forgot to account for the speed of the GPUs:

1. A 2080 Ti has an OB score of 305-310, so 10 * 307 = 3070 OB total
2. A Titan X Pascal has an OB score of 250, so 10 * 250 = 2500 OB total

So in percentages, 2500 + 23% ≈ 3075, which means your 2080 Ti setup should be 22-23% faster with the same number of GPUs simply because the GPUs themselves are faster (2080 Ti vs Titan X Pascal). Also, the speedup is not exactly linear, especially because your render slave's GPUs start rendering later: the main workstation needs to send all the data over the network, so if you measure the time on a still/first frame there will be a noticeable lag until the slave receives all the data for the first time.
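As a sanity check, here is that arithmetic as a small Python sketch. The per-card scores are the approximate OctaneBench values quoted above; that render speed scales linearly with OB score is the assumption behind the whole comparison.

```python
# OB scores per card, as quoted in this post (approximate).
ob_2080ti = 307       # 2080 Ti: ~305-310
ob_titan_xp = 250     # Titan X Pascal

slave_2080ti = 10 * ob_2080ti    # 3070 OB total
slave_titan = 10 * ob_titan_xp   # 2500 OB total

# Relative speed of the 2080 Ti slave over the Titan X Pascal slave.
print(f"{slave_2080ti / slave_titan - 1:.1%}")   # 22.8% -> the 22-23% above
```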

P.S. What motherboard do you have that is able to run 10 GPUs in one system? Thanks.
--
Lewis
http://www.ram-studio.hr
Skype - lewis3d
ICQ - 7128177

WS AMD TRPro 3955WX, 256GB RAM, Win10, 2 * RTX 4090, 1 * RTX 3090
RS1 i7 9800X, 64GB RAM, Win10, 3 * RTX 3090
RS2 i7 6850K, 64GB RAM, Win10, 2 * RTX 4090
Notiusweb
Licensed Customer
Posts: 1285
Joined: Mon Nov 10, 2014 4:51 am

Lewis is right; I have experienced the same kind of thing. In general it takes more time to feed data to and from more GPUs, so scaling is not linear.
There are diminishing returns after a certain number of cards. You may find it close to linear at 13, 14, 15 cards, and then dropping off at 17, 18, 19, for example...
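A minimal sketch of that diminishing-returns curve, assuming a simple "fixed overhead + scalable work" model. The 30 s / 230 s split below is purely illustrative, chosen so that 10 GPUs land near the 53 s reported at the top of the thread; it is not measured from this hardware.

```python
# Illustrative model: each frame has a fixed serial cost (scene transfer,
# kernel setup, network) plus work that divides across the GPUs, so
# doubling the GPU count only halves the scalable part.
def frame_time(n_gpus, serial_s=30.0, parallel_total_s=230.0):
    return serial_s + parallel_total_s / n_gpus

for n in (10, 20, 40):
    print(n, round(frame_time(n), 1))   # 10 -> 53.0, 20 -> 41.5, 40 -> 35.8
```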

But another giant factor to troubleshoot in a case like this is the hardware arrangement.
Are they on extenders at 1x or 4x, or are they all on the same motherboard at 8x+?
This makes a huge difference as well. So Lewis is especially right on that point too: what are the motherboard and the OS?
Linux and Win 10 do not handle and process multi-GPU data the same way at smaller and smaller time scales.

I have found that re-tweaking the hardware arrangement can definitely lead to different speeds, so yours may not yet be optimized as far as raw speed goes.
(LOL... I always go crazy over a few seconds' difference, because on thousands of frames it adds up!)
Win 10 Pro 64, Xeon E5-2687W v2 (8x 3.40GHz), G.Skill 64 GB DDR3-2400, ASRock X79 Extreme 11
Mobo: 1 Titan RTX, 1 Titan Xp
External: 6 Titan X Pascal, 2 GTX Titan X
Plugs: Enterprise
coilbook
Licensed Customer
Posts: 3032
Joined: Mon Mar 24, 2014 2:27 pm

Thank you all.
We use the Supermicro SYS-4028GR-TRT2.
Also, processing time is only 5 seconds per frame.
glimpse
Licensed Customer
Posts: 3740
Joined: Wed Jan 26, 2011 2:17 pm
Contact:

Good day,

Network speed is important. I would check what you have and whether you can upgrade it (10G is recommended if you go with higher resolutions).

It would also be good to open something like MSI Afterburner and look at how the GPUs are utilized compared to your main machine (I bet you are under-utilizing those cards).

Last but not least, there is this parameter: Minimize Network Traffic (in the Kernel settings under the Sampling tab). This could help the nodes talk more efficiently.
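As a sketch of that utilization check without MSI Afterburner: nvidia-smi ships with the NVIDIA driver and can be polled while a frame renders on the slave. Utilization that stays well below ~95% during the render phase suggests the cards are waiting on data.

```python
import subprocess, time

# Sample GPU utilization once per second for ~10 seconds while rendering.
for _ in range(10):
    out = subprocess.run(
        ["nvidia-smi",
         "--query-gpu=index,utilization.gpu,memory.used",
         "--format=csv,noheader"],
        capture_output=True, text=True,
    ).stdout
    print(out.strip())
    time.sleep(1)
```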
coilbook
Licensed Customer
Posts: 3032
Joined: Mon Mar 24, 2014 2:27 pm

glimpse wrote:
Good day,

Network speed is important. I would check what you have and whether you can upgrade it (10G is recommended if you go with higher resolutions).

It would also be good to open something like MSI Afterburner and look at how the GPUs are utilized compared to your main machine (I bet you are under-utilizing those cards).

Last but not least, there is this parameter: Minimize Network Traffic (in the Kernel settings under the Sampling tab). This could help the nodes talk more efficiently.
Thanks. It is 10 Gb networking, with 16x PCIe to each GPU, and we only render animation at 1080p. So about 3 seconds per frame is lost to data transfer: raw rendering time is 50 sec for 10 GPUs and 40 sec for 20 GPUs.
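For comparison, here is what perfectly linear, OB-weighted scaling would predict from those raw times (scores as in Lewis's post earlier; that throughput tracks OB score is an assumption):

```python
ob_10 = 10 * 307            # 2080 Ti slave alone: ~3070 OB
ob_20 = ob_10 + 10 * 250    # plus the Titan X Pascal slave: ~5570 OB

# If the raw 50 s time scaled perfectly with total OB score:
expected_20 = 50 * ob_10 / ob_20
print(round(expected_20, 1))   # ~27.6 s expected vs ~40 s measured; the gap
                               # is per-frame work that does not shrink with GPUs
```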
coilbook
Licensed Customer
Posts: 3032
Joined: Mon Mar 24, 2014 2:27 pm

I also noticed the denoiser makes rendering way slower (during the actual rendering stage, not the denoising stage). Hopefully this can be addressed in the future.
coilbook
Licensed Customer
Posts: 3032
Joined: Mon Mar 24, 2014 2:27 pm

So I did some tests with 20 GPUs and 10 GPUs, rendering just letters on the screen. Both the 20 and 10 GPU results were 2 seconds per frame, so most of the time went towards sending data and processing. I am not sure how they got Brigade to do 60 frames per second.
pixym
Licensed Customer
Posts: 598
Joined: Thu Jan 21, 2010 4:27 pm
Location: French West Indies

So the most efficient way is to render separate ranges of frames on each machine, without net rendering…
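A minimal sketch of that scheme, splitting a frame range across the machines. The hostnames and the printed command are hypothetical placeholders, not Octane's actual CLI:

```python
# Give each machine its own contiguous frame range, so each scene upload
# happens once per machine instead of being coordinated every frame.
frames = list(range(1, 1001))                    # frames 1-1000
machines = ["slave-2080ti", "slave-titanxp"]     # hypothetical hostnames

chunk = len(frames) // len(machines)
for i, host in enumerate(machines):
    start = frames[i * chunk]
    end = frames[-1] if i == len(machines) - 1 else frames[(i + 1) * chunk - 1]
    print(f"{host}: render frames {start}-{end}")
# slave-2080ti: render frames 1-500
# slave-titanxp: render frames 501-1000
```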
Work Station : MB ASUS X299-Pro/SE - Intel i9 7980XE (2,6ghz 18 cores / 36 threads) - Ram 64GB - RTX4090 + RTX3090 - Win10
Net render : MB Asus Pro WS W790E-SAGE SE - XEON - 128GB - 2 x RTX 3090 - 3 x RTX 2080TI
BorisGoreta
Licensed Customer
Posts: 1413
Joined: Fri Dec 07, 2012 6:45 pm
Contact:

It is pointless to test the difference with frames that render in 2 seconds. Test with frames that render in around a minute and you will surely see the difference.
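Using the same illustrative fixed-overhead model as earlier in the thread makes the point concrete (the 1.8 s overhead figure is an assumption, not a measurement):

```python
def frame_time(n_gpus, serial_s, parallel_total_s):
    # Fixed per-frame overhead plus work that divides across the GPUs.
    return serial_s + parallel_total_s / n_gpus

# ~2 s frames: overhead dominates, so 10 and 20 GPUs look the same.
print(round(frame_time(10, 1.8, 2.0), 2), round(frame_time(20, 1.8, 2.0), 2))     # 2.0 1.9
# ~1 min frames: the scalable part dominates, so the difference is obvious.
print(round(frame_time(10, 1.8, 600.0), 2), round(frame_time(20, 1.8, 600.0), 2)) # 61.8 31.8
```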