Best Practices For Building A Multiple GPU System

Wed Sep 23, 2015 7:07 am

Morning coffee with points 1 to 10 was a joy!

Sat Sep 26, 2015 1:03 am

smicha wrote:Morning coffee with points 1 to 10 was a joy!

Time for lots more coffee - Caffeine For Thought

Here’s one more reason to consider using servers, especially dual (or higher) CPU versions, for building massively GPU populated systems.

On the one hand, the more processors a server supports, the more system memory it’ll support, typically for OS and application operations. On the other hand, IOMem operations use system RAM (during boot and afterwards), and each GPU in a system must have its required amount of that RAM to operate properly. That's why GPU fanatics like me need "Above 4G" functionality.

With the advent of the Pascals, Nvidia is expected to reach 32G GPU territory. Moreover, there are indications that, at least, the Titan X (a 12G Maxwell)*/ uses almost twice the IOMem space of certain Kepler cards.

I just received a postcard from Magma, advertising their ExpressBox 3600 External Chassis. Reviewing the FAQ for the ExpressBox 3600-10, here’s the opener for the first of their “Frequently Asked Questions” [ http://magma.com/products/pcie-expansio ... sbox-3600/ ] :

“What is the memory requirement for a certain host computer when using the EB3600 unit running GPUs?

It is recommended that systems be configured with a minimum of 8 GB of system memory, with a suggestion of 16 or 32 GB is appropriate for most of today’s applications and operating systems.
GPUs require additional system memory to support memory copy operations, and the system memory should add to the base memory configuration, whatever BAR (Base Address Register) space memory that each GPU requires. An NVIDIA K40, for example, requires 16 GB of BAR1 space and if a base system is configured with 16 GB of system memory, the user should add 16 GB (32 GB total) of system memory PER K40 added through expansion. If the user is configuring 4 K40s, for instance, this would require 64GB additional system memory above and beyond what the base operation system and application requires. For 4 K40s and base system memory of 16GB, this would be 80Gb of total system memory to support the base application as well as the GPU memory requirement.
NOTE: Always check the minimum system requirement of the GPU. This information is available from the vendors / manufacturers website.” {Emphasis added.}

A Tesla K40 is a single processor 12G card, and yet it requires 16G of system memory (per card) to operate. Nine of them would require 144G (9 * 16 = 144) of system RAM for GPU operations alone. What might this mean for the future? How much RAM will a system loaded with twelve top of the line 32G Pascals require just for their operations? Put another way, how far above 4G will the norm have to reach for systems massively populated with top of the line Pascals, not to mention their successors, the Voltas?
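For anyone who wants to rerun that arithmetic for their own build, here is a minimal Python sketch of the rule of thumb from the Magma FAQ quoted above: total system RAM = base RAM + (number of GPUs * BAR space per GPU). The function names are mine, and the 32G per-card figure for Pascal is purely an assumption for illustration, since no BAR requirement for Pascal has been published.

```python
def gpu_bar_memory_gb(num_gpus, bar_gb_per_gpu):
    """System RAM (in GB) needed just to back the GPUs' BAR space."""
    return num_gpus * bar_gb_per_gpu


def total_system_memory_gb(base_gb, num_gpus, bar_gb_per_gpu):
    """Base RAM for the OS and applications plus the per-GPU BAR requirement."""
    return base_gb + gpu_bar_memory_gb(num_gpus, bar_gb_per_gpu)


# Figure taken from the Magma FAQ quoted above: 16 GB of BAR1 space per K40.
K40_BAR_GB = 16

# ASSUMPTION, not a published figure: a 32G Pascal needing 32 GB per card.
ASSUMED_PASCAL_BAR_GB = 32

# Magma's own example: 4 K40s on a 16 GB base system.
print(total_system_memory_gb(16, 4, K40_BAR_GB))

# Nine K40s, GPU operations alone.
print(gpu_bar_memory_gb(9, K40_BAR_GB))

# Twelve hypothetical Pascals, GPU operations alone (uses the assumed figure).
print(gpu_bar_memory_gb(12, ASSUMED_PASCAL_BAR_GB))
```

The first two results match the 80G and 144G figures above. The third is only as good as the assumed per-card number, so swap in the real requirement from the vendor once it is known.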

*/ I suspect, but haven't cared to confirm, that the GTX 780 Ti is also an IOMem hog, but certainly not on any scale approaching that of the Tesla K40.

Sat Sep 26, 2015 6:03 am

Fascinating, Tutor. Thank you so much for sharing.

Wed Sep 30, 2015 12:39 pm

@tutor... works perfect on my system... old 32 bit bios with the x8 supermicro board

Wed Sep 30, 2015 4:17 pm

thunderbolt1978 wrote:@tutor... works perfect on my system... old 32 bit bios with the x8 supermicro board

It's a bird. No, it's a plane. No, it's Thunderbolt1978 - a true SupermicroMan.

Thunderbolt1978,

Keep the information flowing here. Please make me stay after class often - the older that I get, the more that I love learning. Thanks for your pic showing that your x8 Supermicro motherboard with an "old 32 bit bios" and 48Gs of ram can handle many Tesla K40s. For me, this confirms the flexibility and superiority of Supermicro motherboards for building a multiple/massively populated GPU system over purchasing a special-purpose, far more expensive, proprietary 3d rendering box.

Once upon a time, I used to believe that Supermicro motherboards were arcane, industrial, ugly, stodgy, featureless, overpriced motherboards. Then, for some reason (circa 2012) that now escapes me, I visited Supermicro's website and Googled the real-world applications to which Supermicro motherboards were being put. My opinion of Supermicro then began its 180 degree turn. Now, I'm also a SupermicroMan. Boy! Do those mansions, mountains and seemingly endless roads look tiny now. Perspective matters much, but knowledge matters the most.

Thu Oct 01, 2015 8:18 pm

Prospects Of Pascal's Having A Monumental Impact On Multiple GPU 3d Rendering Systems

Nvidia’s prediction that Pascal will be 10X faster than Maxwell is predicated on a doubling of performance by NVLink. Since NVLink will not likely be active in the GTX lineup */, Pascal GTX users aren’t likely to see anywhere near a 10X speedup. Hopefully, CUDA based applications will see at least a 2X speedup using GTX cards and possibly a 3-4X speedup, depending on the application’s ability to take advantage of mixed precision computing and the 3d memory features of Pascal.

*/Resources:
http://www.techradar.com/us/news/comput ... t--1226772

http://www.techradar.com/us/news/comput ... ly-1237060

http://www.techradar.com/us/news/comput ... 2x-1237272

http://www.3dcenter.org/news/nvidias-gp ... -testphase

"confirmed: NVLink feature for coupling of GPUs with CPUs in supercomputing applications (not active in Gamer variants)"

http://www.3dcenter.org/news/nvidia-pascal

Thu Oct 01, 2015 8:44 pm

... and something new
http://www.t3.com/news/nvidia-is-about- ... phics-card

Sat Oct 03, 2015 6:34 am

smicha wrote:... and something new
http://www.t3.com/news/nvidia-is-about- ... phics-card

Thanks, Smicha, for the update.

Looks sort of like the recent past replayed. Titan Z wasn't quite the performance equal of two Titan Blacks. Moreover, Titan Z was ridiculously priced initially. I hope that this initial pricing part isn't replayed. Whatever Nvidia has decided (or will decide) to call this new dual GPU monster, hopefully, because of improvements in efficiency per watt, it'll be closer to the equal of 2xTitan Xs in performance. For the near term at least, though, putting two GPU processors on the same board will create more heat than one GPU processor of the exact same kind, so an exact doubling of performance is not likely to be realized. When this new monster goes on sale, we won't know for sure how it'll stack up in rendering performance against a comparably priced Pascal (if there is a comparably priced Pascal with comparable performance). There is a risk that in about 6 months, a Pascal will drop that is faster and costs less than this new monster. If this introduction pattern holds for the Pascal, then about this time next year we'll be faced with the same conundrum.

Also, Nvidia appears to be running out of letters at the end of the alphabet. If one considers that "Z" is the last one and "X" precedes "Z" (but skips "Y"), will this new monster be called Titan "Y", as in why would one buy it at this time except for extreme need or status display? Or will it be called the Titan "W", as in "What the ---- is Nvidia doing?" Then again, it could be intended to drive another nail into AMD's coffin, so why not call it Titan A.

Sat Oct 03, 2015 8:22 am

Tutor,

These are exactly my thoughts

Naming - was thinking about XZ

Heat - when I did 2xZs on water for Piotrek (build log here viewtopic.php?f=9&t=44831&p=241306&hili ... +Z#p241306) the waterblocks were/are indeed much hotter and temps for the GPUs are around 50C during summer. Their performance is the same as for 2 780Ti, but on water only.

PS. Putting my hands into something fancy

viewtopic.php?f=11&t=50540

Sat Oct 03, 2015 4:02 pm

smicha wrote:Tutor, ...
PS. Putting my hands into something fancy
viewtopic.php?f=11&t=50540

I liii*****ke it.