Best Practices For Building A Multiple GPU System

smicha
Licensed Customer
Posts: 3151
Joined: Wed Sep 21, 2011 4:13 pm
Location: Warsaw, Poland

Morning coffee with points 1 to 10 was a joy!
3090, Titan, Quadro, Xeon Scalable Supermicro, 768GB RAM; Sketchup Pro, Classical Architecture.
Custom alloy powder coated laser cut cases, Autodesk metal-sheet 3D modelling.
build-log http://render.otoy.com/forum/viewtopic.php?f=9&t=42540
Tutor
Licensed Customer
Posts: 531
Joined: Tue Nov 20, 2012 2:57 pm
Location: Suburb of Birmingham, AL - Home of the Birmingham Civil Rights Institute

smicha wrote: Morning coffee with points 1 to 10 was a joy!

Time for lots more coffee - Caffeine For Thought

Here’s one more reason to consider using servers, especially dual (or higher) CPU versions, for building massively GPU populated systems.

On the one hand, the more processors a server supports, the more system memory it’ll support, typically for OS and application operations. On the other hand, IOMem operations use system RAM (during boot and afterwards), and each GPU in a system must have its required amount of that RAM to operate properly. That's why GPU fanatics like me need "Above 4G" functionality.

With the advent of the Pascals, Nvidia is expected to reach 32G GPU territory. Moreover, there are indications that, at least, the Titan X (a 12G Maxwell)*/ uses almost twice the IOMem space of certain Kepler cards.

I just received a postcard from Magma, advertising their ExpressBox 3600 External Chassis. Reviewing the FAQ for the ExpressBox 3600-10, here’s the opener for the first of their “Frequently Asked Questions” [ http://magma.com/products/pcie-expansio ... sbox-3600/ ] :

“What is the memory requirement for a certain host computer when using the EB3600 unit running GPUs?

It is recommended that systems be configured with a minimum of 8 GB of system memory, with a suggestion of 16 or 32 GB is appropriate for most of today’s applications and operating systems.
GPUs require additional system memory to support memory copy operations, and the system memory should add to the base memory configuration, whatever BAR (Base Address Register) space memory that each GPU requires. An NVIDIA K40, for example, requires 16 GB of BAR1 space and if a base system is configured with 16 GB of system memory, the user should add 16 GB (32 GB total) of system memory PER K40 added through expansion. If the user is configuring 4 K40s, for instance, this would require 64GB additional system memory above and beyond what the base operation system and application requires. For 4 K40s and base system memory of 16GB, this would be 80Gb of total system memory to support the base application as well as the GPU memory requirement.
NOTE: Always check the minimum system requirement of the GPU. This information is available from the vendors / manufacturers website.” {Emphasis added.}


A Tesla K40 is a single processor 12G card and yet it requires 16G of system memory (per card) to operate. Nine of them would require 144G (9 * 16 = 144) of system RAM for GPU operations alone. What might this mean for the future? How much RAM will a system loaded with twelve top of the line 32G Pascals require just for their operations? Put another way, how far above 4G will the norm have to go for systems massively populated with top of the line Pascals, not to mention their successors, the Voltas?
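
The arithmetic above can be sketched in a few lines of Python. This is only a rough illustration: the function name and its parameters are hypothetical, and the 16G per K40 figure is taken from the Magma FAQ quoted above, not from any measurement. BAR1 needs for other cards (including future Pascals) would have to be checked with the vendor.

```python
def total_system_ram_gb(base_gb, gpu_count, bar1_gb_per_gpu):
    """Base RAM for OS and applications plus BAR1 space for every GPU, in GB.

    Hypothetical helper following the rule of thumb in the Magma FAQ:
    total = base memory + (number of GPUs * BAR1 space per GPU).
    """
    if base_gb < 0 or gpu_count < 0 or bar1_gb_per_gpu < 0:
        raise ValueError("inputs must be non-negative")
    return base_gb + gpu_count * bar1_gb_per_gpu


K40_BAR1_GB = 16  # per-card figure quoted in the Magma FAQ (assumption for this sketch)

# FAQ example: a 16G base system plus four K40s
faq_total = total_system_ram_gb(16, 4, K40_BAR1_GB)

# Nine K40s, GPU portion only (no base memory counted)
nine_k40_gpu_only = total_system_ram_gb(0, 9, K40_BAR1_GB)

print(faq_total)          # 80
print(nine_k40_gpu_only)  # 144
```

Swapping in a different card count or per-card BAR1 figure gives a quick first estimate of how much RAM to budget before the OS and applications get their share.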

*/ I suspect, but haven't cared to confirm, that the GTX 780 Ti is also an IOMem hog, but certainly not on any scale approaching that of the Tesla K40.
Because I have 180+ GPU processors in 16 tweaked/multiOS systems - Character limit prevents detailed stats.
smicha

Fascinating, Tutor. Thank you so much for sharing.
thunderbolt1978
Licensed Customer
Posts: 11
Joined: Wed Mar 11, 2015 9:20 am

@tutor... works perfect on my system... old 32 bit bios with the x8 supermicro board
Tutor

thunderbolt1978 wrote: @tutor... works perfect on my system... old 32 bit bios with the x8 supermicro board
It's a bird. No, it's a plane. No, it's Thunderbolt1978 - a true SupermicroMan.

Thunderbolt1978,

Keep the information flowing here. Please make me stay after class often - the older I get, the more I love learning. Thanks for your pic showing that your x8 Supermicro motherboard with an "old 32 bit bios" and 48G of RAM can handle many Tesla K40s. For me, this confirms the flexibility and superiority of Supermicro motherboards for building a multiple/massively populated GPU system over purchasing a special-purpose, far higher priced, proprietary 3D rendering box.

Once upon a time, I believed that Supermicro motherboards were arcane, industrial, ugly, stodgy, featureless and overpriced. Then, for some reason that now escapes me (circa 2012), I visited Supermicro's website and Googled the real-world applications to which Supermicro motherboards were being put. My opinion of Supermicro then began its 180 degree turn. Now, I'm also a SupermicroMan. Boy! Do those mansions, mountains and seemingly endless roads look tiny now. Perspective matters much, but knowledge matters the most.
Tutor

Prospects Of Pascal's Having A Monumental Impact On Multiple GPU 3D Rendering Systems

Nvidia’s prediction that Pascal will be 10X faster than Maxwell is predicated on a doubling of performance by NVLink, and NVLink will not likely be active in the GTX lineup */. Consequently, Pascal GTX users aren’t likely to see anywhere near a 10X speedup. Hopefully, CUDA based applications will see at least a 2X speedup using GTX cards and possibly a 3-4X speedup, depending on the application’s ability to take advantage of mixed precision computing and the 3D memory features of Pascal.


*/Resources:
http://www.techradar.com/us/news/comput ... t--1226772

http://www.techradar.com/us/news/comput ... ly-1237060

http://www.techradar.com/us/news/comput ... 2x-1237272

http://www.3dcenter.org/news/nvidias-gp ... -testphase

"confirmed:  NVLink feature for coupling of GPUs with CPUs in supercomputing applications (not active in Gamer variants)"

http://www.3dcenter.org/news/nvidia-pascal
smicha

Tutor

Thanks, Smicha, for the update.

Looks sort of like the recent past replayed. Titan Z wasn't quite the performance equal of two Titan Blacks. Moreover, Titan Z was ridiculously priced initially. I hope that this initial pricing part isn't replayed. Whatever Nvidia has decided (or will decide) to call this new dual GPU monster, hopefully, because of improvements in efficiency per watt, it'll be closer to the equal of two Titan Xs in performance. For the near term at least, though, putting two GPU processors on the same board will create more heat than one GPU processor of the exact same kind, so an exact doubling of performance is not likely to be realized. When this new monster goes on sale, we won't know for sure how it'll stack up in rendering performance against a comparably priced Pascal (if there is a comparably priced Pascal with comparable performance). There is a risk that in about 6 months, a Pascal will drop that is faster and costs less than this new monster. If this introduction pattern holds for the Pascal, then about this time next year we'll be faced with the same conundrum.

Also, Nvidia appears to be running out of letters at the end of the alphabet. If one considers that "Z" is the last one and "X" precedes "Z" (but skips "Y"), will this new monster be called Titan "Y", as in why would one buy it at this time except for extreme need or status display? Or will it be called the Titan "W", as in "What the ---- is Nvidia doing?" Then again, it could be intended to drive another nail into AMD's coffin, so why not call it Titan A.
smicha

Tutor,

These are exactly my thoughts :) Naming - I was thinking about XZ :) Heat - when I did 2xZs on water for Piotrek (build log here viewtopic.php?f=9&t=44831&p=241306&hili ... +Z#p241306), the waterblocks were/are indeed much hotter and GPU temps are around 50C during summer. Their performance is the same as that of two 780 Tis, but only on water.

PS. Putting my hands into something fancy :)
viewtopic.php?f=11&t=50540
Tutor

smicha wrote: Tutor, ...
PS. Putting my hands into something fancy :)
viewtopic.php?f=11&t=50540
I liii*****ke it.