Best Practices For Building A Multiple GPU System
- Tutor
- Posts: 531
- Joined: Tue Nov 20, 2012 2:57 pm
- Location: Suburb of Birmingham, AL - Home of the Birmingham Civil Rights Institute
I just got some more GTX 780 6Gs. They're $350 (USD) EVGA refurbs from Newegg. I'll keep you posted on how they work out in my Supermicro SuperServer 8047R-7RFT+ [ http://www.supermicro.com/products/syst ... -7rft_.cfm ], using the Amfeltec splitters. In the pic below, 8x GTX 780 6Gs are ganged together and sitting on the Amfeltec PCIe slots that will be connected to the splitter by 12" flex cables. I'm going to modify my chassis door so that over half of the GPUs sit above a hole in the back of the chassis, within 12" of the PCIe-based splitter card. That'll require my using Mr. Dremel tool to cut the door to allow easy access to the splitter card. The two metal rods that I've run through the holes in the PCIe slot receptacles attached to the base of the GPUs help to stabilize the GPUs, keep them well spaced for better cooling, and act as a support for them. Obviously, I've abandoned for now the Windows chassis concept that I posted about above in my earlier Waste Not, Want Not post. Instead, I'll be using just what I had conceived as the MacPro design: an "L"-shaped rod cut from that ancient chassis. So, in the end, that chassis is still not being wasted, because each of the GPUs is attached by screws to that "L"-shaped part, which also helps stabilize the GPUs, keeps them well spaced, and gives me a way to better transport and move them. I opted to give the inner GPUs a little larger spacing to help keep them from being the warmest. I've always preferred having this case positioned on its side (atop a wooden roller) as shown in the pic. The pic also makes it obvious that the ganged 8-GPU module is taller than the case, and that the inner depth/height of the case is too small to let me install the module entirely within it. But not to worry.
I always think outside of the box, and in this case (pun intended), I'm thinking about 98% (by height) out of the box, and about 25% by GPU card count not even over the box, because the system's four PCIe slots are at the bottom-back of the case (if the case were stood upright; see the pic at Supermicro's URL). So to get the GPUs as close as possible to the splitter cards, I've decided to let the GPU module sit with six of the GPUs over the case, and ever so slightly within it, and two GPUs not even one iota over the case. That way the six cards farthest to the right in the pic will be within 12" of the splitter card, and so will the two cards farthest to the left. I know there's a bit of asymmetry in letting two GPUs hang out to dry in the wind while six get to stay closer to the case, but I'm using my leverage to keep more weight over the case. Moreover, the two splitter cards work best in the system's two PCIe slots that are farthest from the floor (if the case were stood upright). See, there's a method to my madness: centering the GPU module optimally over the splitter cards and preventing tipping. So the sooner the seller gets me those molex-to-floppy-drive cables needed to power the Amfeltec PCIe slots/cards (directly connected to the GPUs), the sooner I can get this party started.
Last edited by Tutor on Thu Apr 30, 2015 11:51 am, edited 2 times in total.
Because I have 180+ GPU processors in 16 tweaked/multiOS systems - Character limit prevents detailed stats.
- Seekerfinder
- Posts: 1600
- Joined: Tue Jan 04, 2011 11:34 am
smicha wrote:My opinion on multi GPU rendering: shame on OTOY that a single licence cannot handle e.g., 16 GPUs over-the-network rendering (16 GPUs is theoretical limit on an 8-slot mobo for 8x dual gpu cards).
I agree. It's a real pity.
Seeker
Win 8(64) | P9X79-E WS | i7-3930K | 32GB | GTX Titan & GTX 780Ti | SketchUP | Revit | Beta tester for Revit & Sketchup plugins for Octane
- Seekerfinder
- Posts: 1600
- Joined: Tue Jan 04, 2011 11:34 am
Hey Tutor,
Some awesome intel there - thanks for sharing! It feels like we're finally starting to define the Octane GPU boundaries.
We should be able to add a bit of our own experience on the I/O side of things once we've completed the first phase of testing on the GPU Turbine - some exciting stuff coming down the line.
Best,
Seeker
Win 8(64) | P9X79-E WS | i7-3930K | 32GB | GTX Titan & GTX 780Ti | SketchUP | Revit | Beta tester for Revit & Sketchup plugins for Octane
- Tutor
- Posts: 531
- Joined: Tue Nov 20, 2012 2:57 pm
- Location: Suburb of Birmingham, AL - Home of the Birmingham Civil Rights Institute
Seekerfinder wrote:
smicha wrote:My opinion on multi GPU rendering: shame on OTOY that a single licence cannot handle e.g., 16 GPUs over-the-network rendering (16 GPUs is theoretical limit on an 8-slot mobo for 8x dual gpu cards).
I agree. It's a real pity.
Seeker
I agree totally. Moreover, I am still hopeful that in the very near future I'll be adding a gang of four GPUs to the eight shown above in my last post, thus exhausting that 12-GPU-per-system limit. I'd also cast my vote for OTOY allowing users to employ network rendering using all Octane licenses purchased by the user times 16 GPUs per license. That way, I could employ almost all of my GPUs for network rendering.
Because I have 180+ GPU processors in 16 tweaked/multiOS systems - Character limit prevents detailed stats.
Is there a place in Windows (I am using Windows 7, 64-bit) that gives you a numerical figure for how much IO space has been used, how much is remaining, etc.? That is, a quantifiable, measurable readout rather than something inferred.
Win 10 Pro 64, Xeon E5-2687W v2 (8x 3.40GHz), G.Skill 64 GB DDR3-2400, ASRock X79 Extreme 11
Mobo: 1 Titan RTX, 1 Titan Xp
External: 6 Titan X Pascal, 2 GTX Titan X
Plugs: Enterprise
Thanks Tutor for pointing me to IO space and registry editing.
My 8 GPUs finally work in Octane.
http://render.otoy.com/forum/viewtopic. ... 80#p235480
I7-3930K 64Go RAM Win8.1pro , main 3 titans + 780Ti
Xeon 2696V3 64Go RAM Win8.1/win10/win7, 2x 1080Ti + 3x 980Ti + 2x Titan Black
- Tutor
- Posts: 531
- Joined: Tue Nov 20, 2012 2:57 pm
- Location: Suburb of Birmingham, AL - Home of the Birmingham Civil Rights Institute
itou31 wrote:Thanks Tutor for pointing me to IO space and registry editing.
My 8 GPUs finally work in Octane.
http://render.otoy.com/forum/viewtopic. ... 80#p235480
Glad my earlier posts helped. So, what's next - ten GPU processors, by adding a Titan Z [ see bottom of page 4 here { http://render.otoy.com/forum/viewtopic. ... 9&start=30 } for a little more info that may be of help] ?
Because I have 180+ GPU processors in 16 tweaked/multiOS systems - Character limit prevents detailed stats.
Tutor wrote:I'll keep you posted on how they work out in my Supermicro SuperServer 8047R-7RFT+
Was that board chosen for any specific reason? And does anyone know of a stripped-down single-CPU server board that would be able to handle more PCIe devices?
- Tutor
- Posts: 531
- Joined: Tue Nov 20, 2012 2:57 pm
- Location: Suburb of Birmingham, AL - Home of the Birmingham Civil Rights Institute
glimpse wrote:Was that board chosen for any specific reason? And does anyone know of a stripped-down single-CPU server board that would be able to handle more PCIe devices?
I chose to buy two of those systems because each of them supports 4x E5-4650 CPUs (each with 8 cores plus 8 hyper-threads). I was able to get 13 ES versions of those CPUs on eBay for at or under $500 each. The ES versions are faster than the final released versions, and each of the ones that I purchased has the exact specs of the E5-2680. Also, my favorite 3D app - Cinema 4D - takes full advantage of all 64 of those threads. Until the Xeon E7-4890 v2s were released, my systems were the top-scoring Cinebench systems, each scoring about 3791 points [ see, e.g., http://cbscores.com ]. Given the good CPU rendering ability of those motherboards' CPUs and the fact that I can render simultaneously on the CPUs and GPUs, I have no reservations about having made those purchases. I'm in the process of trying to max out their GPU rendering ability by using the Amfeltec splitters, along with risers that I should soon be receiving.
Generally, Supermicro motherboards have rock-solid BIOS implementations and thus generally have better IO-space implementations to support more PCIe devices. That's also why Amfeltec suggests Supermicro motherboards for GPU-centric systems.*/ I suggest that you contact Supermicro to get a better idea of which particular single-CPU motherboard would be best. Here's their contact info.
Worldwide Headquarters
Super Micro Computer, Inc.
980 Rock Avenue
San Jose, CA 95131
Telephone
Tel: +1-408-503-8000
Fax: +1-408-503-8008
General Info: [email protected]
Tech Support: [email protected]
*/ " The motherboard limitation is for all general purpose motherboards. Some vendors like ASUS supports maximum 7 GPUs, some can support 8. All GPUs requesting IO space in the limited low 640K RAM. The motherboard BIOS allocated IO space first for the on motherboard peripheral and then the space that left can be allocated for GPUs. To be able support 7-8 GPUs on the general purpose motherboard sometimes requested disable extra peripherals to free up more IO space for GPUs. The server type motherboards like Super Micro (for example X9DRX+-F) can support 12-13 GPUs in dual CPU configuration. It is possible because Super Micro use on motherboard peripheral that doesn’t request IO space."
P.S. There is a feature of my Supermicros that I didn't know about when I made my purchases, but that makes working with them a more positive experience. Even an excellent BIOS implementation won't let more than a set number of GPUs operate without resource-allocation issues; the difference is what happens then. An excellent implementation makes wise choices about how to allocate precious resources so that the system still boots, with the hardware that couldn't get resources simply not functioning. What I'd classify as a bad implementation, in addition to not using contiguous space wisely, may cause the system not to boot at all when too many conflicting resource requests are present. In other words, the BIOS I prefer is one that boots anyway even with too many conflicting requests, so that I can view each device's information in the Device Manager, gather more precise information about the nature of the IO-space conflict, and at least make an intelligent attempt to resolve it.
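The two BIOS behaviors described above can be sketched as a toy first-come allocator (the device names and sizes are hypothetical; real BIOSes are far more involved): grant I/O space in order, and when it runs out, either flag the unlucky devices and keep booting, or halt entirely.

```python
# Toy sketch of BIOS resource allocation with two failure behaviors:
# "good" BIOS boots and merely disables devices it couldn't satisfy;
# "bad" BIOS refuses to boot when any request can't be met.

def allocate(devices, budget, boot_anyway=True):
    """devices: list of (name, request_size). Returns (booted, granted, denied)."""
    granted, denied = [], []
    for name, size in devices:
        if size <= budget:
            budget -= size          # first-come, first-served grant
            granted.append(name)
        else:
            denied.append(name)     # no room left for this device
    if denied and not boot_anyway:
        return False, [], denied    # bad BIOS: conflict halts the boot
    return True, granted, denied    # good BIOS: boot, flag the losers

# Onboard peripherals claim space first, then ten GPUs ask for 4 units each.
devices = [("onboard SATA", 16), ("onboard NIC", 8)] + \
          [(f"GPU{i}", 4) for i in range(10)]
booted, granted, denied = allocate(devices, budget=48)
print(booted, len(granted), denied)  # → True 8 ['GPU6', 'GPU7', 'GPU8', 'GPU9']
```

With the "good" behavior the system comes up and the denied GPUs are exactly the ones you'd see flagged in Device Manager, which is what makes the conflict diagnosable.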
Because I have 180+ GPU processors in 16 tweaked/multiOS systems - Character limit prevents detailed stats.
Tutor,
Extremely useful info - thank you so much!
3090, Titan, Quadro, Xeon Scalable Supermicro, 768GB RAM; Sketchup Pro, Classical Architecture.
Custom alloy powder coated laser cut cases, Autodesk metal-sheet 3D modelling.
build-log http://render.otoy.com/forum/viewtopic.php?f=9&t=42540