Best Practices For Building A Multiple GPU System

Discuss anything you like on this forum.
Post Reply
User avatar
Tutor
Licensed Customer
Posts: 531
Joined: Tue Nov 20, 2012 2:57 pm
Location: Suburb of Birmingham, AL - Home of the Birmingham Civil Rights Institute

rappet wrote:@Tutor, this is so Cool man!
(obvious response, but I could not resist :P )
But really... respect!!! And very curious where this will end.
greetings and all the best for 2015 and beyond
Rappet,
I wish you and your immediate family the very best for 2015 and beyond. That also applies to your extended family and that includes billions.

Resistance is futile - I want it to be "Ice Cold Octane."

For me this kind of modding and experimentation has no end. Moreover, I'm just continuing what someone else started and using today's technology to assist. I hope that I've helped to inspire other member(s) of our extended family to take it and run with it even further during my temporary occupation of my shell, and even after my shell has become, once again, one with the earth.
Because I have 180+ GPU processors in 16 tweaked/multiOS systems - Character limit prevents detailed stats.
User avatar
Tutor
Licensed Customer
Posts: 531
Joined: Tue Nov 20, 2012 2:57 pm
Location: Suburb of Birmingham, AL - Home of the Birmingham Civil Rights Institute

1) The 8 GPU processor brick wall: My Tyan server has 8 double wide GPU slots. Until recently, I had hoped that I could load the Tyan with 8 double wide GPU cards and have it recognize all of the GPUs. My goal was to use Titan Z cards to exceed 8 GPU processors and to reach 14 GPU processors in one system. Despite BIOS mods, Regedit hacks and various card stacking configurations, the hardware will not recognize more than 8 GPU processors. So putting multiple GPUs on one card does not get around the limit, because the limit is per GPU processor, not per card. I can install either a maximum of 8 single processor GTX cards (such as 780 Tis, 780 6Gs, original Titans, Titan Blacks, etc.), or 4 dual processor GTX cards (such as 590s, 690s or Titan Zs), or a combination of single and dual processor cards, so long as I do not exceed the 8 GPU processor limit; every processor above 8 is wasted there. Because of that discovery/realization, I've decided to put 2 Titan Zs (4 GPU processors), plus 1 Titan Black (1 GPU processor), plus 3 Titans (3 GPU processors) (4+1+3=8) in my Tyan server. So, I assess my accomplishment of my goal here as a major failure.
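For anyone who wants to play with the mixes, here's a minimal sketch in Python (the slot count and the 8 processor cap are just the limits I ran into on my Tyan, hard-coded as assumptions) that lists the single/dual card combinations that stay at or under the wall:

Code:
# Minimal sketch: enumerate mixes of single and dual GPU cards that fit within
# 8 double wide slots without exceeding the 8 GPU processor ceiling I hit.
SLOTS = 8      # double wide PCIe x16 slots on the Tyan
GPU_CAP = 8    # GPU processors the board will actually recognize

for singles in range(SLOTS + 1):
    for duals in range(SLOTS + 1):
        cards = singles + duals
        gpus = singles + 2 * duals
        if 0 < cards <= SLOTS and gpus <= GPU_CAP:
            print(f"{singles} single-GPU card(s) + {duals} dual-GPU card(s) -> {gpus} GPU processors")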

2) Mr. Freezer begging for attention: Having so far devoted all of my time on this project to the 8 GPU processor limit, I haven't had the time to find the optimal settings for Mr. Freezer. Disappointingly, I haven't seen coolant exiting Mr. Freezer below 20 degrees centigrade. Usually when I've checked the coolant temps exiting Mr. Freezer and entering my reservoir, they are around 21-23 degrees centigrade. The system was then recognizing only 8 of the 14 GPU processors, but while rendering Octane benchmarks and other render jobs the processors' temperatures (read by GPU-Z) never exceeded 35 degrees centigrade and averaged 33 degrees centigrade. At idle the GPU processors' temperatures were within 1-2 degrees centigrade of the coolant temperature as it exited Mr. Freezer (so the GPU temps ranged from about 22 to 25 degrees centigrade). I was and am shooting for an average of around 5 degrees centigrade at idle and an average of around 10 degrees centigrade under load. So, I assess my accomplishment of my goal here as, currently, trending toward major failure with a tiny semblance of promise.
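As an aside, for anyone who would rather log those temps than watch GPU-Z, a tiny sketch along these lines should do (it assumes the NVIDIA driver's nvidia-smi utility is on the path; it ships with the driver on both Windows and Linux):

Code:
# Tiny sketch: poll GPU core temperatures once per second via nvidia-smi,
# as an alternative to watching GPU-Z while a render job runs. Ctrl+C to stop.
import subprocess
import time

while True:
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=index,temperature.gpu", "--format=csv,noheader"],
        capture_output=True, text=True, check=True).stdout
    print(out.strip().replace("\n", " | "))
    time.sleep(1)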

3) Flopping about: GPU-Z readings of single precision floating point peak performance are (A) slightly in excess of 4,500 GFLOPS for each of my old Titan SCs and (B) slightly in excess of 5,000 GFLOPS for my Titan Black and for each GPU of my Titan Zs. I've yet to overclock any of them for these tests. So after reconfiguring the GPU cards in my Tyan server as indicated above, I fully intend to achieve a single system floating point peak of 38,500 GFLOPS (5x5,000 + 3x4,500) for the Tyan system before I even overclock any of the GPUs; 38,500 GFLOPS equals 38.5 TFLOPS (teraflops).*/ However, my goal was to achieve a single system single precision floating point peak of at least 56 TFLOPS. So, I assess my accomplishment of my goal here as, currently, strongly trending towards complete failure, with the degree of failure depending significantly on how no. 2, above, shakes out, since my ability to safely overclock the GPUs is affected by their temperatures.
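Just to keep the arithmetic in plain view, the tally looks like this (a minimal sketch; the per-GPU figures are simply the rounded GPU-Z readings quoted above):

Code:
# Quick tally of the single precision peaks reported by GPU-Z (rounded figures from above).
titan_z_gpus = 4   # 2 Titan Z cards, 2 GPU processors each, ~5,000 GFLOPS per GPU
titan_black  = 1   # ~5,000 GFLOPS
titans       = 3   # original Titan SCs, ~4,500 GFLOPS each

gflops = (titan_z_gpus + titan_black) * 5000 + titans * 4500
print(f"{gflops:,} GFLOPS = {gflops / 1000} TFLOPS")
# -> 38,500 GFLOPS = 38.5 TFLOPS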

4) Pulling it apart and configuring, but this time not from scratch - so it's reconfiguring: Luckily, I had the foresight to buy and spread quick disconnects throughout this mod, to buy two dual pumps (one as a replacement - luckily too, I don't need it yet) and to buy lots of tubing and non-conductive coolant. So, I'll be reconfiguring the Tyan as set forth above. I'll be installing the other four Titan Zs (40 of the TFLOPS that I had intended to all be in one system) in one of my SilverStone cases, on a motherboard with four double wide PCIe x16 GPU slots. Both of these systems will be water cooled by the same cooling system. I am glad that I, at least, had a plan in place for potential failure(s). Life goes on, regardless.


*/I. On June 10, 2013, China's Tianhe-2 was ranked the world's fastest single computer with a record of 33.86 petaflops. [ http://en.wikipedia.org/wiki/FLOPS#Sing ... er_records ]
Because I have 180+ GPU processors in 16 tweaked/multiOS systems - Character limit prevents detailed stats.
User avatar
glimpse
Licensed Customer
Posts: 3740
Joined: Wed Jan 26, 2011 2:17 pm
Contact:

Tutor, a bit sad to hear Your findings, but as You say: That's life =)

I was worried about this multi GPU barrier, 'cos I've seen some Guys in the mining community who couldn't get over it so easily. Then again some "science" related Guys (like those building FASTRA II) managed to come up with a solution & I've seen some companies providing custom builds too (VDAC from RenderStream) & all of these are old news, so technically there are solutions, but they need quite some time & probably some investment..

Trenton Systems also sells some crazy 20 slot backplanes - I would be surprised if they did not have a solution to utilise those =) but I guess they are not interested in sharing a solution if You're not selling their product =DDD

so, all in all, don't give up, 'cos You'll probably be able to find a solution, it's just not so easy to bring it into sight =)

as for cooling.. well, realistically it's getting harder & harder to improve cooling (at some point it's just not economical) - You can't beat physics =) but again, Your temps seem fine & I'd be happy with them for sure. Adjust Your goal from chasing the lowest possible temps (just for the sake of numbers) to cooling everything silently & You'll have a win-win situation, where water & GPU temps will be way lower (probably 30-50C) than what they used to be on air (80+) & You'll still enjoy the additional performance.

It's not a race, but if You're still interested in getting lower than ambient temps - look into the stuff for sea aquariums: those guys use aquarium chillers. As mentioned.. that might not be practical, but it might be a solution =)
User avatar
Tutor
Licensed Customer
Posts: 531
Joined: Tue Nov 20, 2012 2:57 pm
Location: Suburb of Birmingham, AL - Home of the Birmingham Civil Rights Institute

glimpse wrote:Tutor, a bit sad to hear Your findings, but as You say: That's life =)

I was worried about this multi GPU barrier, 'cos I've seen some Guys in the mining community who couldn't get over it so easily. Then again some "science" related Guys (like those building FASTRA II) managed to come up with a solution & I've seen some companies providing custom builds too (VDAC from RenderStream) & all of these are old news, so technically there are solutions, but they need quite some time & probably some investment..

Trenton Systems also sells some crazy 20 slot backplanes - I would be surprised if they did not have a solution to utilise those =) but I guess they are not interested in sharing a solution if You're not selling their product =DDD

so, all in all, don't give up, 'cos You'll probably be able to find a solution, it's just not so easy to bring it into sight =)

as for cooling.. well, realistically it's getting harder & harder to improve cooling (at some point it's just not economical) - You can't beat physics =) but again, Your temps seem fine & I'd be happy with them for sure. Adjust Your goal from chasing the lowest possible temps (just for the sake of numbers) to cooling everything silently & You'll have a win-win situation, where water & GPU temps will be way lower (probably 30-50C) than what they used to be on air (80+) & You'll still enjoy the additional performance.

It's not a race, but if You're still interested in getting lower than ambient temps - look into the stuff for sea aquariums: those guys use aquarium chillers. As mentioned.. that might not be practical, but it might be a solution =)

Thanks Glimpse for all of your suggestions.

BTW - That 8 GPU system shown in VDAC Renderstream's old announcement is a Tyan system. In fact, given the timing of that announcement in 2010, it's exactly the same system as mine. That's because VDAC's old announcement is what motivated me to buy the Tyan in the first place. Trenton Systems' BPG8032 PCI Express backplane, in particular the 17-x16 PCIe 2.0/1.1 interface, is based on the same model as the 8 GPU Tyan, with the exception that Tyan (probably realizing that most GPUs were then being sold as dual slot solutions and would likely remain so in the near future) cut the number of slots almost in half. The Tyan really has 10+ slots: 8 PCIe 2.0 x16 double spaced slots, 2 slots that are PCIe 2.0 x16 sized but x4 in speed (one toward each side of the system and single spaced from the nearest x16 slot), and one PCI slot for one's relics. I know now that there's more to the problem than just the number of empty slots (because I have the slots to seat 14 GPU processors so long as 6 of them are Titan Zs). Amfeltec's caution*/ only serves to reinforce the point that even if you have enough slots to seat all of the GPUs that you desire [ (a) 14 GPU processors on 8 GPU cards, in my case, or (b) a 16 or 18 single wide slotted Trenton chassis, or (c) an Amfeltec 4 Cluster addition with the appropriate 4 channel host board, or (d) a Cubix, and (e) (in any of these cases) Otoy's guarantee that there is nothing in Octane's code base to limit the number of cards that can be deployed on a single system ], a system BIOS can still crash and trash the whole party, for it's how much IO space the main motherboard has and how that IO space is handled that is the first step to any success.



*/ " The motherboard limitation is for all general purpose motherboards. Some vendors like ASUS supports maximum 7 GPUs, some can support 8. All GPUs requesting IO space in the limited low 640K RAM. The motherboard BIOS allocated IO space first for the on motherboard peripheral and then the space that left can be allocated for GPUs. To be able support 7-8 GPUs on the general purpose motherboard sometimes requested disable extra peripherals to free up more IO space for GPUs. The server type motherboards like Super Micro (for example X9DRX+-F) can support 12-13 GPUs in dual CPU configuration. It is possible because Super Micro use on motherboard peripheral that doesn’t request IO space."
Last edited by Tutor on Tue Jan 20, 2015 9:37 am, edited 2 times in total.
Because I have 180+ GPU processors in 16 tweaked/multiOS systems - Character limit prevents detailed stats.
User avatar
glimpse
Licensed Customer
Posts: 3740
Joined: Wed Jan 26, 2011 2:17 pm
Contact:

Tutor wrote:
BTW - That 8 GPU system shown in VDAC Renderstream's old announcement is a Tyan system. In fact, given the timing of that announcement in 2010, it's exactly the same system as mine.
exactly, it's old "news" - that's why I said if they managed to fill this using 16 GPUs back then, I believe a few years later there should be an easier solution =) or at least a documented way how to.. =) just keep digging, 'cos it's not the end of the road =) but.. if You start to weigh pros & cons, maybe it's not that economical in the end (thus building two systems might be the better idea =)

anyway, I wanted so badly for this experiment of Yours to happen =) hopefully You'll find a spark to continue!
User avatar
Tutor
Licensed Customer
Posts: 531
Joined: Tue Nov 20, 2012 2:57 pm
Location: Suburb of Birmingham, AL - Home of the Birmingham Civil Rights Institute

glimpse wrote:
Tutor wrote:
BTW - That 8 GPU system shown in VDAC Renderstream's old announcement is a Tyan system. In fact, given the timing of that announcement in 2010, it's exactly the same system as mine.
exactly, it's old "news" - that's why I said if they managed to fill this using 16 GPUs back then, I believe a few years later there should be an easier solution =) or at least a documented way how to.. =) just keep digging, 'cos it's not the end of the road =) but.. if You start to weigh pros & cons, maybe it's not that economical in the end (thus building two systems might be the better idea =)

anyway, I wanted so badly for this experiment of Yours to happen =) hopefully You'll find a spark to continue!
I was researching the issue further as you were posting, and I later edited my post without refreshing the page to see that you had responded before my final edits. Accordingly, I hope that you will indulge my aged brain by reading my last post as finally edited. In sum, the way I view the problem is that if the motherboard's BIOS is such that the additional GPUs aren't allocated enough IO space, then it matters not whose added chassis you select, because the first problem that has to be solved is for the main motherboard to be aware that there are X no. of GPU processors present and to have pertinent information about those GPUs and their needs. That's why Amfeltec gave me that caution about IO space and pointed out how Supermicro has dual CPU motherboards designed to avoid that problem at least up to 12-13 GPUs. But as you and I've seen with the case of Tommes' issue [ http://render.otoy.com/forum/viewtopic.php?f=23&t=44209 ], that is just the start, but certainly not the end, of the issues involved in building a multiple GPU system with >7-8 GPUs. I was trying to use 14 GPU processors on an old dual CPU Tyan Server. I don't doubt (and in fact I hope) that future motherboards allocate sufficient IO space/resources to handle lots more GPUs being present. I'm just stumped as to what I can do in the meantime to get around this first stage issue. I do have two 32-core 4 CPU SuperMicros, but they have internally only 4 single wide (so it's 2 double wide x16 V3) PCIe slots on the main motherboards. Amfeltecs???

Curiously, Amfeltec's caution offers me no basis to buy their 4 chassis system for up to 16 GPUs, unless it's just for a little future proofing of my purchase.
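In the meantime, the first sanity check on any of these builds is simply how many GPU processors the OS has actually enumerated - a trivial sketch (it assumes the NVIDIA driver and its nvidia-smi tool are installed; nothing chassis-specific):

Code:
# Trivial sketch: count the GPU processors the driver actually enumerates,
# which is the first thing to check before blaming the chassis or the renderer.
import subprocess

out = subprocess.run(["nvidia-smi", "--list-gpus"],
                     capture_output=True, text=True, check=True).stdout
gpus = [line for line in out.splitlines() if line.strip()]
print(f"Driver sees {len(gpus)} GPU processor(s):")
for line in gpus:
    print("  " + line)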
Last edited by Tutor on Tue Jan 20, 2015 10:24 am, edited 1 time in total.
Because I have 180+ GPU processors in 16 tweaked/multiOS systems - Character limit prevents detailed stats.
User avatar
glimpse
Licensed Customer
Posts: 3740
Joined: Wed Jan 26, 2011 2:17 pm
Contact:

Tutor wrote: In sum, the way I view the problem is that if the motherboard's BIOS is such that the additional GPUs aren't allocated enough IO space, then it matters not whose added chassis you select, because the first problem that has to be solved is for the main motherboard to be aware that there are X no. of GPU processors present and to have pertinent information about those GPUs and their needs. That's why Amfeltec gave me that caution about IO space and pointed out how Supermicro has dual CPU motherboards designed to avoid that problem at least up to 12-13 GPUs. But as you and I've seen with the case of Tommes' issue [ http://render.otoy.com/forum/viewtopic.php?f=23&t=44209 ], that is just the start, but certainly not the end, of the issues involved in building a multiple GPU system with >7-8 GPUs. I was trying to use 14 GPU processors on an old dual CPU Tyan Server. I don't doubt (and in fact I hope) that future motherboards allocate sufficient IO space/resources to handle lots more GPUs being present. I'm just stumped as to what I can do in the meantime to get around this first stage issue.
it's a tricky topic for sure.. but as You've seen from some of the links: like VDAC by RenderStream - where they managed to get 116 GPUs working, or that video (on FASTRA II) where ASUS came up with a specific BIOS (curious whether something like that could be obtained by a mass audience - I really doubt it) for a single CPU motherboard.. - all of this proves there's a solution, though it might be far from simple =)
I do have two 32-core 4 CPU SuperMicros, but they have internally only 4 single wide (so it's 2 double wide x16 V3) PCIe slots on the main motherboards. Amfeltecs???
would be interesting to see. but keep in mind Amfeltec's GPU cluster has a weak uplink (the host board operates at x1?) - though they have another product - a splitter card (that might be cheaper & allows You to split an x4 slot into four mechanically x16 slots), check this out
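just rough numbers to show why the uplink mostly matters at scene-load time (a sketch assuming PCIe 2.0's usual ~500 MB/s per lane rule of thumb & a made-up 2 GB scene):

Code:
# Rough sketch: how long a hypothetical 2 GB scene upload takes over narrow links,
# using ~500 MB/s per PCIe 2.0 lane as a rule-of-thumb effective rate.
PCIE2_PER_LANE_MB_S = 500
scene_mb = 2000  # hypothetical 2 GB scene

for lanes in (1, 4, 16):
    seconds = scene_mb / (PCIE2_PER_LANE_MB_S * lanes)
    print(f"x{lanes}: ~{seconds:.1f} s to push the scene to the card")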
Last edited by glimpse on Tue Jan 20, 2015 10:32 am, edited 1 time in total.
User avatar
Tutor
Licensed Customer
Posts: 531
Joined: Tue Nov 20, 2012 2:57 pm
Location: Suburb of Birmingham, AL - Home of the Birmingham Civil Rights Institute

glimpse wrote:
Tutor wrote: In sum, the way I view the problem is that if the motherboard's BIOS is such that the additional GPUs aren't allocated enough IO space, then it matters not whose added chassis you select, because the first problem that has to be solved is for the main motherboard to be aware that there are X no. of GPU processors present and to have pertinent information about those GPUs and their needs. That's why Amfeltec gave me that caution about IO space and pointed out how Supermicro has dual CPU motherboards designed to avoid that problem at least up to 12-13 GPUs. But as you and I've seen with the case of Tommes' issue [ http://render.otoy.com/forum/viewtopic.php?f=23&t=44209 ], that is just the start, but certainly not the end, of the issues involved in building a multiple GPU system with >7-8 GPUs. I was trying to use 14 GPU processors on an old dual CPU Tyan Server. I don't doubt (and in fact I hope) that future motherboards allocate sufficient IO space/resources to handle lots more GPUs being present. I'm just stumped as to what I can do in the meantime to get around this first stage issue.
it's a tricky topic for sure.. but as You've seen from some of the links: like VDAC by RenderStream - where they managed to get 116 GPUs working, or that video (on FASTRA II) where ASUS came up with a specific BIOS (curious whether something like that could be obtained by a mass audience - I really doubt it) for a single CPU motherboard.. - all of this proves there's a solution, though it might be far from simple =)
I couldn't find where it says that they got 116 GPUs working. Where is that? I could find where they said a year later that they had 8 GPUs working with Octane [ http://www.renderstream.com/everything/ ... u-vdactr8/ ].

I do have two 32-core 4 CPU SuperMicros, but they have internally only 4 single wide (so it's 2 double wide x16 V3) PCIe slots on the main motherboards. Amfeltecs???
Because I have 180+ GPU processors in 16 tweaked/multiOS systems - Character limit prevents detailed stats.
User avatar
glimpse
Licensed Customer
Posts: 3740
Joined: Wed Jan 26, 2011 2:17 pm
Contact:

Tutor wrote: I couldn't find where it says that they got 116 GPUs working. Where is that? I could find where they said a year later that they had 8 GPUs working with Octane [ http://www.renderstream.com/everything/ ... u-vdactr8/ ].
"At RenderStream we are providing solutions with two to eight GPU’s, and we just introduced a sixteen GPU development system (Linux only). Our solutions are made using Nvidia Geforce, Quadro and Tesla cards. For even more speed we announced at the March 2010 ACM Siggraph meeting in Austin our VDACTr8 with eight single GPU cards and VDACTr8x2 solution using eight dual GPU GTX295 cards." (source)
Tutor wrote: I do have two 32-core 4 CPU SuperMicros, but they have internally only 4 single wide (so it's 2 double wide x16 V3) PCIe slots on the main motherboards. Amfeltecs???
check the post before - I gave a link at the end about an alternative solution (sort of =)
User avatar
Tutor
Licensed Customer
Posts: 531
Joined: Tue Nov 20, 2012 2:57 pm
Location: Suburb of Birmingham, AL - Home of the Birmingham Civil Rights Institute

glimpse wrote:
Tutor wrote: I couldn't find where it says that they got 116 GPUs working. Where is that? I could find where they said a year later that they had 8 GPUs working with Octane [ http://www.renderstream.com/everything/ ... u-vdactr8/ ].
"At RenderStream we are providing solutions with two to eight GPU’s, and we just introduced a sixteen GPU development system (Linux only). Our solutions are made using Nvidia Geforce, Quadro and Tesla cards. For even more speed we announced at the March 2010 ACM Siggraph meeting in Austin our VDACTr8 with eight single GPU cards and VDACTr8x2 solution using eight dual GPU GTX295 cards." (source)
Tutor wrote: I do have two 32-core 4 CPU SuperMicros, but they have internally only 4 single wide (so it's 2 double wide x16 V3) PCIe slots on the main motherboards. Amfeltecs???
check the post before - I gave a link at the end about an alternative solution (sort of =)
I went to RenderStream's site and found this outrageously priced system (using my Tyan) from a link where they said one can purchase a system with 2-20 GPUs [ http://www.renderstream.com/products/hi ... computing/ ]:

MSRP (USD): $16,495
CPU: Intel X5650
CPUs per Computer: 2
Cores per CPU: 6
Physical Cores per Chassis: 12
Logical Cores per Chassis: 24
CPU Speed (GHz): 2.66
Aggregate Speed (GHz * Cores per Chassis): 31.92
GPU: 8x GTX 680
ECC Memory: 24 GB
Hard Drive: 4 TB
OS (Windows or Linux): Included
3 Year Depot Warranty: Included
Shipping (USA): Included

That's more than I spent to outfit my Tyan to higher specs (i.e., I have lots more memory, more storage and faster CPUs) and for 6 Titan Zs and the liquid cooling system. Moreover, from reading through RenderStream's site, I'm beginning to seriously doubt that RenderStream could pull off running 16 GPUs, not to mention 116. Their statements entice, but what they actually offer is less than what I had 2 years ago for a lot less money.
Last edited by Tutor on Tue Jan 20, 2015 10:59 am, edited 3 times in total.
Because I have 180+ GPU processors in 16 tweaked/multiOS systems - Character limit prevents detailed stats.
Post Reply

Return to “Off Topic Forum”