Best Practices For Building A Multiple GPU System

smicha
Licensed Customer
Posts: 3151
Joined: Wed Sep 21, 2011 4:13 pm
Location: Warsaw, Poland

Tutor,

What is the max number of gpus you managed to run on a single licence?
3090, Titan, Quadro, Xeon Scalable Supermicro, 768GB RAM; Sketchup Pro, Classical Architecture.
Custom alloy powder coated laser cut cases, Autodesk metal-sheet 3D modelling.
build-log http://render.otoy.com/forum/viewtopic.php?f=9&t=42540
Tutor
Licensed Customer
Posts: 531
Joined: Tue Nov 20, 2012 2:57 pm
Location: Suburb of Birmingham, AL - Home of the Birmingham Civil Rights Institute

Under a single Octane license, the maximum number of GPUs that I've tried is 20, on a Supermicro X9DRX. I ran the following configurations (under Linux Mint + Octane) before beginning to order the GTX 1070s:

1) 20 GPU processors using 10 GTX 590s;
2) 20 GPU processors using 9 GTX 590s + 1 GTX 690;
3) 20 GPU processors using 12 GTX 780 6Gs + 8 GTX 780 Tis.

Also, using the Linux/X9DRX combo, Linux Mint recognized even more GPUs - 24 (12 GTX 590s). That was the most that I tried, to see if the Linux/X9DRX combo could go higher than 20. I suspect that the actual limit is even higher.
Since these tests worked, I'm assuming that 20 GTX 1070s will not run into IO space issues on a Linux/X9DRX combo. However, I've seen that different GPUs have different IO space requirements.
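
For anyone repeating this kind of test, one way to check how many GPU processors Linux has enumerated is to count the NVIDIA display functions that `lspci` lists. What follows is only a minimal sketch, not the method used for the tests above: the function name and the sample text are made up for illustration, and it assumes each GPU processor shows up as a "VGA compatible controller" or "3D controller" line (a dual-GPU card would then count twice).

```python
def count_nvidia_gpus(lspci_text: str) -> int:
    """Count NVIDIA display functions in text produced by `lspci`."""
    # Assumption: GPU processors appear under one of these PCI class names.
    display_classes = ("VGA compatible controller", "3D controller")
    count = 0
    for line in lspci_text.splitlines():
        # Skip the HDMI audio function that sits next to each GPU.
        if "NVIDIA" in line and any(cls in line for cls in display_classes):
            count += 1
    return count


# Illustrative sample only; not captured from any system in this thread.
SAMPLE = """\
02:00.0 VGA compatible controller: NVIDIA Corporation GP104 [GeForce GTX 1070] (rev a1)
02:00.1 Audio device: NVIDIA Corporation GP104 High Definition Audio Controller (rev a1)
03:00.0 VGA compatible controller: NVIDIA Corporation GP104 [GeForce GTX 1070] (rev a1)
03:00.1 Audio device: NVIDIA Corporation GP104 High Definition Audio Controller (rev a1)
"""

print(count_nvidia_gpus(SAMPLE))
```

On a live machine the text could come from `subprocess.run(["lspci"], capture_output=True, text=True).stdout`. A count that matches the installed hardware only shows that the devices were enumerated; it does not by itself prove that the renderer can use all of them.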
Last edited by Tutor on Sun Jan 15, 2017 5:27 pm, edited 1 time in total.
Because I have 180+ GPU processors in 16 tweaked/multiOS systems - Character limit prevents detailed stats.
smicha

Great! And what about Windows? Have you tried Win10? (I'm aware you don't like it, to put it very gently :))
Tutor

Tho' it seems a bit strange, sometimes we get a great deal for nothing - Exhibit A - Blender and Exhibit B - Linux.

I haven't tried Exhibit ? - Windows 10 - and do not expect that I'll do so in the future. Trying 14 GPU processors under Windows 7 and Windows 8 (separately) led to nasty crashes on the X9DRX. Sometimes we pay just to have paid or to make the statement, "I'm worthy - because - look I can pay." However, I'm prouder to contribute towards Blender and Linux training materials. My other calling card says Mr. Cheapo.
smicha

Thanks Tutor.

PS. So when will you get your 20 pcs of 1070?
Tutor

Smicha,
I've got all the pieces that I thought I'd need, as of a few days ago, to make a 20 GPU air cooled system. However, fate has thrown a boulder in my pond, because in the last few days someone whom I admire greatly posted on YouTube the very best 24-part series of instructions that I've ever been privileged to receive on how to build a watercooled system properly from start to finish. I suspect that you're intimately familiar with him, his vast teaching skills and his irresistible powers of persuasion. So convincing is he that now I'm second guessing my earlier plan. I'm drowning in thoughts about the pros and cons of Air vs. Water - Air vs. Water - Air vs. Water. My earlier goal of completing this system by month's end is sinking. Air vs. Water. Air vs. Water. Air vs. Water. If only he had posted those instructions in October 2016, I'd now be floating on 20 water cooled 1070s. But better late than never. ;) You get the picture: ~~~~~~~~~~~~~~~~~~~~.
smicha

I was telling him to record some instructions last year, but he was very reluctant to do so. Sometimes it's really hard to talk to this guy ;) Anyway, there are pros and cons as far as water is concerned. The definite cons are losing the warranty and the extra work, both for building and for replacing GPUs, plus some extra costs. But let's assume that one of the 20 GPUs goes down (or even two of them) - the cost of replacing them is around 5% of the entire cost.

PS. A month ago I had some 1070s in my hands, and two of them drew 170W in Octane (GPUs only), so I've been thinking about the total power draw for 20 of them - 1700-2000W total, I guess...? So 2x EVGA 1600W P2 with a 6 to 8 pin extender should be enough...?
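
The estimate above can be checked with a few lines of arithmetic. This is only a rough sketch: it reads the measurement as two GTX 1070s drawing 170W together (85W each), it covers the GPUs only, and the function and key names are made up for illustration. CPUs, motherboard, fans and pumps are not included and would eat into the headroom.

```python
def power_budget(measured_watts, measured_gpus, target_gpus, psu_watts, psu_count):
    """Scale a measured GPU draw to a larger build and compare it with PSU capacity."""
    per_gpu = measured_watts / measured_gpus   # watts per GPU under render load
    gpu_total = per_gpu * target_gpus          # GPUs only, no CPUs/board/pumps
    capacity = psu_watts * psu_count           # combined rated PSU output
    return {
        "per_gpu_w": per_gpu,
        "gpu_total_w": gpu_total,
        "capacity_w": capacity,
        "headroom_w": capacity - gpu_total,
        "load_fraction": gpu_total / capacity,
    }


# Figures from the post: 170W for two cards, 20 cards planned, 2x 1600W PSUs.
budget = power_budget(measured_watts=170.0, measured_gpus=2,
                      target_gpus=20, psu_watts=1600.0, psu_count=2)
for name, value in budget.items():
    print(f"{name}: {value}")
```

With these inputs the GPUs alone come to 1700W against 3200W of rated capacity, a bit over half load. Re-running with the upper guess of 2000W (for example `measured_watts=200.0`) still leaves 1200W for the rest of the system.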
smicha

Tutor,

So as for the Trenton backplane - the design is:
1. AsrockRack EP2C612 WS with dual E5 v4 Xeons, even 1650v4 or 2620v4. Not sure if I'd go with 2603v4, but I assume it would also work for GPUs.
2. Trenton: 14 single-slot GPUs, 7 in a row, then one PCIe slot of space for the fittings on the EK terminal, and then another 7 in a row (so 2x EK terminals overall). One PCIe slot is taken by the link to the host motherboard.
3. 6 GPUs on the EP2C612 with an EK terminal (outlets for one GPU closed with an EK blank blocker https://www.ekwb.com/shop/ek-fc-terminal-blank-parallel) and one PCIe port for the link to the slave backplane.

What do you think?
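
The layout above can be tallied in a short sketch to double-check the slot and terminal counts. The numbers come from the list; the dictionary keys and the function name are illustrative, and it assumes every water-blocked GPU occupies exactly one slot position. It says nothing about whether either board actually has that many slots or enough IO space.

```python
# Layout figures taken from the post; key names are illustrative only.
BACKPLANE = {
    "gpu_rows": [7, 7],         # two rows of single-slot GPUs, one EK terminal each
    "terminal_gap_slots": 1,    # slot left empty for the terminal fittings
    "link_slots": 1,            # link card to the host motherboard
}
HOST = {
    "gpus": 6,                  # GPUs on the EP2C612 WS
    "blanked_terminal_ports": 1,  # terminal outlets closed with the EK blank
    "link_slots": 1,            # link card to the backplane
}


def tally(backplane, host):
    """Summarise GPUs, slot positions and EK terminals for the planned layout."""
    backplane_gpus = sum(backplane["gpu_rows"])
    return {
        "backplane_gpus": backplane_gpus,
        "backplane_slots_used": backplane_gpus
        + backplane["terminal_gap_slots"]
        + backplane["link_slots"],
        "host_gpus": host["gpus"],
        "host_slots_used": host["gpus"] + host["link_slots"],
        "host_terminal_ports": host["gpus"] + host["blanked_terminal_ports"],
        "total_gpus": backplane_gpus + host["gpus"],
        "ek_terminals": len(backplane["gpu_rows"]) + 1,  # two on backplane, one on host
    }


for name, value in tally(BACKPLANE, HOST).items():
    print(f"{name}: {value}")
```

Under those assumptions the plan needs 16 slot positions on the backplane and 7 on the host for 20 GPUs in total, with three EK terminals.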
Tutor

smicha wrote:... Anyway, there are pros and cons as far as water is concerned. The definite cons are losing the warranty and the extra work, both for building and for replacing GPUs, plus some extra costs. But let's assume that one of the 20 GPUs goes down (or even two of them) - the cost of replacing them is around 5% of the entire cost.

PS. A month ago I had some 1070s in my hands, and two of them drew 170W in Octane (GPUs only), so I've been thinking about the total power draw for 20 of them - 1700-2000W total, I guess...? So 2x EVGA 1600W P2 with a 6 to 8 pin extender should be enough...?
I greatly appreciate all of your sage insights.

Like you, I believe that having 3200W of total power will be sufficient.

smicha wrote:Tutor,

So as for the Trenton backplane - the design is:
1. AsrockRack EP2C612 WS with dual E5 v4 Xeons, even 1650v4 or 2620v4. Not sure if I'd go with 2603v4, but I assume it would also work for GPUs.
2. Trenton: 14 single-slot GPUs, 7 in a row, then one PCIe slot of space for the fittings on the EK terminal, and then another 7 in a row (so 2x EK terminals overall). One PCIe slot is taken by the link to the host motherboard.
3. 6 GPUs on the EP2C612 with an EK terminal (outlets for one GPU closed with an EK blank blocker https://www.ekwb.com/shop/ek-fc-terminal-blank-parallel) and one PCIe port for the link to the slave backplane.

What do you think?
Your allocation of the GPUs makes complete sense. Although I do not have any personal experience with the AsrockRack EP2C612 WS, Asrock has a very good reputation. However, Amfeltec doesn't make a product that would allow that Asrock board to run 20 GPUs. Thus, the capacity to run that many GPUs will fall squarely on the Trenton kit. The only point that causes me any concern is whether the Trenton board can keep the Asrock from being IO crashed by 20 GPUs.

As far as I know, you're the trailblazer on using a Trenton board to help support a 20 GPU system. But I could be wrong. Trenton might already know the probability of success or failure in your reaching your specific goal - in other words, you might not be the trailblazer. At a minimum, I would tell Trenton in writing that this build is intended to run 14 GTX 1070 GPUs on the Trenton board, and that you intend to attach the Trenton board to an AsrockRack EP2C612 WS which will itself be running 6 GTX 1070 GPUs. Then ask them whether they are aware of this type of setup, specifically or only generally, having been tried before, by whom (getting their contact information) and what the outcome was. Since I could be wrong in my belief that you're the trailblazer, you might find that there are one or more dead bodies along the trail, or that there are only happy customers who have already done much of what you plan to do and are willing to share their insights with you.

Finally, I suggest that you press Trenton to agree, in writing, to a return and full refund if the Trenton board does not perform as you intend, especially if you've earlier secured assurances from Trenton that the Trenton board is compatible with the AsrockRack EP2C612 WS and that you should be able to run the 20 GPUs in the layout that you've described.
smicha

Tutor,

IO on the Asrock is also my concern. If I remember your posts correctly, the Supermicro boards (i.e., X9DRX, X10DRX) are not IO restricted by design? If the Supermicro X10DRX is the better way to go, I will go with it - I can get access to an X10DRX easily.