Best Practices For Building A Multiple GPU System

Tutor
Licensed Customer
Posts: 531
Joined: Tue Nov 20, 2012 2:57 pm
Location: Suburb of Birmingham, AL - Home of the Birmingham Civil Rights Institute

For more information on external GPUs and GPU card selection for building a multiple GPU system, particularly if you are an animator, see http://render.otoy.com/forum/viewtopic. ... 40#p236240 .
Because I have 180+ GPU processors in 16 tweaked/multiOS systems - Character limit prevents detailed stats.
smicha
Licensed Customer
Posts: 3151
Joined: Wed Sep 21, 2011 4:13 pm
Location: Warsaw, Poland

Tutor,

I think I missed it - have you managed to get 12 gpus working on a single motherboard?
3090, Titan, Quadro, Xeon Scalable Supermicro, 768GB RAM; Sketchup Pro, Classical Architecture.
Custom alloy powder coated laser cut cases, Autodesk metal-sheet 3D modelling.
build-log http://render.otoy.com/forum/viewtopic.php?f=9&t=42540
Tutor
Licensed Customer
Posts: 531
Joined: Tue Nov 20, 2012 2:57 pm
Location: Suburb of Birmingham, AL - Home of the Birmingham Civil Rights Institute

smicha wrote:Tutor,

I think I missed it - have you managed to get 12 gpus working on a single motherboard?
Not yet; I'm still awaiting the riser cables. I have, however, put five of my GTX 780 Tis under water in one of my Tyan GPU servers, and I'm waiting on more parts to do the same to the rest. My patience is being tested by ordering parts from across the great ponds.
Because I have 180+ GPU processors in 16 tweaked/multiOS systems - Character limit prevents detailed stats.
smicha
Licensed Customer
Posts: 3151
Joined: Wed Sep 21, 2011 4:13 pm
Location: Warsaw, Poland

Tutor,

I am just about to assemble another workstation built around Titan Xs (air-cooled this time, but ready for easy expansion to 4x Titan Xs on water) and have done some research on PCIe lanes. Here are my thoughts - could you tell me whether my reasoning makes any sense?

Notiusweb reached a 10-GPU limit. I decided to go with the i7-5930K because of its 40 PCIe lanes (instead of the 5820K with 28). So I wondered whether those 40 lanes have something to do with the 10 GPUs - 4 lanes per GPU. By that logic, any 28-lane CPU would be limited to 7 GPUs (7 GPUs x 4 lanes = 28 lanes), and a dual Xeon with 80 lanes overall would theoretically handle 20 GPUs. Does that make any sense?

Another question is about PLX chips - the Asus X99-E WS, ASRock X99 WS-E, and Extreme11 all have dual PLX chips, so in a 4-GPU layout we may get x16 bandwidth on all four cards. I am wondering whether there is any connection between the number of PLX chips, the PCIe lanes, and the maximum number of GPUs. Here is a very interesting article:

http://www2.electronicproducts.com/The_ ... -html.aspx
3090, Titan, Quadro, Xeon Scalable Supermicro, 768GB RAM; Sketchup Pro, Classical Architecture.
Custom alloy powder coated laser cut cases, Autodesk metal-sheet 3D modelling.
build-log http://render.otoy.com/forum/viewtopic.php?f=9&t=42540
Tutor
Licensed Customer
Posts: 531
Joined: Tue Nov 20, 2012 2:57 pm
Location: Suburb of Birmingham, AL - Home of the Birmingham Civil Rights Institute

smicha wrote:Tutor,

I am just about to assemble another workstation built around Titan Xs (air-cooled this time, but ready for easy expansion to 4x Titan Xs on water) and have done some research on PCIe lanes. Here are my thoughts - could you tell me whether my reasoning makes any sense?

Notiusweb reached a 10-GPU limit. I decided to go with the i7-5930K because of its 40 PCIe lanes (instead of the 5820K with 28). So I wondered whether those 40 lanes have something to do with the 10 GPUs - 4 lanes per GPU. By that logic, any 28-lane CPU would be limited to 7 GPUs (7 GPUs x 4 lanes = 28 lanes), and a dual Xeon with 80 lanes overall would theoretically handle 20 GPUs. Does that make any sense?
Smicha,

I believe that going with the i7-5930K, instead of the 5820K, is an excellent choice. Having more available PCIe lanes is a good idea. A PCI Express link can consist of anywhere from one to 32 lanes between two devices, and the more lanes, the faster the system can satisfy the high-throughput needs of performance-critical applications like 3D rendering [ http://en.wikipedia.org/wiki/PCI_Express#Lane ]. Just remember that the sheer number of functions tied to the PCIe bus precludes assuming that the total number of PCIe lanes will be allocable among just the GPUs. Here are the two articles that I've found to be of the greatest benefit lately for understanding IO space issues:

1) http://resources.infosecinstitute.com/s ... d-systems/ and
2) http://resources.infosecinstitute.com/s ... d-systems/ .

They should be studied in that order.
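
To put rough numbers on the lane arithmetic from your post, here is a back-of-envelope Python sketch. The 4-lanes-per-GPU figure and the number of lanes reserved for other on-board devices are illustrative assumptions on my part, not datasheet values:

Code:
# Back-of-envelope PCIe lane budgeting (illustrative assumptions only).
CPU_LANES = {
    "i7-5820K": 28,
    "i7-5930K": 40,
    "dual Xeon E5-2600": 80,   # 40 CPU lanes per socket
}

def max_gpus(cpu_lanes, lanes_per_gpu=4, lanes_reserved=8):
    # lanes_reserved is a guess at what NICs, storage controllers, etc.
    # take away from the GPUs; it is not a measured value.
    return max((cpu_lanes - lanes_reserved) // lanes_per_gpu, 0)

for cpu, lanes in CPU_LANES.items():
    print(cpu, "->", max_gpus(lanes), "GPUs at x4")

Even that rough budget shows why a 28-lane CPU runs out of room sooner than a 40-lane one - and why, once the other on-board devices have taken their share, the simple lanes-divided-by-four figure is optimistic.
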
smicha wrote:Another question is about PLX chips - the Asus X99-E WS, ASRock X99 WS-E, and Extreme11 all have dual PLX chips, so in a 4-GPU layout we may get x16 bandwidth on all four cards. I am wondering whether there is any connection between the number of PLX chips, the PCIe lanes, and the maximum number of GPUs. Here is a very interesting article:

http://www2.electronicproducts.com/The_ ... -html.aspx
Thanks for the PCIe article that you referenced.

There is definitely a relationship between the use and number of PLX chips [ http://www.plxtech.com/products/expresslane/switches ] and the maximum number of GPUs a board can accommodate. There is also a definite relationship between available PCIe lanes and system performance. However, I do not believe that simply selecting a CPU with more available lanes means the system will be able to accommodate more GPUs. My understanding is that a CPU with more available lanes means that, if the system does accommodate more GPUs, moving data to and from those GPUs will be faster.
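
To illustrate that last point, here is a minimal sketch of how a PLX switch trades slot count against shared bandwidth. The x16-upstream/two-x16-downstream layout matches how the PEX 8747 is typically wired on these boards, but the throughput numbers are nominal per-direction PCIe 3.0 figures, not benchmarks:

Code:
# A PLX switch adds slots but shares its upstream link to the CPU.
PCIE3_GBPS_PER_LANE = 0.985   # ~985 MB/s per PCIe 3.0 lane, one direction

def per_gpu_uplink(uplink_lanes, gpus_behind_switch):
    # Worst case: every GPU behind the switch transfers at the same time.
    return uplink_lanes * PCIE3_GBPS_PER_LANE / gpus_behind_switch

# One PEX 8747: x16 up to the CPU, two x16 slots downstream.
print(per_gpu_uplink(16, 1))   # ~15.8 GB/s when only one GPU is streaming
print(per_gpu_uplink(16, 2))   # ~7.9 GB/s each when both stream at once

So a PLX chip lets every slot behind it negotiate x16 electrically, but all of those GPUs still funnel through the one uplink to the CPU when they transfer at the same time - which is why more PLX chips raise the number of GPUs a board can seat more than they raise total throughput.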

Reading about the ASRock Extreme11 reminds me of reading about my Tyan GPU servers (slightly earlier vintage - but dual 1366s). My Tyans are beneficiaries of PLX chips - four of them per system. Were it not for those PLX chips, those Tyans would likely have been marketed as four double-wide PCIe systems rather than eight. So try to find a system with the greatest number of PLX chips and/or PCIe x16 slots, and be willing to make choices: read the manual, study the usual graphic of the system's layout (particularly the PCIe layout), and work out what else consumes PCIe resources that you can disable in the BIOS and won't miss because it isn't a priority.

Reading about the ASRock Extreme11 also reminds me of reading about my EVGA SR-2s. The ASRock Extreme11 has 7x PCIe 3.0 x16 slots (2x PLX PEX 8747 bridges) and supports 4-Way SLI/CrossFireX at full x16 PCIe 3.0 speed. My EVGA SR-2s (slightly earlier vintage - but dual 1366s) have 7x PCIe 2.0 x16 slots and were marketed as 4-Way SLI ( http://www.evga.com/articles/00537/ ): "These seven PCI-E x16/x8 slots are reserved for graphics cards, and x1/x4 devices." [ http://www.evga.com/support/manuals/fil ... E-W888.pdf ] When I purchased my EVGA SR-2s in 2010, I had no idea that someday I would be trying to take advantage of their ability to seat at least 7 modern GPU cards (with riser cables to get around the fact that the seven slots are only single-wide spaced); I bought them when my aim was limited to clock-tweaking CPUs.

My ultimate point is that it is probably best to seek out the motherboard with the maximum number of GPU slots, because that is the easiest way to judge whether the board will accommodate close to the maximum number of GPUs one has in mind. Another way is word of mouth from those with like aims and interests. The downside of that approach, at least for now, is that there is usually a gap between the time someone with like aims makes an initial purchase and the time they discover (and hopefully publish) the full potential of the motherboard they bought - take, for example, the seven-slotted wonder that can be coaxed into being a ten-slotted wonder, as Notiusweb has done, and the owners who bought the 7-slot EVGA SR-2 as far back as 2010 and are only now finding out that those old workhorses can support more than 7 GPUs with riser cards/cables and splitters. A third approach is the one I am taking with my Supermicros. Those systems each have 4 PCIe slots (x16, x8, x16, x8 - single-wide spaced), so this approach involves, as it has for me, studying arcane articles just like the ones you and I have referenced in this and your last post, to take further steps to harness IO space for a far greater number of GPU cards than one might, at first blush, readily associate with such a system. If anyone else has other approaches, I'm all ears/eyes.
Last edited by Tutor on Sat Jun 06, 2015 8:20 am, edited 3 times in total.
Because I have 180+ GPU processors in 16 tweaked/multiOS systems - Character limit prevents detailed stats.
smicha
Licensed Customer
Posts: 3151
Joined: Wed Sep 21, 2011 4:13 pm
Location: Warsaw, Poland

Tutor,

Thank you very much for your reply.

I am waiting for news from you about exceeding the 10-gpu-limit - you'll succeed.
3090, Titan, Quadro, Xeon Scalable Supermicro, 768GB RAM; Sketchup Pro, Classical Architecture.
Custom alloy powder coated laser cut cases, Autodesk metal-sheet 3D modelling.
build-log http://render.otoy.com/forum/viewtopic.php?f=9&t=42540
Tutor
Licensed Customer
Posts: 531
Joined: Tue Nov 20, 2012 2:57 pm
Location: Suburb of Birmingham, AL - Home of the Birmingham Civil Rights Institute

smicha wrote:Tutor,

Thank you very much for your reply.

I am waiting for news from you about exceeding the 10-gpu-limit - you'll succeed.
Think about GPUs outside of the box or have a big box.

I've been searching the forests in my mind and not steadily watching the trail. So the obvious has evaded me. If it was a snake, it would have bitten me, given its close proximity.

Background:
I dumped my second Tyan server and used a small part of the proceeds to bid on eBay for a third EVGA SR-2 motherboard. Luckily, I lost that bid to a winner who bid $811.00 (USD) - too much for my liking for a used, partially complete, old system. I was lucky because losing that bid made me think more deeply about the matter at hand. Remember how in my last post I suggested that it's probably easier to find a motherboard that suits one's GPU goals by simply looking at the number of available PCIe slots? Well, I finally took my own advice and was led back to a seemingly dead horse that I had seen along the trail a few months ago. I'm going to beat that seemingly dead horse again.

Revelation:
The more I study the question of which motherboard is best for reaching the 12-GPU-per-system maximum allowed by the Octane license, the more firmly I believe that the Supermicro X9DRX+-F [ http://www.supermicro.com/products/moth ... DRX_-F.cfm ] (recommended by Amfeltec */ ) is the best motherboard currently on the market for achieving that goal. It can be purchased in the U.S. from, e.g., Superbiiz [ http://www.superbiiz.com/detail.php?name=MB-X9DRXF# ] for $489 (USD). Its 10x PCI-Express 3.0 x8 slots, plus 1x PCI-Express 2.0 x8 slot (running at x4) - an eleven-slotted workhorse - are the main reason why I decided to visit that seemingly dead horse again. I firmly believe, despite Tommes' lack of success in getting Octane to see all 10 of his GTX 780 Tis, that the horse isn't really dead. Significantly, Tommes' pic of his system's Device Manager shows that his OS recognized all ten of his GPUs.

I wouldn't recommend purchasing this motherboard in a case combo, however, because the power supplies that come in such packages are much too weak once 10 of its slots are occupied by the necessary x8-to-x16 PCIe extender/riser cables and the 11th (I'd recommend one of the x8 slots) is occupied by an Amfeltec splitter card using only two of its four cables. I'd feed that horse from a 1600-watt PSU.**/ Using the Supermicro X9DRX+-F in this manner will require thinking outside of the box, unless you get (or have) a very big box for all of those GPUs and about three 1600-watt power supplies for them.
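
For anyone sizing PSUs for that kind of build, here is a minimal sketch of the power arithmetic. The 250 W figure is the published Titan X board power; the 25% headroom factor is my own rule of thumb, not a Supermicro or NVIDIA recommendation:

Code:
import math

GPU_TDP_W = 250      # Titan X board power (published TDP)
HEADROOM = 1.25      # keep the PSUs well under full load while rendering

def gpu_psus_needed(gpu_count, psu_watts=1600):
    # PSUs for the GPUs only; the motherboard/CPUs get their own supply.
    return math.ceil(gpu_count * GPU_TDP_W * HEADROOM / psu_watts)

print(gpu_psus_needed(10))   # 2 x 1600 W
print(gpu_psus_needed(12))   # 3 x 1600 W at the 12-GPU Octane limit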

Conclusion:
If, like me, you're shooting for that 12 GPU limit, ditch the thought of getting an i7 because you'll need dual E5-2600s or dual E5-2600 v2s.

*/ "The server type motherboards like Super Micro (for example X9DRX+-F) can support 12-13 GPUs in dual CPU configuration. It is possible because Super Micro use on motherboard peripheral that doesn’t request IO space." That turns out to be an overly cautious underestimate - see P.S., below.

**/"Warning: To avoid damaging your motherboard and components, please use a power supply that supports a 24-pin, two 8-pin and one 4-pin power connectors. Be sure to connect the 24-pin and the 8-pin power connectors to your power supply for adequate power delivery to your system. The 4-pin power connector is optional; however, Supermicro recommends that this connector also be plugged in for optimal power delivery."

P.S. - More confirmation - "often hear people say there is a 16-GPU limit for NVIDIA video cards, is it true? I searched google a bit but looks like no one has really made a test for it. Now, after successfully building a rig with more than 16 GPUs by myself, I can tell you this rumor is not true, the so-called 16-GPU limit (for NVIDIA cards) doesn't exist at all. The rig I built is a GPU monster with 11 NVIDIA cards (4x GTX660 Ti, 5x GTX295, and 2x 9800 GX2). You see, I used some old cards to save money, and 7 of those are dual-GPU cards, so the total GPU number is 18. The motherboard's model is Supermicro X9DRX+-F and it has 11 pci-e slots, but all of them are x8 slots. With similar method as FASTRA II used, some pci-e extenders are employed to make it possible to connect these cards onto the motherboard.
Here is some detailed system information for the 18-GPU monster:...." [ https://devtalk.nvidia.com/default/topi ... -it-works/ ]. It just took the right search terms and source. Thus, that horse was surely just feigning death. Now that I've lost that EVGA SR-2 bid, guess where that Tyan booty is going?
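
Just to show where the 18 comes from, here is a quick tally (the card list is taken straight from the quote; the GPUs-per-card counts are simply the known dual-GPU designs):

Code:
# GPUs in the quoted 11-card Supermicro X9DRX+-F rig.
cards = {
    "GTX 660 Ti": (4, 1),   # 4 cards, 1 GPU each
    "GTX 295": (5, 2),      # dual-GPU card
    "9800 GX2": (2, 2),     # dual-GPU card
}
print(sum(n * gpus for n, gpus in cards.values()))   # 18 GPUs on 11 cards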
Because I have 180+ GPU processors in 16 tweaked/multiOS systems - Character limit prevents detailed stats.
smicha
Licensed Customer
Posts: 3151
Joined: Wed Sep 21, 2011 4:13 pm
Location: Warsaw, Poland

Fascinating!

Does the http://www.trentonsystems.com/backplane ... -backplane require Xeon CPUs to work with it?

I also found this monster:
http://www.asrockrack.com/general/produ ... C612#Plist
It uses the backplane board 3U8G_BPB1 - but there is no info on it.

What do you think: what is required to connect such a massive backplane to a motherboard?
3090, Titan, Quadro, Xeon Scalable Supermicro, 768GB RAM; Sketchup Pro, Classical Architecture.
Custom alloy powder coated laser cut cases, Autodesk metal-sheet 3D modelling.
build-log http://render.otoy.com/forum/viewtopic.php?f=9&t=42540
smicha
Licensed Customer
Posts: 3151
Joined: Wed Sep 21, 2011 4:13 pm
Location: Warsaw, Poland

.... I think this is the motherboard for the Trenton backplane:
http://www.trentonsystems.com/legacy/le ... host-board
3090, Titan, Quadro, Xeon Scalable Supermicro, 768GB RAM; Sketchup Pro, Classical Architecture.
Custom alloy powder coated laser cut cases, Autodesk metal-sheet 3D modelling.
build-log http://render.otoy.com/forum/viewtopic.php?f=9&t=42540
glimpse
Licensed Customer
Posts: 3740
Joined: Wed Jan 26, 2011 2:17 pm

Hi Tutor.

I saw the link you dropped there a while ago =) and I've been curious how that works. The biggest question I have: why does Windows in some cases see the cards, yet we can't use them? Because that might be the case here - the machine's OS recognises them and we would still not be able to use them.

Curious to see your future experiments on this topic, and it seems you have the same mobo, which is promising to say the least =)

Smicha,

that rack full of Phis is stunning! As for backplanes, there are some products like those from Trenton with 18 x16 slots =DDD