Best Practices For Building A Multiple GPU System

Sun Dec 11, 2016 2:34 am

I hope that Trenton prices have been substantially reduced since our prior excursion - viewtopic.php?f=40&t=43597&p=238270&hil ... on#p238085 - that made me remain a Supermicro Man.

Sun Dec 11, 2016 10:09 am

Tutor wrote:I hope that Trenton prices have been substantially reduced since our prior excursion - viewtopic.php?f=40&t=43597&p=238270&hil ... on#p238085 - that made me remain a Supermicro Man.

I remember this thread, Tutor - great source of information on prices. I hope the price of the backplane alone (with a host-slave PCIe link system) does not exceed $2k and that it will work with any dual-Xeon motherboard.

Mon Dec 12, 2016 2:30 pm

Smicha/Tutor, when designing builds currently, how do you think you prioritize the rig:
(1) It should be a compromise of handling both creative applications and render applications, and cost
(2) It should maximize potential of creative applications
(3) It should maximize potential of rendering application

My experience led me to initially view #3 as most important, but then as I began to take on more creative applications I found my thinking going from 3, to 1, and then to 2.
This makes me ponder a future Mac build one day, with (a) more powerful GPUs, and (b) fewer GPUs... (as opposed to a greater number of lower-powered GPUs).

Mon Dec 12, 2016 3:37 pm

Smicha,

I should have the price of that board within the next hour. Trenton's location is just about a two hour drive from me.

Mon Dec 12, 2016 3:40 pm

Tutor wrote:Smicha,

I should have the price of that board within the next hour. Trenton's location is just about a two hour drive from me.

Tutor,

Thank you. Could you please also ask them about the PCIe-to-PCIe link system (two PCIe cards with a cable that joins a host and a slave)?

Mon Dec 12, 2016 3:46 pm

Notiusweb wrote:Smicha/Tutor, when designing builds currently, how do you think you prioritize the rig:
(1) It should be a compromise of handling both creative applications and render applications, and cost
(2) It should maximize potential of creative applications
(3) It should maximize potential of rendering application

My experience led me to initially view #3 as most important, but then as I began to take on more creative applications I found my thinking going from 3, to 1, and then to 2.
This makes me ponder a future Mac build one day, with (a) more powerful GPUs, and (b) fewer GPUs... (as opposed to a greater number of lower-powered GPUs).

With the wide variety of new Xeon V4s, point (1) is achievable at reasonable cost. 2x2620 with 3GHz boost is great for almost all work. Almost - the only bottleneck here is Adobe software (mainly AE), which is not optimized for Xeons, so a 4-6 core i7 beats Xeons there. So if there is budget for 2x2667W (add about 3000 EUR over 2x2620), these may come in handy.

Mon Dec 12, 2016 8:09 pm

Be Guided By Your Principal Need(s)
Also, Daisy Chaining - An update to this post - viewtopic.php?f=40&t=43597&start=650#p286419 . Ten PCIe devices, excluding two Amfeltec x4 GPU Oriented Splitter kits, working simultaneously on a MacPro with four PCIe slots. But if you include the two splitter cards, then it's 12 PCIe devices running at the same time on a MacPro with four PCIe slots.

Notiusweb wrote:Smicha/Tutor, when designing builds currently, how do you think you prioritize the rig:
(1) It should be a compromise of handling both creative applications and render applications, and cost
(2) It should maximize potential of creative applications
(3) It should maximize potential of rendering application

My experience led me to initially view #3 as most important, but then as I began to take on more creative applications I found my thinking going from 3, to 1, and then to 2.
This makes me ponder a future Mac build one day, with (a) more powerful GPUs, and (b) fewer GPUs... (as opposed to a greater number of lower-powered GPUs).

I consider the nature of my output, namely animation and video, to be the primary consideration, so my selections are constrained by the needs of animation and video. Because animation and video require rendering extremely high frame counts, I need many high GPU count renderers (Item no. 3 in your list). My goal is to one day (soon) have 5 x ~20 GPU Supermicro X9DRX renderers (each with dual Intel E5-2680 V1 CPUs), plus my dual X5690 CPU Tyan GPU Server (nominally an 8x GPU server/renderer, but I've learned along the way that 8 isn't the limit with Linux)*/. Additionally, at least one of those high GPU count renderers will do double duty by also running one or two Linux versions of creative apps (Item no. 1 in your list). The high GPU count renderers do and will run Linux Mint, primarily because Linux is the only OS that works in that scenario and I like the Mint version the best. Linux's being free also doesn't bother me. My Tyan system has the secondary task of running a few Windows-based creative apps (Item no. 1 in your list).

The systems that house my 23xGTX 590s, 6xGTX 480s and 1xGTX 690 will continue to be used mainly for Thea bucket rendering [ http://presto.thearender.com/bucket-rendering.html ] (Item no. 3 in your list).

Moreover, my recently upgraded 2012+ MacPro (it's really a 2009 MacPro that's been hacked/upgraded in various ways over the last seven years)**/ renders 3D animations and supports the Mac versions of some creative video editing apps (not sure how it fits in your list). Since I like my Mac system the best of all my systems for running creative apps, and most of the apps that I've purchased in the past and am most familiar with run on the Mac, that MacPro has been, is, and will be my primary system for that aspect of my creative process.

On the other hand, if I did primarily single frame renderings I'd opt for Item no. 1. If I did primarily video editing I'd opt for Item no. 2.

*/ My experience with Linux leads me to believe that even systems that run out of IO space at 13 or 14 GPUs (or even fewer) under Windows will run more GPUs under Linux. Linux does a much better job than Windows (2nd place) or MacOS (last place among the three major OSes) of maximizing IO space for GPUs.
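
For anyone comparing how many GPUs each OS actually enumerates, here is a minimal Python sketch for the Linux side. It assumes the standard `lspci` tool is installed and that GPUs show up under the "VGA compatible controller" or "3D controller" device classes; the sample text is made up for illustration and is not output from any of the rigs in this thread.

```python
import re
import subprocess

# Device classes that lspci prints for graphics adapters.
GPU_CLASSES = ("VGA compatible controller", "3D controller")

def count_gpus(lspci_text):
    """Count lspci lines whose device class is a graphics adapter."""
    count = 0
    for line in lspci_text.splitlines():
        # Each line looks like "<bus address> <class>: <vendor and device>".
        match = re.match(r"^\S+\s+([^:]+):", line)
        if match and match.group(1).strip().startswith(GPU_CLASSES):
            count += 1
    return count

def count_gpus_live():
    """Run lspci on this machine and count the GPUs it lists (Linux only)."""
    result = subprocess.run(["lspci"], capture_output=True, text=True, check=True)
    return count_gpus(result.stdout)

# Illustrative sample only (hypothetical bus addresses and devices).
SAMPLE = """\
00:1f.2 SATA controller: Intel Corporation C600/X79 series chipset SATA Controller
02:00.0 VGA compatible controller: NVIDIA Corporation GK110B [GeForce GTX 780 Ti]
02:00.1 Audio device: NVIDIA Corporation GK110 HDMI Audio
03:00.0 VGA compatible controller: NVIDIA Corporation GK110B [GeForce GTX 780 Ti]
"""

print(count_gpus(SAMPLE))
```

A card that is physically installed but missing from this count never got enumerated on the bus, which is one way to see where a given board and OS combination stops.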

**/
Internally, it's got, among other things, (a) macOS Sierra, (b) dual X5690 CPUs, (c) 128G of 1,333 MHz RAM, (d) 4x2T WD HDs (total 8T internal HD storage), and (e) 2xSonnet Tempo SSD Pro cards (TSATA6-SSDPR-E2), each with 2xCrucial MX300 2TB SATA 2.5 inch internal SSDs (CT2050MX300SSD1) (that's 4x2T or a total of 8T of SSD storage). Each Tempo card has two eSATA external ports, and the four external ports (across both Tempo cards) are connected to a Sans Digital external HD case with 4x2T WD HDs (total 8T external HD storage). In total, the system has access to 24T of storage, excluding that on the mini-HDs. The front DVD drive bay on that MacPro holds two high-definition DVD drives. Also internally, in the bottom one of the Mac's four PCIe slots, is an AMD/ATI R9 480 8192 MB video card that's also useful for OCL rendering. In each of the next two PCIe slots up are the Tempo cards mentioned earlier. And in the last (highest) PCIe slot is an x16 to x16 (POWERED!) riser cable.

Externally, there're 2xAmfeltec x4 GPU Oriented Splitters - the 2 of them are connected serially***/ and the first one in the series is fed data from that x16 (POWERED!) riser cable. For the time being, the external devices are 4xGTX 780 Tis (ext-cards nos. 1-4), one 4-port USB 3.0 card (for connecting to and sharing data with my Mini-HDs) (ext-card no. 5), an Intensity Pro Audio Card (ext-card no. 6), a one-port eSATA card (for connecting to and sharing data with a Mini-HD for my MacBook Pro) (ext-card no. 7), and, for OCL rendering needing more powerful performance, my ATI 6870 (ext-card no. 8). Thus, I have a total of 11 PCIe devices (counting both the 3 internal ones and the 8 external ones) for that MacPro, but as is fully described below, going this route allows me to connect only 10 of them at any one time. And for the curious among you, four Nvidia GPUs appears to be the limit. But because my ATI cards allocate I/O space differently from my Nvidia cards, I can get two ATIs running at the same time as the four Nvidia GPUs. So for OCL rendering I can get at least 6 GPUs running on the Mac, at least in this configuration.

***/
Running the two Amfeltec Splitters (both wrapped completely in high quality Scotch Professional Grade Super 88 Vinyl Electrical Tape) serially means the first x4 Amfeltec card can directly support only 3 individual PCIe devices, because the second x4 Amfeltec card uses one of the first card's four x16 PCI Express adapter boards; the second Amfeltec x4, however, can directly support 4 other PCIe devices with its four x16 PCI Express adapter boards. Thus, I can run only 7 of the 8 external devices that I've listed above at any one time. However, moving an Amfeltec x16 PCI Express adapter board from one device to another takes just a few seconds. But make the x16 PCI Express adapter board switch only after powering down everything! On top of the MacPro is an EVGA 1600W PSU to power the Amfeltec splitters and the GPUs. Everything (including the MacPro, the EVGA PSU and the external HDs) is powered from one 1800W max wall power source. When rendering, and when doing whatever else I've tried, the total wattage for everything has never come anywhere near 1800 watts, peaks included; in fact, the highest reading that my Kill A Watt meter has recorded was 1308W.
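
The port and power arithmetic above can be written out as a short Python sketch. The function name `usable_ports` is only an illustration, and the rule it encodes (each splitter except the last gives up one port to feed the next) is my reading of the serial setup described here, not anything from Amfeltec's documentation.

```python
def usable_ports(ports_per_splitter, splitters):
    """Ports left for devices when splitters are chained in series.

    Every splitter except the last one gives up a port to feed the next.
    """
    if splitters < 1:
        return 0
    return ports_per_splitter * splitters - (splitters - 1)

# Two x4 splitters in series: 3 ports on the first plus 4 on the second.
external_ports = usable_ports(4, 2)

# Internal cards: one video card and two Tempo cards
# (the fourth slot holds the riser cable that feeds the splitters).
internal_devices = 3
total_connected = internal_devices + external_ports

# Power figures quoted in the post.
wall_limit_w = 1800
peak_measured_w = 1308
headroom_w = wall_limit_w - peak_measured_w
load_fraction = peak_measured_w / wall_limit_w

print(external_ports, total_connected, headroom_w, round(load_fraction, 3))
```

By the same rule a third splitter in the chain would give 10 external ports, though whether the Mac would have the I/O space to use them is a separate question.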

Tue Dec 13, 2016 3:55 am

If you’re running GPUs under Linux or MacOS, or even so many GPUs under Windows, that you’re finding it difficult to tune your GPUs for maximum performance because of the lack of support from MSI's Afterburner or EVGA's Precision, and you have access to a Windows system and are willing to take full responsibility for your own tweaking of each of your GPUs’ performance, then visit https://www.techpowerup.com where you can download GPU-Z, which will allow you to first save each of your GPUs’ factory BIOS [ https://www.techpowerup.com/downloads/2 ... -z-v1-16-0 ]. Then you can mod the BIOS of each Kepler or Maxwell GPU with the appropriate BIOS tweaker [ https://www.techpowerup.com/downloads/U ... S_Modding/ ] (a Pascal BIOS tweaker has yet to drop, but a few Pascal BIOS mods may have already been uploaded - https://www.techpowerup.com/vgabios/ ). Then you can flash each GPU's modded BIOS with the appropriate version of NVFlash [ https://www.techpowerup.com/downloads/U ... ng/NVIDIA/ or see https://www.techpowerup.com/downloads/U ... shing/ATI/ for ATIFlash]. Any and all risks/damages are solely your own.
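
Since everything hinges on having a good copy of the factory BIOS before flashing, one cautious habit (my own suggestion, not part of the tools above) is to save the BIOS twice and confirm the two files are identical before touching anything. A minimal Python sketch, using only the standard library; the function names and file names are hypothetical:

```python
import hashlib
from pathlib import Path

def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()

def backups_match(first, second):
    """True when two saved BIOS copies are non-empty and byte-identical."""
    a, b = Path(first), Path(second)
    if a.stat().st_size == 0 or b.stat().st_size == 0:
        return False
    return sha256_of(a) == sha256_of(b)

# Example use (hypothetical file names):
#   if not backups_match("gpu1_factory_a.rom", "gpu1_factory_b.rom"):
#       print("Backups differ - save the factory BIOS again before flashing")
```

Matching hashes only show that the two saves agree with each other; they don't prove the image itself is good, so keep the originals somewhere off the machine as well.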

Tue Dec 13, 2016 10:43 am

Tutor wrote:If you’re running GPUs under Linux or MacOS, or even so many GPUs under Windows, that you’re finding it difficult to tune your GPUs for maximum performance because of the lack of support from MSI's Afterburner or EVGA's Precision, and you have access to a Windows system and are willing to take full responsibility for your own tweaking of each of your GPUs’ performance, then visit https://www.techpowerup.com where you can download GPU-Z, which will allow you to first save each of your GPUs’ factory BIOS [ https://www.techpowerup.com/downloads/2 ... -z-v1-16-0 ]. Then you can mod the BIOS of each Kepler or Maxwell GPU with the appropriate BIOS tweaker [ https://www.techpowerup.com/downloads/U ... S_Modding/ ] (a Pascal BIOS tweaker has yet to drop, but a few Pascal BIOS mods may have already been uploaded - https://www.techpowerup.com/vgabios/ ). Then you can flash each GPU's modded BIOS with the appropriate version of NVFlash [ https://www.techpowerup.com/downloads/U ... ng/NVIDIA/ or see https://www.techpowerup.com/downloads/U ... shing/ATI/ for ATIFlash]. Any and all risks/damages are solely your own.

This is how I was fighting my 580s' instabilities - better voltage settings helped a little, but in the end I RMAed them. The risk is greater, especially when very long renders come into play. And it is mainly due to voltage, not heat, when using watercooled GPUs. What is neat is that even on stock voltage Pascals clock very high, so a custom modded BIOS is not that important unless you want to break a limit for short benchmarks. But as said, for long rendering I would not exceed the stock voltage limit (around 1.2V).

Tue Dec 13, 2016 3:49 pm

smicha wrote:...

This is how I was fighting my 580s' instabilities - better voltage settings helped a little, but in the end I RMAed them. The risk is greater, especially when very long renders come into play. And it is mainly due to voltage, not heat, when using watercooled GPUs. What is neat is that even on stock voltage Pascals clock very high, so a custom modded BIOS is not that important unless you want to break a limit for short benchmarks. But as said, for long rendering I would not exceed the stock voltage limit (around 1.2V).

I agree that clock speed enhancements that depend on exceeding stock voltages aren't recommended, particularly for BIOS mods, because they're a bit harder to correct. Also, I spoke with a Trenton sales rep a couple of times yesterday for quotes on the parts that you need, but I haven't received any prices yet. Getting no response for more than 24 hrs seems very strange to me for products they're marketing.