Notiusweb wrote: That's great, Tutor, and inspirational - taking an obstacle and creating innovation from it. We all face the world, and either we move forward by exploring, learning, and building, or by tearing things and others down to create the illusion of moving forward. I don't want an illusion, unless it is a 3D rendered image! I have to say, I'm not going to be happy with a plug-and-play solution unless I can explore pushing the boundaries. I am looking at Amfeltec gear too. In my own experience I'm finding that OCd cards require more attention than their stock clocked twins. I wonder what makes the better expansion, uniformity (adding same identical cards), or upgrade (newer different card, faster OC)?
Notiusweb,
I hope that you are having a healthy, happy and prosperous time.
I see abundant wisdom pouring from your mind as you confront the challenges facing you. I wholly agree with you that "OCd cards require more attention than their stock clocked twins." The terminal, bolded "s" denoting plurality is key. Having and managing a single OC'd card usually isn't significantly more difficult than managing a single reference card, so long as the user isn't trying to derive any more performance from either card than it exhibits when first installed. If either of those single (un-user-tweaked) cards satisfies the user's needs, then all is well.
But not everyone has the same needs. If one is, for example, trying to build his/her own animation render farm, just as I am, then a single GPU isn't (yet) enough to satisfy that need. In fact, for me, having only one system loaded with GPUs (and subject to Otoy's 12-GPU-per-machine license limit and 12-GPU-per-single-image-render license limit) is currently insufficient and unsatisfactory. However, I don't have enough funds to build my render farm to completion in one fell swoop, so loading each system with identical GPU cards is neither feasible nor optimal. I have to work toward my goal incrementally. Those increments span time, and over those spans newer, faster GPUs become available. Thus, it is imperative that someone in my situation face each obstacle and creatively innovate from it. Facing that challenge requires "exploring, learning, and building," and since I do not have infinite funding, it may also require, as it does at this juncture of my build-out, tearing down some of the systems that I've built in the past. That is where I am now.
Continuous learning is paramount to expanding my rendering capability. We can learn from others who have different reasons for pursuing the same objectives. Recently, I've been exploring the exploits of my distant cousins, the Bitcoin miners. Their needs are similar, although their ultimate objective may, at first blush, appear different. Consolidating my GPUs into fewer systems is now dictated by space, electrical power, license cost (per system), rendering speed, the acquisition of newer, faster GPUs, and efficiency concerns, among others. Most of these concerns aren't alien to Bitcoin miners.
I consider every human on the face of this planet to be one of my relatives. I love all of my relatives. That love dictates that I not hoard my experiences, and that I help my relatives avoid my mistakes and achieve my successes. Here they are:
To retain some of the value of my past expenditures on computer hardware, I've had to learn more about the issues that arise from GPU consolidation and explore "external graphics cards," which required me to fully explore Amfeltec splitters and riser cards, among other alternatives. Before starting this journey, I never considered personally using splitters and riser cards, although I knew of their existence and had a general understanding of their functionality. Now (assisted by following the exploits of Bitcoin miners), I'm seeing how splitters and riser cards can be integrated into my previous and future builds.

In the case of the risers, I see that they will enable me to access available PCIe slots that had previously been covered by a double-wide GPU. In the process of rereading about the functionality of those covered slots [functionality explained in plain view in the user manual; none is so blind as he/she who will not see], I recently learned that half of my earlier (socket 1366) builds were not three-GPU-slot systems, as I had believed, but are in fact four-GPU-slot systems. My error was seeing only what I wanted to see at that particular time, not all that was potentially there. Now I see that I just need risers to access those previously covered slots. Using risers strategically will also let me cool the GPUs in those systems better by increasing GPU spacing, thereby increasing the longevity, performance and performance potential of those GPUs. Using splitters in conjunction with the risers will let me achieve maximal physical GPU consolidation per system, yielding more empty PCIe slots for future expansion.

That future expansion can only be maximally exploited by sharpening, honing and hardening my I/O chops, learning how to exploit the relevant features of the Device Manager, and exploring the best practices for building a multiple-GPU system [ http://render.otoy.com/forum/viewtopic.php?f=40&t=43597 ]. I'm confident that I'll continue to experience failures and setbacks along the way, because perfection is most elusive. But I will not be deterred, for even failures and setbacks provide opportunities for learning, if we choose to learn from our mistakes and from those exposed to us by others. We just have to have the courage and the determination to keep pushing forward. Pushing forward is easier when we, with like aims, push together.
Sorry for the long intro, but I felt it was necessary to put your ultimate question, "I wonder what makes the better expansion, uniformity (adding same identical cards), or upgrade (newer different card, faster OC)," in the proper perspective.
Here are my thoughts on putting GPU-related things into the proper perspective:
1) Is this a hobby or a business-backed goal? If it's a hobby, then I'd uniformly opt for upgrades over uniformity. If it's a business-backed goal, then read on.
2) Analyze and then define your current and long-term business goal(s). Don't neglect to define, as best you can, your ultimate GPU needs. I know there'll be a degree of guessing involved, but that's a necessary part of the process. My needs may not be yours, and some who have needs similar to mine may prefer the web-render route; but I've determined that I cannot rely on something that doesn't yet exist, particularly something that relies so heavily on the web. Thus, I currently have 116 CUDA GPUs with the rendering equivalency of just over 86 GTX 980s (I did the math using OctaneBench performance scores and double-checked it; a back-of-the-envelope version of that math is sketched below). There's a great variety of GPUs in my count, from GTX 480s to Titan Zs, and every top-end GPU in between. My network has a Maxwell Equivalent (ME) of over 180,000 CUDA cores. If your GPU needs are few (i.e., no more than about 4 per system), mixing GPUs (as part of an upgrade process) shouldn't be much of a problem. It's only when you want to derive the ultimate in GPU consolidation (> 5 GPUs per system) that I/O management issues start to become major concerns. GPU servers such as those made by (1) Supermicro and (2) Tyan are best for effortless management of massive GPU deployments. For massive deployments, having identical GPUs makes management easier, but it also increases the upfront costs. It appears that there will be newer, faster GPUs for the foreseeable future; that's just a phenomenon we have to live with. Finally, don't forget to record your current and long-term goal(s); look at them regularly and modify them as needed.
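Here's roughly how that equivalency math works. This is a minimal sketch, assuming OctaneBench's convention of normalizing scores so that a single GTX 980 lands at about 100; the card models, scores, and counts below are placeholders rather than my actual inventory, so substitute the published OctaneBench results for your own cards.

[code]
# Back-of-the-envelope GTX 980 equivalency from OctaneBench scores.
# OctaneBench normalizes its score so one GTX 980 lands at ~100, so
# dividing a card's score by that baseline gives its "GTX 980
# equivalent" rendering weight.

GTX_980_BASELINE = 100.0  # approximate OctaneBench score of one GTX 980

# (card model, OctaneBench score, count owned) -- illustrative numbers only
inventory = [
    ("GTX 480", 40.0, 10),
    ("GTX 780 Ti", 85.0, 20),
    ("GTX 980", 100.0, 30),
    ("Titan Z (both GPUs)", 160.0, 5),
]

total_score = sum(score * count for _, score, count in inventory)
total_cards = sum(count for _, _, count in inventory)
equivalents = total_score / GTX_980_BASELINE

print(f"{total_cards} physical GPUs ~= {equivalents:.1f} GTX 980s of rendering power")
[/code]

The same sum, run over real scores instead of these placeholders, is all it takes to track your farm's aggregate rendering power as you mix generations.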
3) Keep abreast of technological advances relevant to your business model; analyze how technological advances are affecting or will affect your goals, and always keep track of opportunity trickle-down over time. When I first began this journey about five years ago, if someone had put the question to me, "How many GPUs will you need in five years?", I would have said, "My GPU needs are few: three per system and about five systems." So try to answer that question based not solely on where you are now, but on your ultimate business goal. Part of my error was not fully assessing what my ultimate GPU needs would be if business grew and if opportunities/capabilities that were then only within the domain of the BIG studios were to trickle down. Another of my errors was not fully exploring maximizing GPU consolidation from day one. Keep in mind that opportunity trickle-down will occur more and more rapidly as technological advances arrive, and the price/cost of entry to obtain large projects, or parts of them, will drop significantly faster over time. What compute performance will you need to secure those business opportunities, and what GPU compute capability is on the immediate horizon?
4) Among identical GPUs of a particular family, there is usually some amount above the base clock to which every GPU can be tweaked. This is the easy tweaking stage. But not every identical GPU of a particular family can be tweaked similarly above that amount; at this stage, you face a situation similar to the one you'd face if the GPU models varied. So in either case (identical GPUs or different GPUs), individual tweaking is necessary to maximize compute performance. The more GPUs you have, the longer this sort of tweaking will take. If and when you have the time to do this individual tweaking, it's worth it. If you don't have the time, then it's likely either that your business is too plentiful or that you need to judiciously use whatever downtime your network has to tweak your GPUs and/or add more GPUs. (A small per-card monitoring sketch follows this point.)
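Individual tweaking goes faster when you can snapshot every card's actual clocks and temperatures in one shot while a render or stress test runs. Here's a minimal sketch in Python; it assumes an NVIDIA driver with the nvidia-smi utility on the PATH, and it only reads state - it doesn't apply any overclock itself:

[code]
# Snapshot per-GPU clocks and temperatures while stress-testing each card,
# so you can record the highest stable offset per individual GPU.
# Assumes nvidia-smi (shipped with the NVIDIA driver) is on the PATH.
import subprocess

def gpu_snapshot():
    """Return one (index, name, sm_clock_mhz, temp_c) tuple per GPU."""
    out = subprocess.check_output([
        "nvidia-smi",
        "--query-gpu=index,name,clocks.sm,temperature.gpu",
        "--format=csv,noheader,nounits",
    ], text=True)
    rows = []
    for line in out.strip().splitlines():
        idx, name, clock, temp = [field.strip() for field in line.split(",")]
        rows.append((int(idx), name, int(clock), int(temp)))
    return rows

for idx, name, clock, temp in gpu_snapshot():
    print(f"GPU {idx} ({name}): SM clock {clock} MHz, {temp} C")
[/code]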
5) As you acquire more GPUs, individual tweaking tends to become less of a concern/imperative, and the impulse to acquire the next great GPU can be held at bay for longer, which helps reduce your acquisition costs. A significant number of my GPUs are used cards that gamers sold to get the next best thing; a significant number are refurbished. I've never tested the video display output of the majority of my GPUs. Why? Because I don't care about display output for the vast majority of them. So long as the GPUs do CUDA computations properly, that's all I care about (see the sketch after this point). In other words, set, and be willing to adjust, your priorities/expectations based on what's most important, and consider what is the best expenditure of your money.
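For what it's worth, here is the kind of minimal compute-only sanity check I mean. It's a sketch using PyCUDA (pip install pycuda) and assumes the CUDA toolkit is installed; it runs a trivial vector add on every card and verifies the math, never touching the display path:

[code]
# Minimal per-card CUDA sanity check: run a vector add on every GPU and
# verify the result, ignoring video display output entirely.
import numpy as np
import pycuda.driver as cuda
from pycuda.compiler import SourceModule

KERNEL = """
__global__ void add(const float *a, const float *b, float *c)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    c[i] = a[i] + b[i];
}
"""

cuda.init()
n = 1 << 20
a = np.random.rand(n).astype(np.float32)
b = np.random.rand(n).astype(np.float32)

for dev_id in range(cuda.Device.count()):
    dev = cuda.Device(dev_id)
    ctx = dev.make_context()  # compute-only context; no display needed
    try:
        add = SourceModule(KERNEL).get_function("add")
        c = np.empty_like(a)
        add(cuda.In(a), cuda.In(b), cuda.Out(c),
            block=(256, 1, 1), grid=(n // 256, 1))
        ok = np.allclose(c, a + b)
        print(f"GPU {dev_id} ({dev.name()}): {'PASS' if ok else 'FAIL'}")
    finally:
        ctx.pop()
[/code]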
6) Having newer, different GPUs in the same system may well mean having GPUs with different memory limits. Even with the addition of out-of-core texture functionality, a system's rendering ability can be limited by the GPU(s) with the least memory. Having newer, different GPUs spread across systems, even when the GPUs within any one system are identical, can pose the same limitation when performing network rendering, including native network rendering apart from the single-frame network rendering functionality recently added to V2 of Octane by Otoy. Thus, in some situations, animators with mixed GPUs may have to develop compositing workflows that take GPU memory limits into account, and even design their scenes with those limits in mind.
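To make that limitation concrete: the scene has to fit on the smallest card participating in the render, so the usable VRAM budget is the minimum across the GPUs, not the sum. A small sketch, again assuming nvidia-smi is available:

[code]
# The scene data must fit on the smallest card in the render set, so the
# usable VRAM budget is the minimum across GPUs, not the sum.
import subprocess

out = subprocess.check_output([
    "nvidia-smi",
    "--query-gpu=name,memory.total",
    "--format=csv,noheader,nounits",
], text=True)

cards = []
for line in out.strip().splitlines():
    name, mem = line.rsplit(",", 1)
    cards.append((name.strip(), int(mem)))  # memory.total is reported in MiB

budget = min(mem for _, mem in cards)
limiter = min(cards, key=lambda card: card[1])[0]
print(f"Effective per-frame VRAM budget: {budget} MiB (limited by {limiter})")
[/code]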