Best Practices For Building A Multiple GPU System
Smicha, one last thought from me...and I could be wrong, but that alternating arrangement looks like it is trying for quad-SLI PCIe 3.0 x16/x16/x16/x16. Is there a mode, or feature, of this ASUS board that controls the setting of a 'Gamer' mode or something, that it might have defaulted to when the 7th video port was accessed?
Win 10 Pro 64, Xeon E5-2687W v2 (8x 3.40GHz), G.Skill 64 GB DDR3-2400, ASRock X79 Extreme 11
Mobo: 1 Titan RTX, 1 Titan Xp
External: 6 Titan X Pascal, 2 GTX Titan X
Plugs: Enterprise
Notiusweb wrote:Smicha, one last thought from me...and I could be wrong, but that alternating arrangement looks like it is trying for quad-SLI PCIe 3.0 x16/x16/x16/x16. Is there a mode, or feature, of this ASUS board that controls the setting of a 'Gamer' mode or something, that it might have defaulted to when the 7th video port was accessed?
There is a 'defaults' option, but the board was already on defaults when it was seen with 7 GPUs; and of course 4G must be enabled before saving with defaults. Even when I force all PCIe slots to work at x8, it still gives x16 on four of them. What is curious is that when you look at the PCIe ports in the BIOS (this is not on the screenshot; it is another option under the Advanced menu), all of them are listed as not active, which is clearly false compared with what the other option shows (the screenshot). The Asus WS definitely sucks for more than 4 GPUs, and what I hate is that there is no response from them about whether they are working on it at all.
So here is the plan: we decided to remove all power cables for 40 hours, including the 24-pin from the mobo. This should discharge the PSU (while keeping the battery and the BIOS settings). On Thursday morning we'll connect everything back and see if this changes anything. If not... we'll drain the water and remove the GPUs, reset the BIOS, and test all GPUs one by one with 4G (a must), as I did when assembling the loop.
3090, Titan, Quadro, Xeon Scalable Supermicro, 768GB RAM; Sketchup Pro, Classical Architecture.
Custom alloy powder coated laser cut cases, Autodesk metal-sheet 3D modelling.
build-log http://render.otoy.com/forum/viewtopic.php?f=9&t=42540
- Tutor
- Posts: 531
- Joined: Tue Nov 20, 2012 2:57 pm
- Location: Suburb of Birmingham, AL - Home of the Birmingham Civil Rights Institute
Notiusweb wrote:Smicha, does that board even need 4G Decoding for 7 GPUs? How many PCI slots does it have?
Being that you are in Poland, I remember typing to PolishGinger, who was debugging adding additional GPUs, to try each GPU one by one, to confirm that each one works independently of the others. This gets messy with a loop, I know. But why wouldn't Windows even see 7? Are the slots busted or shut off, or is something not plugged in somewhere? (Just hypothetical questions for argument; I'm not suggesting they are.)
Anyway, what I do, and what you could do if you have an old air-cooled GPU, is test the slots with that single air-cooled GPU before you reassemble with a loop (i.e., test all of the slots that are not recognizing cards, or a sample of them).
I have a now-crappy GTX 660 Ti with stock fans, and this little card has helped me so much in troubleshooting. If you plan on going big with a rig, maybe a Best Practice would be to use an old card to debug when possible.
Excellent idea to test the seven penetrated serially with one known working penetrator!
Notiusweb wrote:Tutor, one mind to another - he installed everything and it worked before, shouldn't he just replay what he did last time, put a piece of Duct tape to block the video input on the 7th GPU, and not worry about the registry ins and outs? He already did it, he solved his present issue in the past. I spent time on the V3 crashing thing because I dreaded undoing the Amfeltec portion of the loop. But once I did and reverted to USB risers it all worked, as I knew it would, but yes it was a pain. Seems like here Smicha is in a similar situation.
I'm not sure that the present issue is the past issue; so, I'm not sure he's already done it. Depending on how one approaches a problem, one might say that in the situation that you encountered, the problem was found in the end to have been generated by using V3 and its need for a different/modified setting. Or one might say that, at this case's beginning, the only variable that changed before the breakdown is the likely cause of the breakdown. We may not want to entertain that notion because it might take us down a road we don't want to travel for any number of reasons. Here the only variable that changed before the breakdown is this: the system recipient plugged a DVI cable into one of seven GPUs where the six other GPUs had hacked-off DVI ports - only one of the seven GPUs had what the recipient may have thought was a "potentially" working DVI port at that time.
As a child, I used to play a game called "follow the leader" in a then heavily wooded area near our home. I didn't then possess the wisdom to be fully cognizant that if I strayed just a little bit off of the path I might be in grave peril: there were rattlesnakes and copperheads in those woods.
If you were to say, "Tutor, that's all well and good, but what does that have to do with our trying to debug a system where three GPUs now fail to work at all (so far they're not even showing up in Regedit) and three others appear problematic in Regedit (3 + 3 = 6) after the recipient of the system, contrary to the caution given by the assembler, plugged a DVI cable into the only one of the seven GPUs with an extant DVI port, and the system then failed to handle six of the GPUs properly?", I'd respond: (1) "Why did you broach that excellent idea to test the seven penetrated serially with one known working penetrator? (Don't you too believe that the problem could now be electrical in nature?)" and (2) "How were six of the GPUs different from the seventh GPU, and was there any risk inherent in those differences or in accessing the DVI port where the others' DVI ports had been hacked off?" Could the penetrator (male) or the penetrated (female) have been electrically*/ affected, and if so, to what extent? So here we have lots of rabbit holes to trace because we don't know for sure what all happened when Smicha's caution was violated.
Notiusweb wrote:Why go further down the wabbit-hole?
If you really love wabbit and you're confident there's a live wabbit there (fact: you saw the wabbit just enter that hole - akin to the system having worked before), then there's a reason to continue. However, no wabbit or a long-ago-expired wabbit won't satisfy - akin to the slots or some GPUs possibly being fried already, which won't taste as good as fried wabbit. So if you feel that you must have fresh wabbit now and you just saw a live wabbit enter that hole, you may have to search that whole hole completely (and/or find the other exit and some evidence or strong indication that the wabbit had a tendency to use that exit hole and did in fact use it recently to exit - akin to the aim of electricity to find a way to the ground/the earth). Put another way: were the cut DVI wires sufficiently shielded so that no errant current could make its way onto a neighboring one, seeking its route to earth? One may never know for sure whether it's wabbit for dinner or just McDonalds; however, I believe firmly that one must follow all of the leads until they each come to their individual ends. That's "What's up Doc."
*/ Electricity's movement and dangers remind me of the movement of some snakes.
Because I have 180+ GPU processors in 16 tweaked/multiOS systems - Character limit prevents detailed stats.
Tutor wrote:Could the penetrator (male) or the penetrated (female) have been electrically*/ affected, and if so to what extent. So here we have lots of rabbit holes to trace because we don't know for sure what all happened when Smicha's caution was violated.
I love you!
3090, Titan, Quadro, Xeon Scalable Supermicro, 768GB RAM; Sketchup Pro, Classical Architecture.
Custom alloy powder coated laser cut cases, Autodesk metal-sheet 3D modelling.
build-log http://render.otoy.com/forum/viewtopic.php?f=9&t=42540
- Tutor
- Posts: 531
- Joined: Tue Nov 20, 2012 2:57 pm
- Location: Suburb of Birmingham, AL - Home of the Birmingham Civil Rights Institute
smicha wrote:
Tutor wrote:Could the penetrator (male) or the penetrated (female) have been electrically*/ affected, and if so to what extent. So here we have lots of rabbit holes to trace because we don't know for sure what all happened when Smicha's caution was violated.
I love you!
Ditto, my friend.
Now, what if Nvidia and Intel could and did release GPUs and CPUs, respectively, that ran at YHz (yotta) speeds [ compare http://www.answers.com/Q/Which_is_larger_GB_or_GHz with http://www.answers.com/Q/Which_is_large ... r_kilobyte ] so that we might seem to need only one of each of them for rendering 4K, 8K and 16K virtual reality projects? Wouldn't that put an end to our multiple GPU (and my multiple CPU) madness? Probably not, because I'd want at least 24 of such CPUs for my systems - oh, and some of those systems would probably require dual CPUs and a few of them quad CPUs, so let's add 11 more CPUs. Moreover, we all know that as much as nature abhors a vacuum, I abhor empty PCIe slots (if that's what they're then called); so let's include about 100 of those YHz GPUs. In sum, the beatings will continue until morale improves.*/
*/ That saying comes from a poster proudly displayed by one of my workers at my first full-time job 38 yrs. ago. Needless to say, I now use other incentives to promote high employee morale.
Because I have 180+ GPU processors in 16 tweaked/multiOS systems - Character limit prevents detailed stats.
Smicha, sorry for not asking earlier...maybe installing cable to 7th GPU activated an onboard or integrated graphics driver in BIOS of some sort? Maybe that is still active and needs to be deactivated?
Just last minute thoughts...
Also, maybe you don't have to un-tube everything? Is it hooked into the GPU also? Maybe you can place it all on something like this next to PC, I don't know if you have slack or not, trying to come up with something.
http://www.amazon.com/Grayline-40700-Sm ... ge_o03_s01
Win 10 Pro 64, Xeon E5-2687W v2 (8x 3.40GHz), G.Skill 64 GB DDR3-2400, ASRock X79 Extreme 11
Mobo: 1 Titan RTX, 1 Titan Xp
External: 6 Titan X Pascal, 2 GTX Titan X
Plugs: Enterprise
- Seekerfinder
- Posts: 1600
- Joined: Tue Jan 04, 2011 11:34 am
Smicha, like Notius I'm also thinking about ways you might avoid draining & dismantling. I'm sure you've considered this, but if it were me I'd think of ways to 'remove' the mobo from the set rather than the other way around. It depends on the case and on whether you have any flexibility in the loop, but I guess not? Also, getting to the slot-clips would probably be hard. If you could somehow manage to free the board, 16x (high quality) risers could help with troubleshooting. But they're expensive as well.
Just thinking out loud. I feel for you on this one... Keep us posted.
Seeker
Win 8(64) | P9X79-E WS | i7-3930K | 32GB | GTX Titan & GTX 780Ti | SketchUP | Revit | Beta tester for Revit & Sketchup plugins for Octane
- Tutor
- Posts: 531
- Joined: Tue Nov 20, 2012 2:57 pm
- Location: Suburb of Birmingham, AL - Home of the Birmingham Civil Rights Institute
Notiusweb and Seekerfinder,
What's up Docs. Those are creatively great ideas. You guys got down deep into the wabbit hole far enough to see the light coming in from the other escape routes. Another alternative would be to use a wooden crate placed upside down (and above the naked motherboard or over the case - depending on the depth of the case and on how the wooden crate is modified) so as to make electric shorts most unlikely. The use of Seekerfinder's idea of using an x16 riser to test each card and each slot individually might aid in trouble-shooting to rule out a bad GPU or PCIe slot, without having to first take apart the loop or drain it if all of the GPUs can be moved onto the crate as a group. The x16 riser cable can easily be moved from one card or slot to another card and/or slot for testing. Having seven such risers might ease the task of progressively getting the whole system working again. Then just remove the risers (and keep them for future use in service or trouble-shooting) and lower the cards back into place.
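The bookkeeping for that riser swapping can be sketched in a few lines. This is only a rough illustration, assuming the riser itself is good and that a single passing test clears both the card and the slot involved; the function name and the card/slot labels are made up for the example.

```python
def isolate_faults(results):
    """Sort failed single-card riser tests into suspect cards and suspect slots.

    results is a list of (card, slot, ok) tuples, one per riser test.
    Assumption: a passing test proves that both the card and the slot work.
    """
    good_cards = {card for card, slot, ok in results if ok}
    good_slots = {slot for card, slot, ok in results if ok}

    suspect_cards, suspect_slots, unresolved = set(), set(), []
    for card, slot, ok in results:
        if ok:
            continue
        card_cleared = card in good_cards
        slot_cleared = slot in good_slots
        if card_cleared and not slot_cleared:
            suspect_slots.add(slot)          # known-good card failed here
        elif slot_cleared and not card_cleared:
            suspect_cards.add(card)          # known-good slot rejected this card
        else:
            unresolved.append((card, slot))  # needs another swap to decide

    return {
        "suspect_cards": sorted(suspect_cards),
        "suspect_slots": sorted(suspect_slots),
        "unresolved": unresolved,
    }
```

Feeding it the results after every swap shows which card and slot combinations still need a cross-check before anything has to be drained.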
Because I have 180+ GPU processors in 16 tweaked/multiOS systems - Character limit prevents detailed stats.
- Tutor
- Posts: 531
- Joined: Tue Nov 20, 2012 2:57 pm
- Location: Suburb of Birmingham, AL - Home of the Birmingham Civil Rights Institute
Notiusweb wrote:Smicha, sorry for not asking earlier...maybe installing cable to 7th GPU activated an onboard or integrated graphics driver in BIOS of some sort? Maybe that is still active and needs to be deactivated?
Just last minute thoughts... :
... .
The only way that I know of to reset the BIOS is to remove the motherboard's battery (and short the + and - with a small screwdriver or similar metal object). Because we're neither GPU nor Asus motherboard service techs and don't have access to the thing itself, potential scenarios and solutions like that rightfully run through our minds. Now both you and Seekerfinder have most recently given examples of alternatives that would potentially overcome Smicha's dilemma with getting the system to boot. Creative use of mind power - isn't it great.
Last edited by Tutor on Wed May 11, 2016 5:52 am, edited 1 time in total.
Because I have 180+ GPU processors in 16 tweaked/multiOS systems - Character limit prevents detailed stats.
- Tutor
- Posts: 531
- Joined: Tue Nov 20, 2012 2:57 pm
- Location: Suburb of Birmingham, AL - Home of the Birmingham Civil Rights Institute
Seekerfinder wrote:... . I feel for you on this one... .
Seeker
Seeker,
I've had people before say to me, "I feel for you ... ." But rarely did they precede it by live evidence/help/mental brain power as have you and Notiusweb. Smicha truly has many people who love him and they're scattered all over this tiny world. Love acts.
Last edited by Tutor on Wed May 11, 2016 5:56 am, edited 1 time in total.
Because I have 180+ GPU processors in 16 tweaked/multiOS systems - Character limit prevents detailed stats.