Re: OctaneRender® for Maya® beta 3.03c [CURRENT]
Posted: Sat Nov 03, 2012 7:12 pm
by jmfowler
T - yeah, it worked: rendered without a problem overnight. Any particular reason why this happens? Overheating card?
Now I have to work out the maximum card speed I can get without a freeze-up... I took the 580 from 772 MHz down to 750 MHz.
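One way to narrow down the highest clock that does not freeze is to bisect between a speed known to be stable and one known to fail. The following is only a minimal sketch under two assumptions: stability is monotonic (any clock at or below a stable one is also stable), and `is_stable` is a hypothetical callback you supply yourself, for example "set the clock in your tuning tool, run an overnight render, report whether it finished". Nothing in this code talks to the card.

```python
from typing import Callable


def highest_stable_clock(
    known_good_mhz: int,
    known_bad_mhz: int,
    is_stable: Callable[[int], bool],
    step_mhz: int = 1,
) -> int:
    """Bisect between a clock known to be stable and one known to freeze.

    Assumes stability is monotonic. `is_stable` is a hypothetical,
    user-supplied test (e.g. a long render at that clock).
    """
    if known_good_mhz >= known_bad_mhz:
        raise ValueError("known_good_mhz must be below known_bad_mhz")
    low, high = known_good_mhz, known_bad_mhz  # low is stable, high is not
    while high - low > step_mhz:
        mid = (low + high) // 2
        if is_stable(mid):
            low = mid   # mid survived, so search above it
        else:
            high = mid  # mid froze, so search below it
    return low
```

With the numbers from this post, `highest_stable_clock(750, 772, is_stable)` would need only four or five test renders instead of one per MHz.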
Re: OctaneRender® for Maya® beta 3.03c [CURRENT]
Posted: Sun Nov 04, 2012 3:52 am
by TBFX
jmfowler wrote:any particular reason why this happens? overheating card?
I don't think it's heat related, as it's my card number 3 that gives me this issue. It sits second to lowest in my case and never gets anywhere near max temp, even when I have the fans on auto. It's probably just a chip that's not quite up to spec. All chips will fail eventually as you overclock them, and all at different rates, so although the factory settings should be safe, the chips are all going to be a little different in their abilities. With Octane we hit them pretty hard, certainly a lot harder than any game would. I took my troublesome card down to a core clock speed of 760 MHz, but as I said, each chip will most likely be different.
T.
Re: OctaneRender® for Maya® beta 3.03c [CURRENT]
Posted: Sun Nov 04, 2012 10:41 pm
by jmfowler
Well, 760 MHz on my 580 is working well, so I'll do more tests later in the week to see if I can squeeze out a bit more.
Jimstar - Do you think it would be possible for the Octane programmers to detect a failed GPU and have the renderer switch all work off that GPU to the others (if there are some), AND also print a message in the verbose output stating that "GPU # has failed"? I lost a few overnight renders of an animation to this problem, and it would have been better for Octane to 'self heal'. Otoy wants to make a render farm for Octane, but they will need something like this if they don't want their farm to fail too easily...
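To make the request concrete, here is a rough sketch of the "self heal" behaviour in plain Python. It is not Octane's API and says nothing about how Octane schedules work internally: `render_frame` and `GpuFailed` are hypothetical stand-ins, and the sketch simplifies things by handing out whole frames to GPUs in turn.

```python
import logging
from collections import deque
from typing import Callable, Dict, Iterable, List

log = logging.getLogger("render")


class GpuFailed(Exception):
    """Raised by the (hypothetical) render callback when a device drops out."""


def render_with_failover(
    frames: Iterable[int],
    gpus: List[int],
    render_frame: Callable[[int, int], None],
) -> Dict[int, int]:
    """Render every frame, moving work off any GPU that fails.

    Returns a mapping of frame number -> GPU that finished it.
    """
    pending = deque(frames)
    healthy = list(gpus)
    done: Dict[int, int] = {}
    turn = 0
    while pending:
        if not healthy:
            raise RuntimeError("all GPUs have failed")
        gpu = healthy[turn % len(healthy)]
        frame = pending.popleft()
        try:
            render_frame(frame, gpu)
        except GpuFailed:
            log.warning("GPU %d has failed", gpu)
            healthy.remove(gpu)        # stop scheduling on the failed card
            pending.appendleft(frame)  # retry the same frame elsewhere
            continue
        done[frame] = gpu
        turn += 1
    return done
```

The point of the design is that a failed card costs one retried frame and one log line rather than the rest of the overnight job.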
Jimstar - If TBFX and I have both had this problem, I think it will happen to others with multiple-card systems. Has it happened to you?
cheers,
JF
Re: OctaneRender® for Maya® beta 3.03c [CURRENT]
Posted: Mon Nov 05, 2012 2:24 am
by p3taoctane
I am having the same problem of a card dropping out in the middle of a render. I have heard that power issues and clock speed issues can cause this, and that addressing them is a way to fix it... but the question I have always had is:
Is there a way to correlate card numbers with actual slots in your machine? For example, if card three always fails, does that always correlate to slot number X? In other words, when Octane loads up, does it enumerate the cards sequentially from slot 1 to slot 4 (as an example), and always in that order? (Same for the EVGA monitor: when I get a failed card, in EVGA's numbering scheme it is not always card 3, for example, but it is always only one card that fails.)
I am trying to trace which card it is without opening up the case and going through a process of elimination.
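One software-only starting point is to list each GPU's driver index next to its PCI bus ID, which is tied to the physical slot rather than to any tool's numbering. The sketch below assumes an NVIDIA driver whose `nvidia-smi` supports the `--query-gpu` option (older drivers may not) and that `nvidia-smi` is on the PATH. Which bus ID belongs to which physical slot still has to be looked up in the motherboard manual, and this does not show how Octane itself numbers the cards.

```python
import subprocess
from typing import List, Tuple

QUERY = ["nvidia-smi", "--query-gpu=index,pci.bus_id,name", "--format=csv,noheader"]


def parse_gpu_rows(csv_text: str) -> List[Tuple[int, str, str]]:
    """Turn 'index, bus_id, name' lines into tuples sorted by PCI bus ID."""
    rows = []
    for line in csv_text.splitlines():
        if not line.strip():
            continue
        index, bus_id, name = [field.strip() for field in line.split(",", 2)]
        rows.append((int(index), bus_id, name))
    return sorted(rows, key=lambda row: row[1])


def list_gpus() -> List[Tuple[int, str, str]]:
    """Run nvidia-smi and return the parsed rows (requires the NVIDIA driver)."""
    output = subprocess.run(QUERY, capture_output=True, text=True, check=True).stdout
    return parse_gpu_rows(output)
```

Calling `list_gpus()` on the problem box and noting which bus ID disappears or errors after a failure would at least tell you whether it is really a different physical card each time.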
Thanks.
Re: OctaneRender® for Maya® beta 3.03c [CURRENT]
Posted: Mon Nov 05, 2012 3:13 am
by TBFX
p3taoctane wrote:Is there a way to correlate card numbers with actual slots in your machine. For example if card three always fails does that always correlate to slot number X? In other words when octane loads up does it sequentially load up from slot one to 4 (as an example) and always do so in that order. (Same for evga monitor... I will notice I get a failed card and in EVGA's numbering scheme it is not always card 3 as an example... but it is always only one that fails)
I'm fairly certain that in my system EVGA Precision always numbers my cards 1 through 4 from top to bottom. The relative temperatures the cards run at and the fan speeds when on automatic certainly indicate this. I also just did a quick check by turning each card's fan to full in turn and checking the airflow out the back, so I believe that, in my system anyway, EVGA Precision always numbers them this way.
Octane for Maya is another story though, as its numbering is not consistent with EVGA's. I believe for me the mapping is as follows:
EVGA    OctaneForMaya
1       1
2       3
3       2
4       4
So if I set GPUs 2 and 4 to use for rendering in Maya, it will use 3 and 4 in EVGA. However, I have another issue: once I save the scene and open it again, it will use some other combination of GPUs, even though the settings under CUDA devices in the render globals still show 2 and 4 as expected. (I know it's doing this because it often picks GPU 1 as one to render on, so I get interface lag while rendering, and of course EVGA Precision tells me.) If I uncheck and re-check all the GPUs, they come right again.
I did mention this way back in one of the first beta test version threads, but JimStar couldn't reproduce it on his system and nobody else has ever reported it, so I figured it was some quirk of my system.
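For anyone who wants to keep track of a mapping like this, a small lookup table saves translating in your head. The numbers below are only the ones reported for this one 4-card system and will likely differ on other machines; the function name is made up for illustration.

```python
# Mapping as reported for one particular 4-card system in this thread.
# It is not a general rule; check your own machine before relying on it.
EVGA_TO_OCTANE = {1: 1, 2: 3, 3: 2, 4: 4}
OCTANE_TO_EVGA = {octane: evga for evga, octane in EVGA_TO_OCTANE.items()}


def evga_cards_for(octane_selection):
    """Return the EVGA Precision numbers for GPUs ticked in Octane for Maya."""
    return sorted(OCTANE_TO_EVGA[gpu] for gpu in octane_selection)
```

With this table, `evga_cards_for([2, 4])` gives `[3, 4]`, matching the example of ticking GPUs 2 and 4 in Maya and seeing cards 3 and 4 load up in EVGA.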
T.
Re: OctaneRender® for Maya® beta 3.03c [CURRENT]
Posted: Mon Nov 05, 2012 4:11 am
by p3taoctane
Thanks mate. This is really helpful.
I'll check this out tomorrow, and if I get anything different I'll let you know.
Peter
Re: OctaneRender® for Maya® beta 3.03c [CURRENT]
Posted: Mon Nov 05, 2012 10:28 pm
by jmfowler
Yeah - that's how I check which card is which too: switch one of your fans at a time to 90% and feel where the air is coming from.
It's a shame that Otoy will probably never build some code into their renderer that would automatically scale down the clock speed of a failed GPU. I suppose it could make them liable for any perceived damages in the event of a meltdown, but wouldn't it be great if they did? Simply allow the clock speed to be lowered from default (but not increased above it, i.e. overclocked).
Jimstar has water-cooled cards which probably don't get above 60 degrees, so I do wonder if it's heat related: my first frame would render, but then my second or third would fail, leaving me to wonder if enough heat had built up by that point to affect a chip on the failing card. Lowering the clock speed would lower the temp slightly, but this is pure guesswork. Also of note: my 580 and one 680 had been playing happily together, and the problem only started occurring when I added another 680 (so the 580 was sandwiched in)...
Re: OctaneRender® for Maya® beta 3.03c [CURRENT]
Posted: Mon Nov 05, 2012 10:41 pm
by p3taoctane
The thing that is most frustrating for me is that it is a different card each time.
It also happens at random times.
I thought it was when I opened up Photoshop... nope, it happens when nothing else has changed.
I plugged the three cords going into the power supply unit into separate wall sockets so that no one socket was drawing all three loads.
Still the same thing.
I checked the temps and they are all consistently around 80 degrees, which (although a little high) is not causing other similar 580 cards in a different box to fail.
Scratching my head on this one...
Re: OctaneRender® for Maya® beta 3.03c [CURRENT]
Posted: Tue Nov 06, 2012 12:03 am
by jmfowler
80 is a bit high for me. I have adjusted my auto fan curve in EVGA Precision so that the maintained temp on the warmest card stays at 70 degrees Celsius.
Despite the fact that graphics cards are designed to be able to go to high temps, many people on the net seem to agree that maintaining high temps for long periods of time is not good for component longevity. I have no professional experience with this and am only going off other people's opinions, but many of them seem to spend a lot of time investigating this stuff.
I would try and get your temps down to 70 degrees.
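For anyone unfamiliar with fan curves: the idea is a handful of (temperature, fan speed) points with straight lines between them, set so the fan reaches full speed around the temperature you want to hold. The sketch below only illustrates that interpolation. The curve points are made up for the example and are not my actual settings, and this is not how EVGA Precision is scripted (the curve there is set in its GUI).

```python
from bisect import bisect_right
from typing import List, Tuple

# Hypothetical curve points (temperature in C, fan duty in %), illustration only.
FAN_CURVE: List[Tuple[float, float]] = [(30, 30), (50, 45), (65, 70), (70, 100)]


def fan_duty(temp_c: float, curve: List[Tuple[float, float]] = FAN_CURVE) -> float:
    """Linearly interpolate fan duty between curve points, clamping at the ends."""
    temps = [t for t, _ in curve]
    if temp_c <= temps[0]:
        return curve[0][1]
    if temp_c >= temps[-1]:
        return curve[-1][1]
    i = bisect_right(temps, temp_c)  # first curve point hotter than temp_c
    (t0, d0), (t1, d1) = curve[i - 1], curve[i]
    return d0 + (d1 - d0) * (temp_c - t0) / (t1 - t0)
```

A steep last segment like the one from 65 to 70 degrees is what makes the card settle near the target instead of drifting up to 80.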
Are they the same brand and spec cards in the problem box?
Motherboard problem? A different case from the other computer? I ask because of airflow issues: my Cooler Master HAF case is way better for airflow to GPUs than my NZXT Phantom.
Other opinions on this?
Something worth mentioning: my box with only the 680s has not failed yet, rendering an animation non-stop for 2 weeks with the cards at 70-72 degrees. My box with 2x 680s and a 580 had a repeated problem for 2 days (only once I added a 2nd 680... weird) that turned out to be the 580. I lowered its clock speed and voila.
One interesting thing I've noticed is that the 580 seems to be very susceptible to picking up heat on its back plate. When the 680 next to its back plate was not rendering, the 580 ran 10 degrees cooler than normal. Interesting, because the 680s don't seem to pick up as much heat that way.
Re: OctaneRender® for Maya® beta 3.03c [CURRENT]
Posted: Tue Nov 06, 2012 1:44 am
by p3taoctane
I'll work on getting them down in temp. Thanks.
Yeah, all the same card manufacturer.
The other box has the same cards and roughly the same temps etc. and does not crash out.
Older Tyan card though. Will have to see if that is the difference.
Have you ever changed the clock speed... and does it alter performance much?