Hi, I'm not sure if this is specific to Cinema or to Octane in general, but we're finding that quite often a slave will crash at some point while rendering animations in the Picture Viewer, and the frame then never completes. As soon as the crashed slave is removed, the frame completes and the next one starts.
This creates problems when running overnight, as you can have a frame freeze at 2am and only find out in the morning that the time was wasted.
The slaves are a mix of 1080ti and 980ti in pairs (8 slaves, 16 cards total). We have tested as much as we can and cannot find a specific trigger. It's not memory: sometimes a 1080ti will fail while the 980tis are fine.
It seems to happen most often when a slave joins a job part way through, but sometimes a slave will fail that has been working since the beginning of the job and cause the same problem.
Removing and re-adding the slave works, so there's no consistency there either. All slaves are running the same drivers: 382.33 and Octane 3.06 stable.
As you can see, a slave crashed on the 5th frame, and the frame paused at 99.97% for 14 hours. When the slave was restarted everything ran fine again.
In this case, the error was CUDA 719, but this is not consistent - it is sometimes an error about "not receiving all information for a frame", or sometimes simply "slave crashed or was stopped (CTRL-C)".
I understand there are all sorts of variables that could affect rendering so I'm not looking for a specific fix, but rather:
Is there any way to automatically force a slave daemon to quit if it fails - so that it is removed from the pool and the frame will finish?
I'd rather that the job continued a little slower overnight than froze for hours at a time.
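In the meantime, something like the wrapper below is what I have in mind. To be clear, this is only a sketch of the idea, not anything Octane actually ships: the launch command, the error strings, and the restart behaviour are all assumptions you'd have to adapt to your own setup.

```python
# Hypothetical watchdog for a render slave. Octane has no built-in
# auto-quit (as far as I know), so a wrapper script could watch the
# slave's output and kill/restart the daemon when it logs a failure.
# SLAVE_CMD and ERROR_MARKERS below are assumptions, not real Octane
# flags or guaranteed log messages.
import subprocess
import time

SLAVE_CMD = ["octane.exe", "--slave"]   # assumed launch command
ERROR_MARKERS = (
    "cuda error 719",
    "slave crashed",
    "not receiving all information",
)

def looks_failed(line: str) -> bool:
    """Return True if a log line matches a known failure message."""
    lowered = line.lower()
    return any(marker in lowered for marker in ERROR_MARKERS)

def run_with_restart() -> None:
    """Run the slave, restarting it whenever its output reports a failure."""
    while True:
        proc = subprocess.Popen(
            SLAVE_CMD,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            text=True,
        )
        for line in proc.stdout:
            if looks_failed(line):
                proc.kill()   # force the daemon down so the master drops it
                break
        proc.wait()
        time.sleep(5)         # brief pause before rejoining the pool
```

The point is just that a hard kill removes the slave from the pool (which we know un-sticks the frame), and the restart lets it rejoin for the next frame.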
Thanks in advance,
James
Automatically close failed slave?
Hi guys,
unfortunately, I have tried several times to reproduce this issue without success.
If the Slave crashes, the Master continues to render here.
If the issue is not clearly reproducible, it is very difficult for the developers to find the culprit.
If you can find a scene that always behaves this way, please share it with us.
ciao beppe
Hi,
It's definitely not Cinema4D related. I have the same issue in LightWave network rendering through the Octane controller.
My topic is here, but sadly no answer/news from OTOY:
viewtopic.php?f=23&t=63777&p=325122#p325122
--
Lewis
http://www.ram-studio.hr
Skype - lewis3d
ICQ - 7128177
WS AMD TRPro 3955WX, 256GB RAM, Win10, 2 * RTX 4090, 1 * RTX 3090
RS1 i7 9800X, 64GB RAM, Win10, 3 * RTX 3090
RS2 i7 6850K, 64GB RAM, Win10, 2 * RTX 4090
I've managed to make a scene that uses just a bit too much memory for the 6GB cards, and this does reproduce the error - but not consistently.
Most of the time the slaves fail and the render continues, but sometimes - like in this screenshot - the render gets stuck.
You can see that:
The slaves don't show as failed (three of them did, because of a lack of memory)
Nothing shows in the log
The render is stuck - the remaining time has read 5 seconds for over 15 minutes
On the three slaves that failed, this is the error:
While I have managed to force this to happen, as mentioned previously it's not the same error every time. It doesn't seem to be related to any single issue, but occasionally the slave that crashes stops talking to the master, which then waits indefinitely for a result that's never coming.
What would be great is if there could be a time-out set on the master, so if it receives no result in a set amount of time - 2 minutes for example - it excludes the slave and carries on, or even a command-line instruction on the slave daemon to quit if it encounters an error. Is that possible?
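The kind of master-side timeout I mean could look roughly like this. Octane has no such setting today, so this is purely an illustration of the logic such a feature (or an external monitor parsing the master's log) could use; the class, names, and the 2-minute limit are all made up for the example.

```python
# Illustrative stall detector for a render master. Nothing here is a
# real Octane API; it just sketches the proposed behaviour: track when
# each slave last delivered a result, and flag any slave that has been
# silent longer than a limit so it can be excluded from the frame.
import time

STALL_LIMIT = 120  # seconds without a result before a slave is dropped

class SlaveTracker:
    """Track when each slave last delivered a result to the master."""

    def __init__(self):
        self.last_result = {}

    def record_result(self, slave_name, now=None):
        """Note that a slave just delivered a result."""
        self.last_result[slave_name] = time.monotonic() if now is None else now

    def stalled_slaves(self, now=None):
        """Return slaves that have been silent longer than STALL_LIMIT."""
        now = time.monotonic() if now is None else now
        return [name for name, t in self.last_result.items()
                if now - t > STALL_LIMIT]
```

The master would poll `stalled_slaves()` and simply stop waiting on anything it returns - exactly the "exclude the slave and carry on" behaviour, instead of freezing at 99% all night.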
Octane 3.07.R2 | Cinema 4D 18.057
Windows 7 64bit | 64GB RAM | 8x Geforce GTX 1080ti/8x Asus Strix GTX980ti
- rleuchovius
We have the same issue at our studio from time to time: the render stops at 99% on one frame and then gets stuck, halting the rest of the overnight render. Some sort of time-out function would be highly appreciated.
Just to let you know: I didn't have time yet, but will have an in-depth look into the reported problem next week. If there is any more information that might be relevant, feel free to add it to this thread. Thanks a lot.
In theory there is no difference between theory and practice. In practice there is. - Yogi Berra
James, I just noticed that you are using version 3.06. Could you (when you've got time) update to version 3.07 to see if the problem is gone. I don't think that the update will solve your problem, but it's worth a try. Thanks.
You are right, 3.07 will not fix this. I was using 3.07.1 (the LW version) but had the same issue: if one slave dies (for whatever reason), the rest just stop and wait until I hit continue.

Thanks
Same issue here with a slave crash. Awaiting a response, thanks.