Re: Network render frame will not finish

Posted: Thu Jul 24, 2014 7:04 am
by Karba
Rico_uk wrote:Amazing, that'd be great. Look forward to the next release, hopefully it will sort out those problems. I tend to find that when using the Octane frame buffer it is much more reliable (unsurprisingly) than the Max frame buffer, but there isn't a way to use that to render the final frames. Will it be possible soon to totally bypass the Max frame buffer and save sequences through the Octane render window inside of Max? The data that you see on the Octane frame buffer is really useful and it does seem much more stable.
The Max frame buffer is unreliable only because of a bug. We will fix it soon.

Re: Network render frame will not finish

Posted: Fri Jul 25, 2014 10:39 am
by abstrax
Rico_uk wrote:Hi Abstrax, thanks for the reply, hopefully you can help to resolve this.

- I've not checked the slave console screens when it freezes; I'll do that this evening when it next freezes (currently rendering a long sequence).
- Usually there are only 100 or so samples left to go when it gets stuck.
- Happens quite randomly, could be after 3 frames, could be after 500.
- I have not changed any of the sub-sampling numbers from their default.
- I haven't tried in standalone, I'm not too familiar with it.
- I am rendering to a minimum of 8,000 samples, but usually 16,000.

I would like to point out though that this has happened a couple of times even without network rendering, just using my local machine. It seems to happen less often, but I guess that with more GPUs over the network the likelihood of one of them messing something up and causing the error is magnified.
Hi Rico.

Please try the latest 2.04 release and let me know if the problem is still there. I suspect it is. I saw it happen once this morning, but couldn't reproduce it. From what I could learn from it, the slave had rendered all assigned samples but missed a tonemap and, as a consequence, didn't send a result with all rendered samples to the master, which then waits forever for the slave to finish. What I haven't figured out yet is how the slave could get into this state.

To understand the problem better, I enabled some logging which was available only in debug builds so far. To turn on the logging, please copy this file
octane_log_flags.txt
into the same directory as the octane_slave.exe. The next time the slave is started, it will do more extensive logging about network rendering like so:

[10:19:04.819]             Started logging on 25.07.14 22:19:04
[10:19:04.819]             
[10:19:04.819]             OctaneRender version 2.04 (2040000)
[10:19:04.819]             
[10:19:08.881]             Launching net render slave (2040000) with master 127.0.0.1:21000
[10:19:09.006] netRender : starting net render slave
[10:19:09.006] netRender : we have now 1 active devices
[10:19:55.741] netRender : network connection detected with ID 1
[10:19:55.741] netRender : sending slave version (2040000 / 64bit)
[10:19:55.772] netRender : sending slave info
[10:20:11.381] netRender : resetting slave
[10:20:11.928] netRender : received work assignment { clevel=1, subs=1, spp=0, tint=1 }
[10:20:12.131] netRender : received work assignment { clevel=1, subs=1, spp=2, tint=1 }
[10:20:12.131] netRender : informing render target of new work assignment { clevel=1, subs=1, spp=2, tint=1 }
[10:20:13.475] netRender : queueing render result { spp=2.0, subs=1, clevel=1 }
[10:20:13.506] netRender : received work assignment { clevel=1, subs=1, spp=6, tint=2 }
[10:20:13.506] netRender : informing render target of new work assignment { clevel=1, subs=1, spp=6, tint=2 }
[10:20:15.397] netRender : queueing render result { spp=6.0, subs=1, clevel=1 }
....
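
A log like this can be checked for the stuck state described above by comparing the last work assignment with the last queued result. This is only a rough sketch: the regular expressions are derived from the sample lines shown here, the function names are made up for illustration, and treating "last queued spp is lower than last assigned spp" as a sign of a hang is an assumption, not confirmed behaviour of the slave.

```python
import re

# Patterns follow the sample log lines above; the real format may differ between builds.
ASSIGN_RE = re.compile(r"netRender : received work assignment \{.*?\bspp=(\d+)")
RESULT_RE = re.compile(r"netRender : queueing render result \{.*?\bspp=([\d.]+)")


def last_assigned_and_queued(lines):
    """Return (last assigned spp, last queued spp); either may be None."""
    assigned = None
    queued = None
    for line in lines:
        match = ASSIGN_RE.search(line)
        if match:
            assigned = float(match.group(1))
            continue
        match = RESULT_RE.search(line)
        if match:
            queued = float(match.group(1))
    return assigned, queued


def looks_stuck(lines):
    """Assumption: a slave whose last queued result lags behind its last
    assignment may be the one the master is still waiting for."""
    assigned, queued = last_assigned_and_queued(lines)
    if assigned is None:
        return False
    return queued is None or queued < assigned
```

To use it, read the saved log with something like `looks_stuck(open("octane_log.txt").read().splitlines())`.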
The slave will also log into a file octane_log.txt if the slave process has write permission for this folder, which is usually not the case if the slave is installed in the Windows program files folder. To make logging into a file work, you have to move or copy the whole OctaneRender folder out of the program files directory and install the daemon from there. After using the slave and producing the octane_log.txt for the error case, please make a copy of it (it will be overwritten the next time the slave starts) and send it to me via PM.
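
Since the log is overwritten on the next slave start, a small helper can save a timestamped copy right after a freeze. This is a minimal sketch; the function name and the backup folder are placeholders, and only the file name octane_log.txt comes from the post.

```python
import shutil
import time
from pathlib import Path


def backup_log(log_path, backup_dir):
    """Copy the log into backup_dir under a timestamped name and return the new path."""
    src = Path(log_path)
    dest_dir = Path(backup_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d_%H%M%S")
    dest = dest_dir / f"{src.stem}_{stamp}{src.suffix}"
    shutil.copy2(src, dest)  # copy2 also keeps the file's modification time
    return dest
```

For example, `backup_log("octane_log.txt", "log_backups")` run from the slave's folder would leave the original in place and write the copy into a `log_backups` subfolder.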

Thanks a lot in advance for helping us solve this problem.

Cheers,
Marcus

Re: Network render frame will not finish

Posted: Fri Jul 25, 2014 2:19 pm
by coilbook
Hi Rico
You said "I have not changed any of the sub-sampling numbers from their default." What are they for? Are they only for the Octane preview window, to speed it up?
Thanks

Re: Network render frame will not finish

Posted: Sun Jul 27, 2014 7:52 am
by Rico_uk
Great, thank you. I'll set all of this up tomorrow and send the log file as soon as it freezes. Will this also help with the issue of the first frame rendering too dark / too quickly? I guess I need to add the text file to both of my slave machines, and not to the master? Also, I guess I need to cancel the render when it has become stuck, otherwise it will never finish. Once it is cancelled, I hope the log file still stores the relevant info?

Re: Network render frame will not finish

Posted: Mon Jul 28, 2014 9:02 am
by Rico_uk
Abstrax, just sent the PM with the log file.

Re: Network render frame will not finish

Posted: Mon Jul 28, 2014 10:31 am
by Augustronic
I have exactly the same problem here.
I will collect render logs too.
See you later...

Re: Network render frame will not finish

Posted: Mon Jul 28, 2014 10:33 am
by Augustronic
P.S.: Killing a slave finished the frame.
I relaunched the slave after saving and the render continued...

Re: Network render frame will not finish

Posted: Mon Jul 28, 2014 11:08 am
by abstrax
Rico_uk wrote:Abstrax, just sent the PM with the log file.
Thank you very much for the log. Interestingly, the slave rendered all samples it was asked to render, so the problem doesn't seem to be on the slave side.

How many slaves are you using? What was the kernel you were using and can you remember how many samples were missing?

The next step would be to log both the slave and the master. The procedure is the same as before, but now you would also place octane_log_flags.txt into the directory containing octane.dll. The same write permission problem applies: can you run 3ds Max as administrator to make sure that the plugin can write octane_log.txt?
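
Whether a process can write into a given folder can be tested up front by trying to create a temporary file there. This is a generic sketch with a made-up function name, not something Octane provides. It must be run with the same rights as the process that writes the log, and the directory to check is whatever folder holds octane.dll or octane_slave.exe on your machine.

```python
import tempfile


def can_write_to(directory):
    """Return True if a file can actually be created in directory."""
    try:
        # The temporary file is removed again as soon as the block exits.
        with tempfile.TemporaryFile(dir=directory):
            pass
        return True
    except OSError:
        # Covers missing folders and permission errors alike.
        return False
```

If this returns False for the plugin folder, octane_log.txt will most likely not be written there either.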

Anyway, thank you very much for your help so far.

Re: Network render frame will not finish

Posted: Mon Jul 28, 2014 11:49 am
by Augustronic
Rendering with slaves version 2.04 doesn't work for me.
Probably as I still don't have a 2.04 plugin for Cinema 4D.

Re: Network render frame will not finish

Posted: Mon Jul 28, 2014 11:52 am
by abstrax
Augustronic wrote:Rendering with slaves version 2.04 doesn't work for me.
Probably as I still don't have a 2.04 plugin for Cinema 4D.
That's correct. Master and slave version must match exactly.
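
One way to check this from the logs is to compare the build number in the version line of each log. The sketch below assumes that the master's log contains the same "OctaneRender version ... (build)" line as the slave log shown earlier in this thread, which has not been confirmed; the function names are only for illustration.

```python
import re

# Matches e.g. "OctaneRender version 2.04 (2040000)" and captures the build number.
VERSION_RE = re.compile(r"OctaneRender version \S+ \((\d+)\)")


def build_number(lines):
    """Return the first build number found in the log lines, or None."""
    for line in lines:
        match = VERSION_RE.search(line)
        if match:
            return int(match.group(1))
    return None


def versions_match(master_lines, slave_lines):
    """True only if both logs report a build number and the numbers are equal."""
    master = build_number(master_lines)
    slave = build_number(slave_lines)
    return master is not None and master == slave
```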