
Anything being done about 700 error

Posted: Thu Jan 12, 2017 12:30 am
by mikeadamwood
I'm in no better place than when I last posted, and there are many threads about this same issue, or similar ones, with Pascal cards. If I work in the Live Viewer I get this error after a bit; if I render to the Picture Viewer it freezes partway in. It's unusable.

-> failed to deallocate device memory
CUDA error 700 on device 0: an illegal memory access was encountered


- I've rolled back drivers
- reinstalled different versions of the plugin and standalone
- tried in Cinema 4D R17 and R18

I always run into the same issue.

I've also tried to reproduce the problem in standalone, but couldn't.
I also grabbed the Octane for 3ds Max demo and spent hours trying to reproduce it there, but couldn't. Both seem to run perfectly on my machine with materials, high-res textures and complex scenes; the problem seems to be only with the C4D plugin.


I also sent my machine back to the builders who stress tested each component and it flew through all the tests with no hiccups.

Aoktar, do you even know about my correspondence with support, and is anything being done here?

Re: Anything being done about 700 error

Posted: Thu Jan 12, 2017 12:35 am
by aoktar
My words will not change since my latest post on your other thread. Test it with the demo version of the C4D plugin.

Re: Anything being done about 700 error

Posted: Thu Jan 12, 2017 1:03 am
by mikeadamwood
But my other thread went cold after I posted the log... I will try with the demo.

Are there a lot of code differences between the Octane VR plugin and the full license?

Re: Anything being done about 700 error

Posted: Thu Jan 12, 2017 1:30 am
by aoktar
For the plugin, there's not one line of difference. All changes come from the SDK.

Re: Anything being done about 700 error

Posted: Thu Jan 12, 2017 2:17 am
by abstrax
I have basically spent the last month investigating these kinds of issues. Unfortunately I couldn't find any actual culprits, but I observed various problems with the CUDA 7 / 7.5 and 8 compilers, usually resulting in inexplicable kernel crashes which, when debugged, didn't make any sense, i.e. they look like compiler problems. I think / hope I have found workarounds for all of these.

I also noticed that most of the CUDA error reports come from users with Pascal hardware and I wonder if there is a general problem with that, but it's really hard to tell. I haven't observed any of the reported sporadic problems with our Pascal hardware yet.

Another thing I have noticed is that the CUDA errors often happen on scenes with very big geometry. In those cases the ray tracing will access the geometry data in a different way than usual. Unfortunately this access method is very unforgiving for invalid memory access. So there may be a chance that there is a problem somewhere in our geometry data structures, but I couldn't find anything yet.

So the plan going forward is to release a test build of 3.06 with new features and various changes to address the problems above. This build will also be compiled completely in CUDA 8. Hopefully this will improve things; if not, I will keep digging.

Re: Anything being done about 700 error

Posted: Thu Jan 12, 2017 12:12 pm
by mikeadamwood
Thanks abstrax! Your information means a lot and it's great to know something is being done.

My workplace has also moved to an Octane machine because of my preaching, but unfortunately they are having the same issue; the only difference is that they have 2 Titan Pascal cards.

The scenes I'm working with are pretty simple to be honest, I come from a game modelling background so I'm pretty careful when it comes to polycounts.

I have noticed better results when rendering at 720p to the Picture Viewer; my render failed before at 1080p, but it seems a bit more stable at the lower resolution.

If there's any information I can provide please let me know, I have lots of crash logs and images I can provide if needed. When should we expect the update?

Thanks.

Re: Anything being done about 700 error

Posted: Thu Jan 12, 2017 10:50 pm
by abstrax
Hmm, the fact that it seems to be resolution dependent makes me wonder if it's a problem with tone mapping, which needs to transfer the film from system memory to the GPU and back; this is done via pinning of memory. Could you check whether disabling/enabling tone mapping for one of the devices makes a difference? If you are using post-effects like bloom or glare, does it make a difference if you disable them?

Could you also try different graphics driver versions to see if the problem goes away?
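
As a rough illustration of why resolution matters for the film transfer mentioned above, here is a small sketch of how the buffer size scales between the two resolutions discussed in this thread. The per-pixel layout (four 32-bit float channels) is purely an assumption for the arithmetic; Octane's actual film format is not stated here. The ratio between the two sizes does not depend on that assumption.

```python
def film_bytes(width, height, channels=4, bytes_per_channel=4):
    """Size in bytes of one film buffer.

    channels and bytes_per_channel are assumed values (RGBA, float32),
    not Octane's documented film layout.
    """
    return width * height * channels * bytes_per_channel


hd720 = film_bytes(1280, 720)
hd1080 = film_bytes(1920, 1080)

print(f"720p:  {hd720 / 2**20:.1f} MiB per transfer")
print(f"1080p: {hd1080 / 2**20:.1f} MiB per transfer")
print(f"1080p moves {hd1080 / hd720:.2f}x the data of 720p")
```

Whatever the real layout is, a 1080p film holds 2.25 times as many pixels as a 720p one, so every transfer to and from the GPU touches proportionally more pinned memory.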

Re: Anything being done about 700 error

Posted: Thu Jan 12, 2017 11:21 pm
by mikeadamwood
I've tried that and the problem persists. I've even installed R17 to see if it works and I got to frame 3 of an animation and it froze, I had to force quit Cinema. I was unable to get a log in this case.

We have a deadline in 2 weeks at work, and the company invested £6000 in the 2x Titan X Pascal build to render the animation. At the moment we are pretty screwed.

Re: Anything being done about 700 error

Posted: Thu Jan 12, 2017 11:47 pm
by aoktar
Try decreasing the clock frequency of the GPUs. What are your GPUs? Brand/model.
Try it on VR 3.04.5 or the demo versions (3.05.3 and 3.04.5). Try to render at lower samples and lower resolution.
Finally, try to export the whole timeline, or part of it, as ORBX and render it in standalone.

Re: Anything being done about 700 error

Posted: Fri Jan 13, 2017 12:59 pm
by mikeadamwood
I have 2x EVGA GTX 1080 Hybrid water-cooled cards.

I tried exporting an animated ORBX from Cinema to standalone but got this error:
IMG_0673.PNG
It still opened in standalone, however; I just can't for the life of me work out how to render the animation.

It seems that every error I get relates to memory in some way or another.
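
One way to check that impression across a pile of crash logs is to tally the CUDA error lines they contain. The sketch below assumes the log lines look like the one quoted at the top of this thread; the pattern is modeled on that single line, and `tally_cuda_errors` is a hypothetical helper, not part of Octane or any plugin.

```python
import re
from collections import Counter

# Pattern modeled on the one log line quoted in this thread (assumption:
# other Octane log formats may differ and would not be matched).
CUDA_ERROR_RE = re.compile(
    r"CUDA error (?P<code>\d+) on device (?P<device>\d+): (?P<message>.+)"
)


def tally_cuda_errors(lines):
    """Count (code, device, message) triples found in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        match = CUDA_ERROR_RE.search(line)
        if match:
            key = (
                int(match["code"]),
                int(match["device"]),
                match["message"].strip(),
            )
            counts[key] += 1
    return counts


# The two lines from the first post, used here as sample input.
sample_log = [
    "-> failed to deallocate device memory",
    "CUDA error 700 on device 0: an illegal memory access was encountered",
]

for (code, device, message), n in tally_cuda_errors(sample_log).items():
    print(f"error {code} on device {device} x{n}: {message}")
```

To run it over real logs, pass an open file object instead of `sample_log`; lines that do not match the pattern are simply skipped.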