CUDA 719 Issues !
Moderator: juanjgon
Yesterday, I finally found and fixed a long-standing issue with the displacement intersection test, which can sometimes end up in an infinite loop, causing a kernel to never finish. The OS will then eventually pull the plug, as you have reported above. The fix will be available with the next release. Obviously, this will only help in scenes with displacement mapping. If you experience similar issues in scenes without displacement mapping, please send them to either Juanjo or to us so we can investigate. Thank you.
In theory there is no difference between theory and practice. In practice there is. - Yogi Berra
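Editor's illustration of the failure mode described above: an iterative intersection search whose exit condition is never met will spin forever unless the loop is capped. This is a minimal sketch only, not Octane's kernel code (which is not shown in this thread); the function `march_heightfield`, its parameters and the fixed-step marching scheme are all hypothetical.

```python
def march_heightfield(origin, direction, height, step=0.01, max_steps=10_000):
    """March a 2D ray in fixed steps until it drops to or below height(x).

    Returns the ray parameter t of the first sample at or below the surface,
    or None if no hit is found within max_steps (hypothetical helper).
    """
    ox, oy = origin
    dx, dy = direction
    t = 0.0
    # Hard iteration cap: even if the exit test below never becomes true,
    # the loop is guaranteed to terminate instead of hanging.
    for _ in range(max_steps):
        x, y = ox + t * dx, oy + t * dy
        if y <= height(x):
            return t
        t += step
    return None


flat_ground = lambda x: 0.0
hit = march_heightfield((0.0, 1.0), (0.0, -1.0), flat_ground)   # ray pointing down
miss = march_heightfield((0.0, 1.0), (0.0, 1.0), flat_ground)   # ray pointing up
```

Without the cap, the second call would never return, which is the situation in which the OS ends up stopping the kernel.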
Thanks Abstrax. The scene I was testing fails every time on various machines, so once the new build/beta is ready for LW I'll be able to let you know whether the fix helps.

--
Lewis
http://www.ram-studio.hr
Skype - lewis3d
ICQ - 7128177
WS AMD TRPro 3955WX, 256GB RAM, Win10, 2 * RTX 4090, 1 * RTX 3090
RS1 i7 9800X, 64GB RAM, Win10, 3 * RTX 3090
RS2 i7 6850K, 64GB RAM, Win10, 2 * RTX 4090
Hi!
I know I don't have the newest build yet, but now I wonder whether it is even going to help, because this time I had no displacements and it failed again. It is a completely new scene: just 11 objects, no displacement, only a few grass patches, dandelions and tree/leaf objects, a sun system and a 360-frame animation.
After 19 successful frames (80-90 sec/frame on 3 GPUs) it gave up, this time with CUDA error 700:
00:29:34 (1774.26) * OCTANE API MSG: CUDA error 700 on device 2: an illegal memory access was encountered -> failed to deallocate device memory
00:29:34 (1774.26) * OCTANE API MSG: CUDA error 700 on device 2: an illegal memory access was encountered
00:29:34 (1774.26) * OCTANE API MSG: CUDA error 700 on device 2: an illegal memory access was encountered -> could not get memory info
00:29:34 (1774.26) * OCTANE API MSG: CUDA error 700 on device 2: an illegal memory access was encountered
00:29:34 (1774.26) * OCTANE API MSG: CUDA error 700 on device 2: an illegal memory access was encountered -> failed to unload module
00:29:49 (1789.99) * OCTANE API MSG: CUDA error 700 on device 3: an illegal memory access was encountered
00:29:49 (1789.99) * OCTANE API MSG: CUDA error 700 on device 3: an illegal memory access was encountered -> failed to bind device to current thread
00:29:49 (1789.99) * OCTANE API MSG: device 3: failed to initialize resources
00:29:49 (1789.99) * OCTANE API MSG: device 3: failed to recover - giving up and switching device to failed
00:29:49 (1789.99) |
00:29:49 (1789.99) | ++++++++++++++++++++++
00:29:49 (1789.99) | +++ RENDER FAILURE +++ processing the failure callback
00:29:49 (1789.99) | ++++++++++++++++++++++
00:29:49 (1789.99) |
00:29:49 (1789.99) |
00:29:49 (1789.99) | Aborting rendering detected
00:29:49 (1789.99) | >>> Refresh preview window
00:29:49 (1789.99) | >>> Draw preview window information
00:29:49 (1789.99) | >>> Draw preview window status
00:29:49 (1789.99) | >>> Draw preview window progress bar
00:29:49 (1789.99) | >>> Draw preview image
00:29:49 (1789.99) | >>> Draw preview image finished. Releaseing mutex
00:29:49 (1789.99) | >>> Refresh preview window done
00:29:49 (1790.00) | ... Finish Rendering ...
00:29:49 (1790.00) |
00:29:49 (1790.00) | ... wait for the getPreviewImage function ...
00:29:50 (1790.03) * OCTANE API MSG: CUDA error 700 on device 1: an illegal memory access was encountered
00:29:50 (1790.03) * OCTANE API MSG: CUDA error 700 on device 1: an illegal memory access was encountered -> failed to bind device to current thread
00:29:50 (1790.03) * OCTANE API MSG: device 1: failed to initialize resources
00:29:50 (1790.03) * OCTANE API MSG: device 1: failed to recover - giving up and switching device to failed
00:29:50 (1790.04) * OCTANE API MSG: CUDA error 700 on device 2: an illegal memory access was encountered
00:29:50 (1790.04) * OCTANE API MSG: CUDA error 700 on device 2: an illegal memory access was encountered -> failed to bind device to current thread
00:29:50 (1790.04) * OCTANE API MSG: device 2: failed to initialize resources
00:29:50 (1790.04) * OCTANE API MSG: device 2: failed to recover - giving up and switching device to failed
00:29:50 (1790.10) | ... reset image callback ...
00:29:50 (1790.10) | Close and free scene, free buffer: 1, reset scene: 1
00:29:50 (1790.10) | ... freeBuffers
00:29:50 (1790.10) | ... setRenderTargetNode(NULL)
00:29:50 (1790.14) | ... update()
00:29:50 (1790.14) | ... getRootNodeGraph()->clear()
00:29:50 (1790.23) | >>> Refresh preview window
00:29:50 (1790.23) | >>> Draw preview window progress bar
00:29:50 (1790.23) | >>> Refresh preview window done
00:29:50 (1790.23) | Scene closed
00:29:50 (1790.23) |
00:29:50 (1790.23) | Close preview window
00:29:50 (1790.23) |
00:29:50 (1790.23) | Octane Render for Lightwave, end of log system
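Editor's sketch for reading a log like the one above: a small Python script that reports, per device, when the first CUDA error appeared and which devices were finally switched to failed. The line format is assumed from this excerpt only, and the names `summarize`, `CUDA_ERR` and `FAILED` are hypothetical, not part of Octane or its plugin.

```python
import re

# Assumed line shape, taken from the excerpt above:
# "00:29:34 (1774.26) * OCTANE API MSG: CUDA error 700 on device 2: <message>"
CUDA_ERR = re.compile(
    r"^\d{2}:\d{2}:\d{2} \((?P<secs>\d+\.\d+)\) \* OCTANE API MSG: "
    r"CUDA error (?P<code>\d+) on device (?P<dev>\d+): "
)
FAILED = re.compile(r"OCTANE API MSG: device (?P<dev>\d+): failed to recover")


def summarize(lines):
    """Return ({device: (seconds, code) of first error}, [devices marked failed])."""
    first_error = {}  # dicts keep insertion order, so devices appear in error order
    failed = []
    for line in lines:
        line = line.strip()
        m = CUDA_ERR.match(line)
        if m:
            dev = int(m["dev"])
            first_error.setdefault(dev, (float(m["secs"]), int(m["code"])))
            continue
        m = FAILED.search(line)
        if m:
            failed.append(int(m["dev"]))
    return first_error, failed


SAMPLE = [
    "00:29:34 (1774.26) * OCTANE API MSG: CUDA error 700 on device 2: an illegal memory access was encountered -> failed to deallocate device memory",
    "00:29:49 (1789.99) * OCTANE API MSG: CUDA error 700 on device 3: an illegal memory access was encountered",
    "00:29:49 (1789.99) * OCTANE API MSG: device 3: failed to recover - giving up and switching device to failed",
    "00:29:50 (1790.03) * OCTANE API MSG: CUDA error 700 on device 1: an illegal memory access was encountered",
    "00:29:50 (1790.03) * OCTANE API MSG: device 1: failed to recover - giving up and switching device to failed",
    "00:29:50 (1790.04) * OCTANE API MSG: device 2: failed to recover - giving up and switching device to failed",
]

first_error, failed = summarize(SAMPLE)
for dev, (secs, code) in first_error.items():
    print(f"device {dev}: first CUDA error {code} at {secs:.2f} s")
print("devices switched to failed:", failed)
```

For a full log file, `summarize(open("octane.log", encoding="utf-8", errors="replace"))` should work the same way. In the excerpt above, device 2 reports the first error at 1774.26 s, roughly 15 seconds before devices 3 and 1.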
--
Lewis
http://www.ram-studio.hr
Skype - lewis3d
ICQ - 7128177
WS AMD TRPro 3955WX, 256GB RAM, Win10, 2 * RTX 4090, 1 * RTX 3090
RS1 i7 9800X, 64GB RAM, Win10, 3 * RTX 3090
RS2 i7 6850K, 64GB RAM, Win10, 2 * RTX 4090
Abstrax, could that CUDA 700 ("illegal memory...") be a Pascal-GPU-only issue?
I have a scene which fails on Pascal GPUs after a few frames/minutes, yet it renders fine on Maxwell GPUs for the hour or longer that I tested. It also renders fine in Octane 2.25, but I can't use Pascal GPUs there to check whether the problem is specific to Octane 3.03.x or general.
Could it be because of the "Experimental Pascal Support"?
Thanks
Attachment: octane.log (264.09 KiB)
--
Lewis
http://www.ram-studio.hr
Skype - lewis3d
ICQ - 7128177
WS AMD TRPro 3955WX, 256GB RAM, Win10, 2 * RTX 4090, 1 * RTX 3090
RS1 i7 9800X, 64GB RAM, Win10, 3 * RTX 3090
RS2 i7 6850K, 64GB RAM, Win10, 2 * RTX 4090
Yes, could be, but it's more likely either a bug in the kernels or a compiler issue with the RC of the CUDA 8 toolkit. If you've got a scene that allows you to reproduce the problem in the Standalone, you can already test it with 3.03.4; otherwise you would have to wait until an update of the Lightwave plugin based on 3.03.4 has been released. Maybe send me a PM when you know more. We are certainly keen to fix these issues as soon as possible, but we would need to be able to reproduce them here somehow.
In theory there is no difference between theory and practice. In practice there is. - Yogi Berra
Thanks Abstrax.
I've sent it to Juanjo, but he doesn't have Pascal GPUs (when I sent it to him I wasn't sure the problem was Pascal-only), so I'll do more tests and ask LW users who have Pascal GPUs.
It's an animation scene, so I'm not sure how to test it in Standalone. Is there an option somewhere in Standalone to set a frame range so it runs the animation? I'm such a newbie in the Standalone app, sorry.

--
Lewis
http://www.ram-studio.hr
Skype - lewis3d
ICQ - 7128177
WS AMD TRPro 3955WX, 256GB RAM, Win10, 2 * RTX 4090, 1 * RTX 3090
RS1 i7 9800X, 64GB RAM, Win10, 3 * RTX 3090
RS2 i7 6850K, 64GB RAM, Win10, 2 * RTX 4090
Hi!
Good GPU monitoring software that may help with checking hardware: http://www.ozone3d.net/gpushark/
One of my Titans is dead; finding that out ended 6 months of bugs, freezes and reboots! Nearly 160 hours of processing without a bug now!
Have a nice day!
BorisGoreta:
I burned through 4 GPUs already 

19 x NVIDIA GTX http://www.borisgoreta.com
Hi Abstrax,
was this "an illegal memory access was encountered" issue ever solved?
We have a setup with 7 GPUs, all Pascal (6 Titan X and one 1080; 3 Titans on the master and the rest on the slave), with Windows 10 on the master. We never had this problem until we switched to Windows 10 (the slave still uses Windows 7). Can I send you the scene somehow? We cannot even render half of a frame before we get the error, especially when the camera looks down on the scene. Thanks.
P.S. All these problems started when our main PC failed and we switched to Windows 10. Our new scene also uses the new iToo Forest 5.1 with lots of trees. Not sure if that is connected somehow.
Last edited by coilbook on Fri Oct 07, 2016 6:04 am, edited 1 time in total.
It would be nice to nail this down if you have a scene that shows the bug 100% of the time. Please send/upload it to Otoy.
Last edited by Lewis on Fri Oct 07, 2016 6:38 am, edited 1 time in total.
--
Lewis
http://www.ram-studio.hr
Skype - lewis3d
ICQ - 7128177
WS AMD TRPro 3955WX, 256GB RAM, Win10, 2 * RTX 4090, 1 * RTX 3090
RS1 i7 9800X, 64GB RAM, Win10, 3 * RTX 3090
RS2 i7 6850K, 64GB RAM, Win10, 2 * RTX 4090