Thank you all for helping to track down this problem. With this driver fix, we assume that any additional out-of-memory errors are caused either by insufficient resources or by non-optimal memory management in Octane. We are working on improving the latter.
Hi all,
We are still working on getting the CUDA error 2 (out of memory) failures resolved. Since a lot more people will now start using version 4, I thought it would be a good idea to give an update on where we are with this issue:
Fundamentally there are two very different scenarios we have to distinguish:
- In the first case it's a legitimate error message that we have run out of memory. This can happen either when we try to allocate some memory on the device or when we try to pin some memory on the host. Both resources (i.e. VRAM and pinnable memory) are limited and we can run out of either. In this case you should see an error message saying that some memory could not be allocated or pinned.
In the past things were fairly simple: when you ran out of memory, that was that and there was nothing you could do about it. But now, with out-of-core support, Octane could at least try to move more data back to system memory. Since version 4 RC 7 we have already made some changes to reduce this problem, and we are trying to increase the robustness of the memory management, but we aren't finished with that yet since it's not trivial. What makes it all even harder to solve is the fact that the free-memory numbers reported by CUDA are not very consistent. We hope to have at least the re-allocation implemented in the next release.
Another thing we are trying to improve is to reduce the amount of pinned memory we use in Octane.
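For anyone unfamiliar with where these two legitimate failures actually surface, here is a minimal sketch (this is not Octane's code, and the sizes are arbitrary): a device allocation via cudaMalloc and a pinned host allocation via cudaMallocHost can each fail with cudaErrorMemoryAllocation independently, because VRAM and pinnable host memory are separate limited resources.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main()
{
    // Sanity check: how much VRAM does CUDA think is free?
    // As noted above, these numbers are not always consistent.
    size_t freeBytes = 0, totalBytes = 0;
    cudaMemGetInfo(&freeBytes, &totalBytes);
    printf("VRAM: %zu of %zu bytes reported free\n", freeBytes, totalBytes);

    // Case 1: allocating device memory (VRAM).
    void* devBuf = nullptr;
    cudaError_t err = cudaMalloc(&devBuf, 512ull << 20);  // 512 MB, arbitrary size
    if (err == cudaErrorMemoryAllocation)
        printf("device allocation failed: %s\n", cudaGetErrorString(err));

    // Case 2: allocating (pinning) page-locked host memory. This is limited
    // by the operating system, not by VRAM, and can fail independently.
    void* hostBuf = nullptr;
    err = cudaMallocHost(&hostBuf, 512ull << 20);
    if (err == cudaErrorMemoryAllocation)
        printf("host pinning failed: %s\n", cudaGetErrorString(err));

    cudaFree(devBuf);
    cudaFreeHost(hostBuf);
    return 0;
}
```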
In a nutshell, this scenario should be solvable and I hope we have it sorted soon.
- The second situation is unfortunately a lot trickier, and it seems to be the more common problem: a CUDA operation fails with an out-of-memory error, although a) there is more than enough memory and b) the operation doesn't actually allocate any memory in the first place. It also always corrupts the CUDA context and is not recoverable, i.e. it requires the application to restart.
At the moment I believe that this is a CUDA driver problem, which I was able to reproduce under very specific circumstances. The reproducible cases were always observed first in the C4D plugin, but I was always able to reproduce them in the Standalone as long as the following criteria applied:
- CINEMA 4D is running in the background (it doesn't need to have a scene loaded and it doesn't need to have the Octane plugin installed).
- The operating system is either Windows 7 or 8, i.e. the graphics driver is a WDDMv1 driver. (We haven't been able to reproduce the issue on Windows 10 yet.)
- The graphics card in use is the first CUDA GPU that is not running in TCC driver mode.
- The GPU in use is a Pascal GPU (1070, 1080, Titan Pascal).
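If you want to check whether your setup matches these criteria, a small sketch like the following (using the CUDA runtime API; this is not part of Octane) lists each GPU and whether it runs in TCC mode:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main()
{
    int count = 0;
    cudaGetDeviceCount(&count);
    for (int i = 0; i < count; ++i) {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, i);
        // tccDriver is 1 for GPUs using the TCC driver (Windows only),
        // 0 for GPUs using the WDDM driver.
        printf("GPU %d: %s, TCC mode: %s\n",
               i, prop.name, prop.tccDriver ? "yes" : "no");
    }
    return 0;
}
```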
If you think your case is reproducible and fits scenario 2), please send me the scene and a report. I currently have 4 scenes where I can reproduce the issue, and with more scenes I might start seeing a pattern. A reproducible CUDA failure that doesn't involve C4D or Windows 7/8 would be especially helpful, since it would indicate that this is a more generic problem.
Although I currently believe that it is a CUDA driver issue, it is quite possible that a bug somewhere in Octane causes this error to surface at some later time. So more scenes might also help us find some commonality, and might point us to any issues we have in Octane.
-> Please be patient with us, we are definitely working on it and I have been banging my head against it for almost 2 months now. Any help / information is more than welcome.
Thank you,
Marcus