Cards Failing

Coolcat007
Licensed Customer
Posts: 5
Joined: Thu Oct 04, 2012 7:43 pm

Since yesterday I have a new rendering system consisting of the following hardware:

i5 3570
2x EVGA 660ti
12gb ddr3
samsung 840 series ssd

I have SLI disconnected and disabled.
The installed Nvidia driver is 314.07 with CUDA toolkit 5.0


The problem I have is that the cards randomly fail. If I use only one at a time, I get the same result.
When I then try to close Octane I get a notice of a vga driver hang/recovery.

I have tested Octane v1.10 final and v1.11RC.
Should I install an older Nvidia driver? This is a very annoying problem.
Coolcat007
Licensed Customer
Posts: 5
Joined: Thu Oct 04, 2012 7:43 pm

I tested a little more, and it seems the problem only happens when I use my second card.
Are there any known problems where only the main card used by the OS works without a problem?
Otherwise I will need to switch out both cards and I want to delay that as much as I can.
kavorka
Licensed Customer
Posts: 1351
Joined: Sat Feb 04, 2012 6:40 am

Did you look into the log file?
(help > open log)

Might give you some insight, or allow a dev to diagnose.

Other than that, is your VRAM full? That causes my cards to fail sometimes. Beyond that, I don't know.
Did you try running it with just the 2nd card to make sure the card isn't defective?
Intel quad core i5 @ 4.0 ghz | 8 gigs of Ram | Geforce GTX 470 - 1.25 gigs of Ram
Coolcat007
Licensed Customer
Posts: 5
Joined: Thu Oct 04, 2012 7:43 pm

There seem to be a lot of "error 999: unknown internal error" entries:


################################################################################
Started logging on 26.02.13 15:19:48
################################################################################

CUDA error 999 on device 0: An unknown internal error occurred.
-> Kernel execution failed (dl)
CUDA device 0: Direct lighting failed
CUDA error 999 on device 0: An unknown internal error occurred.
-> Failed to copy memory to device.
CUDA device 0: Failed to load data of data texture 20 of context 0 onto device
CUDA error 999 on device 0: An unknown internal error occurred.
-> Failed to deallocate device array
CUDA error 999 on device 0: An unknown internal error occurred.
-> Could not get memory info
CUDA device 0: Failed to update daylight data
CUDA error 999 on device 0: An unknown internal error occurred.
-> Failed to allocate device array
CUDA device 0: Failed to reallocate data texture 20 of context 0
CUDA device 0: Failed to update daylight data
CUDA error 999 on device 0: An unknown internal error occurred.
-> Failed to allocate device array
CUDA device 0: Failed to reallocate data texture 20 of context 0
CUDA device 0: Failed to update daylight data
CUDA error 999 on device 0: An unknown internal error occurred.
-> Failed to allocate device array




... this goes on for quite a while.

VRAM is almost empty. This happens even on a scene with just a simple cube.
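
For a long log like the one above, it can help to tally the errors per device rather than read them by eye. The following is only a minimal sketch: it assumes the log lines keep the "CUDA error N on device M:" wording shown in this post, and the function name `tally_cuda_errors` is made up for illustration. Paste the text from help > open log into it.

```python
import re
from collections import Counter

# Matches lines such as:
#   CUDA error 999 on device 0: An unknown internal error occurred.
# (Assumption: the log keeps exactly this wording.)
ERROR_RE = re.compile(r"^CUDA error (\d+) on device (\d+):")


def tally_cuda_errors(log_text):
    """Count (device, error code) pairs found in the log text."""
    counts = Counter()
    for line in log_text.splitlines():
        match = ERROR_RE.match(line.strip())
        if match:
            code, device = int(match.group(1)), int(match.group(2))
            counts[(device, code)] += 1
    return counts


# First few lines of the log quoted in this post.
sample = """\
CUDA error 999 on device 0: An unknown internal error occurred.
-> Kernel execution failed (dl)
CUDA device 0: Direct lighting failed
CUDA error 999 on device 0: An unknown internal error occurred.
-> Failed to copy memory to device.
"""

for (device, code), n in sorted(tally_cuda_errors(sample).items()):
    print(f"device {device}: error {code} x {n}")
```

If every error lands on the same device number, that points at one card (or its slot) rather than at the scene.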
badmilk69
Licensed Customer
Posts: 122
Joined: Sat Sep 03, 2011 4:24 am

Try running your system with only the 2nd card plugged in; if it crashes, the card is faulty.
i7 2600, 16 GB RAM, 2x Evga 670 SC 4gb, dual boot win7/osxML, 2 SSDs, 3 HDs.
face_off
Octane Plugin Developer
Posts: 15717
Joined: Fri May 25, 2012 10:52 am
Location: Adelaide, Australia

I've done some investigating into error 999. It "appears" to be a card failure (so it would be great if you could pull each card out separately and see whether only one card gives the error). To recover from 999, you can sometimes reset the card (i.e. disable and then re-enable it in the CUDA Devices window). I often get 999 after Octane shuts down unexpectedly. Rendering to a viewport smaller than 512 x 512 often gets things started again, and then I can increase the render size once the card is rendering.
Win7/Win10/Mavericks/Mint 17 - GTX550Ti/GT640M
Octane Plugin Support : Poser, ArchiCAD, Revit, Inventor, AutoCAD, Rhino, Modo, Nuke
Pls read before submitting a support question
FooZe
OctaneRender Team
Posts: 1335
Joined: Tue May 15, 2012 9:00 pm

Might pay to just check your power supply wattage and make sure the cables going to the cards are plugged in properly.

Chris.
face_off
Octane Plugin Developer
Posts: 15717
Joined: Fri May 25, 2012 10:52 am
Location: Adelaide, Australia

My latest theory on this is that the card's core, memory or shader clock has not yet reached its correct operating speed when the render starts.

If you use TechPowerUp GPU-Z you can see the clock speeds. Load your scene into Octane Standalone, then click on a simple, single node (starts the render), then click pause. If you check GPU-Z (middle tab) you can see the clocks now at their operating levels. If you have a faulty card, this tool may help identify if one of the clocks has not started.

It's only a theory though...
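
As an alternative to watching GPU-Z by hand, the clocks can also be read from the command line with `nvidia-smi`, which ships with the NVIDIA driver. This is a rough sketch, not a tested diagnostic: it assumes `nvidia-smi` is on the PATH, and some GeForce cards and drivers report "[Not Supported]" for these fields, which the parser turns into `None`. The helper names and the sample row values are made up for illustration.

```python
import subprocess

# Fields passed to nvidia-smi's --query-gpu option.
QUERY = "index,name,clocks.gr,clocks.sm,clocks.mem"


def parse_clock_csv(csv_text):
    """Parse 'csv,noheader,nounits' rows; unsupported fields become None."""
    rows = []
    for line in csv_text.strip().splitlines():
        index, name, *clocks = [field.strip() for field in line.split(",")]
        mhz = [int(c) if c.isdigit() else None for c in clocks]
        rows.append({
            "index": int(index),
            "name": name,
            "graphics_mhz": mhz[0],
            "sm_mhz": mhz[1],
            "memory_mhz": mhz[2],
        })
    return rows


def read_clocks():
    """Run nvidia-smi once and return the parsed rows."""
    out = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_clock_csv(out)


# Made-up sample output, for illustration only (not measured values).
sample = (
    "0, GeForce GTX 660 Ti, 980, 980, 3004\n"
    "1, GeForce GTX 660 Ti, [Not Supported], [Not Supported], [Not Supported]\n"
)
for gpu in parse_clock_csv(sample):
    print(gpu)
```

Calling `read_clocks()` once while the render is paused and once while it is running would show whether one card's clocks stay at idle levels.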

Paul
Coolcat007
Licensed Customer
Posts: 5
Joined: Thu Oct 04, 2012 7:43 pm

The cards are plugged in properly. When I isolated one card at a time in the system, I found that one of the two cards was failing. I returned it for a new one. Last Friday I worked on both cards at the same time with no problem.
When I tried to render today, one card failed again: the new one (in the 2nd PCIe slot).
I saw that one PCIe slot is reported as x16 while the other is reported as x4. It is the x4 one that's failing. Could this be the problem?
kavorka
Licensed Customer
Posts: 1351
Joined: Sat Feb 04, 2012 6:40 am

Did you try switching the card positions?
Also, did you try running just one card, but in the x4 slot?