Quadro/Tesla better for Octane than GeForce?

Fri May 20, 2011 10:52 pm

I just want to ask (out of curiosity) whether the GeForce's limited/crippled double-precision computing performance, compared to Teslas/Quadros, has any influence on Octane performance. I mean, high-end GeForces are capped, AFAIK, at 1/8 of the double-precision power of a Tesla. Does this mean that a Tesla with the same number of CUDA cores as a GeForce is 8x faster? Or is this double-precision thing irrelevant for Octane?

Another question: CUDA 4.0 seems to have a new feature that allows copying/moving content from the memory of one GPU to the memory of another (on dual-GPU cards via the NF200 bridge), if I understood that correctly. Will it affect Octane in any way or improve its performance?

EDIT: I suppose I was not clear enough on the second question, so this is what I was talking about:

CUDA 4.0 removes this burden and gives developers a unified memory base to work with. Sanford Russell, director of CUDA marketing, said that this new capability allows for “peer-to-peer memory access within the node.”

"If you have two GPUs in a node, the way this worked prior to 4.0, you literally had to copy the objects to main memory through the CPU, then copy it back out and put it on GPU No. 2. There were a whole bunch of extra steps required. We now have a peer-to-peer capability, where it literally is a copy from memory to memory. It goes across the PCIX bus, and it’s no longer going to system memory."

Sat May 21, 2011 5:11 pm

In answer to your questions

1) No. Octane operates on as many CUDA cores as you have on your card; after that it's down to having as much RAM as possible to fit your scene in. I'm sure that a GTX 580, for example, will run as fast as an equivalent-spec Tesla card. The only issue is that running GL apps like 3ds Max, AutoCAD, etc. is crippled by the drivers for GeForce cards. How else are they expected to sell very expensive Tesla cards?

I'm considering running an ATI card for Display and my GTX 460 dedicated to Octane.

2) I asked the same question when I saw the specs for CUDA 4. No, you cannot move the scene around between the memory of various cards, or split the scene between them. E.g. 2 x GTX 580 3GB models do not give you 6GB for a scene. The whole scene needs to fit onto the card. Apparently Radiance said that this is down to needing very low latency between memory and the GPU, otherwise Octane will slow to a snail's pace. The only way for Octane to perform is to have the scene loaded in the graphics card's memory.
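
To make the memory point concrete, here is a minimal Python sketch. It assumes, going by the explanation above, that every card needs its own full copy of the scene, so the usable limit is the smallest card's memory rather than the total. The function name `scene_fits` and the scene sizes are made up for illustration and are not part of Octane; memory taken up by the display or the driver is ignored.

```python
def scene_fits(scene_gb, card_memory_gb):
    """Return True if the scene fits on every card in the list.

    Assumption (taken from the post above, not from Octane itself):
    each GPU holds a full copy of the scene, so memory does not add up
    across cards.
    """
    if not card_memory_gb:
        raise ValueError("need at least one card")
    usable_gb = min(card_memory_gb)  # the smallest card sets the limit
    return scene_gb <= usable_gb


# Two 3 GB GTX 580s: a 2.5 GB scene fits, a 5 GB scene does not,
# even though the cards add up to 6 GB.
print(scene_fits(2.5, [3.0, 3.0]))  # True
print(scene_fits(5.0, [3.0, 3.0]))  # False
```

Under the same assumption, mixing a 3 GB card with a smaller one would drop the limit to the smaller card's memory.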

Hope that answers your questions.

The Jabba

Mon May 23, 2011 9:17 am

Now, a real hybrid solution could probably happen if PCIe got really fast, and maybe some manufacturer will produce a motherboard specifically designed for GPU rendering. You would think the upload from system RAM to the GPU would be really quick, but I guess it's the part of the upload that happens after voxelizing, or when you import a mesh and then click on it, that takes a while. I think when manufacturers realise this, something new could potentially happen, and maybe the GPGPU processing market becomes really mainstream.

Mon May 23, 2011 10:17 am

Of course, thinking right outside the box, the ideal situation sometime in the future would be a motherboard loaded with multiple GPUs. The CPU would be redundant.

All programs would be written to run on the GPUs. The motherboard would have just memory and GPUs to run the display and all the programs. The memory would be just one large pool that is drawn on as required for both.

Mon May 23, 2011 11:33 am

OK, thanks for the answers. Regarding the first question though, does it mean that double-precision computing power is irrelevant within Octane? Does Octane compute only with single-precision stuff? Or is it more complicated, and my question itself is flawed? I am no math or IT expert, so my understanding of these things is very "oberflächlich" (superficial) LOL
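
One rough way to think about it, as a Python sketch: how much a double-precision cap costs depends on how much of the work is actually done in double precision. The model below is only an assumption for illustration (same core count and clocks on both cards, work split cleanly into single- and double-precision parts). It says nothing about what Octane really does internally, and the 8x figure is simply the one from the first post.

```python
def relative_render_time(dp_fraction, dp_slowdown):
    """Estimated run time relative to a card with uncapped double precision.

    dp_fraction: share of the arithmetic done in double precision (0 to 1).
    dp_slowdown: how many times slower the capped card is at double precision.
    Both are hypothetical inputs, not measured Octane numbers.
    """
    if not 0.0 <= dp_fraction <= 1.0:
        raise ValueError("dp_fraction must be between 0 and 1")
    if dp_slowdown < 1.0:
        raise ValueError("dp_slowdown must be at least 1")
    # Single-precision work runs at full speed; only the double-precision
    # share is stretched by the cap.
    return (1.0 - dp_fraction) + dp_fraction * dp_slowdown


print(relative_render_time(0.0, 8.0))   # 1.0  (no double precision: cap is irrelevant)
print(relative_render_time(0.25, 8.0))  # 2.75
print(relative_render_time(1.0, 8.0))   # 8.0  (all double precision: full 8x penalty)
```

So the "8x faster" case would only apply to a program that does essentially everything in double precision; for a mostly single-precision program the cap would barely show.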
I believe there will be some change in how things work when Nvidia's GPU with its own onboard ARM CPU arrives... maybe we will need only a PSU and a GPU then.