GPUDirect to enable shared VRAM?

Thu Jul 05, 2012 1:03 pm

Nvidia published the CUDA 5 preview a few weeks ago. One of the new features is GPUDirect, which enables GPUs to access the VRAM of other GPUs.
Shouldn't it be possible then to stack VRAM in a system instead of being limited to the card with the least VRAM?

http://developer.nvidia.com/gpudirect

Edit: It seems to work even over a network. Does that mean we could even use the GPUs of other machines on the network?

This would be a huge request from us. If this could be easily implemented in Octane with CUDA 5, PLEASE DO THIS. It would save us a lot of money.

Fri Jul 06, 2012 3:27 pm

Wow, ya, huge. People running 4 GTX 580s would have access to 12 GB instead of 3, or they could buy the cheaper 1.5 GB version and have 6 GB, saving around 400 USD and still getting twice the RAM they currently have with the quad 3 GB setup.
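
To make the arithmetic above concrete, here is a minimal Python sketch. The function name `usable_vram_gb` and the `pooled` flag are made up for illustration; the pooled case is the hoped-for behaviour, not something Octane is known to support.

```python
def usable_vram_gb(cards_gb, pooled):
    """Return the scene memory available across a set of cards, in GB.

    pooled=False: every card holds a full copy of the scene, so the
    smallest card sets the limit (the situation described in this thread).
    pooled=True: hypothetical "stacked" VRAM, where capacities add up.
    """
    if not cards_gb:
        return 0.0
    return float(sum(cards_gb)) if pooled else float(min(cards_gb))


quad_3gb = [3.0] * 4    # four 3 GB GTX 580s
quad_1_5gb = [1.5] * 4  # four 1.5 GB GTX 580s

print("4 x 3 GB, mirrored:", usable_vram_gb(quad_3gb, pooled=False), "GB")
print("4 x 3 GB, pooled:  ", usable_vram_gb(quad_3gb, pooled=True), "GB")
print("4 x 1.5 GB, pooled:", usable_vram_gb(quad_1_5gb, pooled=True), "GB")
```

The sketch only counts capacity; it says nothing about how fast the pooled memory could be accessed.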

But maybe Octane doesn't work the way the press release describes? Still, I'm excited.

Sat Jul 07, 2012 1:22 pm

It seems P2P access between GPUs has been available since CUDA 4.0. Looks like this isn't very easy to implement, or not possible at all, for Octane :/
Or hopefully they just haven't had the development resources to do this so far. Maybe that will change now that they have some more devs.

Sun Jul 08, 2012 11:06 pm

Don't take my thoughts for law, but when you start going through any other interface to access memory that is not directly on the card, I think you need to start looking at the access speed...
The maximum bandwidth of a PCIe 3.0 x16 slot is roughly 16 GB/s in each direction, whereas the bandwidth from GPU to VRAM on the same board is in the realm of 192 GB/s for a 580/680.
I know there is a lot more to think about than just these numbers, but they would be a factor, and that alone is a factor of 12, so I can't see how accessing memory on another card would be anything but extremely costly in terms of performance...
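
As a rough illustration of those two numbers, the Python sketch below compares the ideal time to read the same amount of data from local VRAM and over the bus. The 1.5 GB data size is a made-up example value, and latency and protocol overhead are ignored.

```python
def transfer_time_ms(size_gb, bandwidth_gb_s):
    """Ideal time in milliseconds to move size_gb at bandwidth_gb_s (no latency)."""
    return size_gb / bandwidth_gb_s * 1000.0


VRAM_GB_S = 192.0       # on-board figure quoted for a 580/680
PCIE3_X16_GB_S = 16.0   # PCIe 3.0 x16 figure quoted in this post

scene_gb = 1.5  # hypothetical amount of scene data read once

local_ms = transfer_time_ms(scene_gb, VRAM_GB_S)
remote_ms = transfer_time_ms(scene_gb, PCIE3_X16_GB_S)

print(f"local VRAM: {local_ms:.1f} ms")
print(f"over PCIe:  {remote_ms:.1f} ms")
print(f"ratio:      {remote_ms / local_ms:.0f}x")
```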

Chris.

Mon Jul 09, 2012 12:12 am

It seems logical, since there is nothing new here. The memory obviously has to be accessed over the slow bus, which is the same problem as with CPU+GPU combos.

The solution is probably in the hardware: a much faster PCIe bus on motherboards. Is that even possible? My guess is that faster bus speeds mean a more expensive mainboard for manufacturers and users, and if only 1% of people need this, why bother?
Is there any news on faster bus speeds in upcoming hardware?

Mon Jul 09, 2012 12:41 am

PCIe 3.0 is the latest, and some new motherboards and graphics cards like the 670/680 have it. The most common for current PCs would be PCIe 2.0, which tops out at 8 GB/s.
I don't think it will be too many months before you can no longer buy a motherboard with PCIe 2.0.
Since 3.0 is backwards compatible with 2.0, there will probably be no reason for manufacturers to continue producing 2.0-only boards.

Also, not all motherboards support a full x16 PCIe bus for multiple graphics cards, so there are speed losses there too.
So your average current PC with 2 or more GPUs probably has 1 or more of them running at x8, so you would be looking at 4 GB/s for PCIe 2.0 @ x8.
When you compare that to the VRAM's 192 GB/s, you would be looking at a factor of 48x (not including any other overhead). This wouldn't mean that a render would necessarily run 48 times slower, as not all work is retrieving data from memory, but I would think it would be crippling...
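
To put a number on "not necessarily 48 times slower", here is a hedged Python sketch of an Amdahl-style estimate. It assumes that a fixed fraction of render time is spent waiting on memory and that all of that traffic goes over the bus; both are simplifying assumptions, and the fractions below are invented, not measured from Octane.

```python
def estimated_slowdown(bandwidth_ratio, memory_bound_fraction):
    """Estimate the overall slowdown when only the memory-bound share of
    the work is stretched by bandwidth_ratio (simple Amdahl-style model)."""
    if not 0.0 <= memory_bound_fraction <= 1.0:
        raise ValueError("memory_bound_fraction must be between 0 and 1")
    compute_share = 1.0 - memory_bound_fraction
    return compute_share + memory_bound_fraction * bandwidth_ratio


ratio = 192.0 / 4.0  # VRAM vs. PCIe 2.0 @ x8, the 48x from this post

# Hypothetical memory-bound fractions, for illustration only.
for fraction in (0.1, 0.25, 0.5, 1.0):
    slowdown = estimated_slowdown(ratio, fraction)
    print(f"{fraction:.0%} memory-bound -> about {slowdown:.1f}x slower")
```

Even with only a small share of the work waiting on memory, the estimate stays several times slower than keeping everything in local VRAM.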

I think PCIe 4.0 is in the works, but I would imagine it is a year or more off.

Chris

Thu Jul 12, 2012 11:01 am

PCIe 2.0 - 5 GT/s (gigatransfers/s), or 16 GB/s (x16, both directions combined)
PCIe 3.0 - 8 GT/s, 32 GB/s (that's what it says on the ASUS website; I would guess the real increase in speed is less than 2x)

PCIe 4.0 - 16 GT/s (maybe 64 GB/s), spec release expected in 2014/2015
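
For anyone wondering how GT/s turns into GB/s, here is a small Python sketch. It uses the line encodings of PCIe 2.0 (8b/10b) and PCIe 3.0 (128b/130b), and assumes 128b/130b for the 16 GT/s generation as well. Results are per direction and ignore protocol overhead; the names `GENERATIONS` and `link_gb_s` are just for this example.

```python
# Transfer rate in GT/s, then payload bits and line bits of the encoding.
GENERATIONS = {
    "2.0": (5.0, 8, 10),          # 8b/10b encoding
    "3.0": (8.0, 128, 130),       # 128b/130b encoding
    "16 GT/s": (16.0, 128, 130),  # assumption: same encoding as 3.0
}


def link_gb_s(generation, lanes):
    """Theoretical bandwidth in GB/s, one direction, for the given lane count."""
    gt_s, payload_bits, line_bits = GENERATIONS[generation]
    gbit_per_lane = gt_s * payload_bits / line_bits  # usable Gbit/s per lane
    return gbit_per_lane / 8.0 * lanes               # bits -> bytes, times lanes


for generation in GENERATIONS:
    for lanes in (8, 16):
        one_way = link_gb_s(generation, lanes)
        print(f"PCIe {generation} x{lanes}: {one_way:.2f} GB/s per direction, "
              f"{2 * one_way:.2f} GB/s both directions")
```

Doubling the per-direction value gives the 16 and 32 GB/s style figures that vendors tend to quote, which would explain why the numbers in this thread differ by a factor of two.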

OK, so when we compare this with VRAM's 192 GB/s, the bus will only reach that bandwidth in something like 7-10 years!?
We don't have time to wait for that.

OK, now I would really like Octane to start using the CPU and mainboard memory. For those situations when you don't really care about performance and just need to get the project done in Octane, there should be a switch that at least gives it access to the mainboard RAM. That is still a better solution than having to switch to V-Ray or Maxwell just because the project uses too much memory or needs a feature Octane doesn't have.

Thu Jul 12, 2012 4:39 pm

Actually, the obvious answer here would be for Otoy to take up manufacturing GTX 670s with 8-16 GB of memory on board and sell them as "Octane Ready".

Alternatively, subcontract the manufacturing to a Chinese partner.

If you were going that route, I think these are the people you'd need to talk to...

http://www.pcpartner.com/En/index.php