Scene compile time not related only to PCIe speed?

Sun Feb 07, 2016 10:57 am

My BIOS settings - PLEASE DON'T USE THEM AS A REFERENCE FOR YOUR BIOS - THEY ARE ONLY FOR A SANDY BRIDGE 2600K (WATERCOOLED).

Sun Feb 07, 2016 11:47 am

glimpse wrote:While compiling a scene you see different load levels (25% or 100%) because there are different workloads to be done. Some of them are optimised for multithreaded CPUs (the stage where core count matters), while others still rely on single-core performance (that's where clock speed matters)..

Tom,

But only my CPU is stressed to 100%; others reach 70% max (5930K). Why?

Sun Feb 07, 2016 12:21 pm

So I've updated my post with OCed results.

smicha wrote: But only my CPU is stressed to 100%; others reach 70% max (5930K). Why?

Why? Well, I can only guess, but 70% looks like 4 cores out of 6 (about 67%) plus a few % of system load under full stress (& 2 cores running idle =). It could be because multithreaded performance in v3 currently tops out at 4 cores..

In this case HT (which presents 4 physical cores as 8 logical ones) should actually cause some issues. That's why guys with a 3770K see compile times twice as long as my result with a weaker CPU from the same family (one without HT).
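
As a rough check of the guess above, the overall load a system monitor would report when only a fixed number of threads are busy is simple arithmetic. The sketch below is only an illustration: the function name `expected_load`, the 4-thread cap, and the idea that the monitor averages over all logical CPUs are assumptions, not anything confirmed by OTOY.

```python
def expected_load(busy_threads: int, logical_cpus: int, background: float = 0.0) -> float:
    """Overall CPU load (%) a monitor would show if `busy_threads` run flat out.

    Assumes the monitor averages over all logical CPUs and that
    `background` is extra system load in percentage points (hypothetical).
    """
    # A process cannot keep more logical CPUs busy than the machine has.
    busy = min(busy_threads, logical_cpus)
    # Cap at 100% in case background load pushes the sum over.
    return min(100.0, 100.0 * busy / logical_cpus + background)


if __name__ == "__main__":
    # Hypothetical cap of 4 busy threads, as guessed in the post above.
    for label, cpus in [("4 logical CPUs", 4), ("6 logical CPUs", 6), ("8 logical CPUs", 8)]:
        print(f"{label}: {expected_load(4, cpus):.1f}%")
```

With 4 busy threads this gives about 67% on 6 logical CPUs and 100% on 4, which roughly matches the 70% and 100% readings mentioned above once a few % of background load is added.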

Loading the scene, on the other hand, is influenced by the storage solution.

Things might change in the future once OTOY has coded all the features & starts optimising everything around them =) so getting more than 4 cores plus an OC is well worth the effort.. unless you would like to sell the CPU later..

& Sebastian, I'm not surprised that your system does those things faster. Sandy Bridge was an exceptional CPU (pretty toasty & thirsty, but with a good thermal interface that lets it shine under water). Intel has made some architectural changes since, but performance wasn't boosted much afterwards.. My guess is that I would outrun you with my 3570K, even at a lower OC than yours, if I put my system under water with a slightly higher OC than I run now (Ivy has some architecture changes making it a bit more efficient, but by no more than a few %). However, 4.5 GHz is the most I would run without delidding, as I noticed some stability issues around that mark on my CPU.

Sun Feb 07, 2016 12:27 pm

I am curious what OTOY will say, e.g. whether the compilation algorithm makes use of only 4 cores. That would waste tons of CPU power that dual Xeon users have.

Sun Feb 07, 2016 12:40 pm

I would like to hear something from OTOY's devs as well.. but if I have to guess.. in the long run we are probably going to see better optimisations anyway (if that's the case). I would not be surprised to see OTOY focusing on GPU stuff for now, as GDC & GTC are coming.. v3 is still at an early stage & they are probably too busy to steer away from their plan & start rounding corners or polishing.

Sun Feb 07, 2016 12:43 pm

Right.

PS. And what could have happened that pegot and grimm got almost 3-4x longer compilation times...?

Sun Feb 07, 2016 12:48 pm

smicha wrote:Right.

PS. And what could have happened that pegot and grimm got almost 3-4x longer compilation times...?

Why? Well, I can only guess, but 70% looks like 4 cores out of 6 (about 67%) plus a few % of system load under full stress (& 2 cores running idle =). It could be because multithreaded performance in v3 currently tops out at 4 cores..

In this case HT (which presents 4 physical cores as 8 logical ones) should actually cause some issues. That's why guys with a 3770K see compile times twice as long as my result with a weaker CPU from the same family (one without HT).

At least that would be my guess. I might be wrong, but it would be cool to see performance numbers from those guys with Hyper-Threading disabled =)

P.S. In the real world.. HT only gives you around a 5-10% boost in well-optimised applications anyway..

Sun Feb 07, 2016 12:53 pm

I don't think HT being on is a factor at all: with HT disabled I get the same results. It's rather how the code is written and optimized for multithreading (not HT).

Sun Feb 07, 2016 1:13 pm

smicha wrote:I don't think HT being on is a factor at all: with HT disabled I get the same results. It's rather how the code is written and optimized for multithreading (not HT).

You could be right here, but I would rather count on numbers.. if some guys with a 3770K & a current 6-core could test those ideas.. (your case might differ due to the older architecture..)

smicha wrote:That would waste tons of CPU power that dual Xeon users have.

It wouldn't.. if you choose SKUs accordingly.. I guess something like 2x 4-core Xeons running at higher clock speeds might make good sense given what we've got now.. but if someone has better ideas, let's talk further.. Yeah, if you end up getting 12- or 18-core CPUs.. those do not make sense at all.. especially if any bottleneck exists that prevents utilising those extra cores, not to mention poor single-core performance.

Sun Feb 07, 2016 4:51 pm

glimpse wrote:
smicha wrote:I don't think HT being on is a factor at all: with HT disabled I get the same results. It's rather how the code is written and optimized for multithreading (not HT).
You could be right here, but I would rather count on numbers.. if some guys with a 3770K & a current 6-core could test those ideas.. (your case might differ due to the older architecture..)

smicha wrote:That would waste tons of CPU power that dual Xeon users have.
It wouldn't.. if you choose SKUs accordingly.. I guess something like 2x 4-core Xeons running at higher clock speeds might make good sense given what we've got now.. but if someone has better ideas, let's talk further.. Yeah, if you end up getting 12- or 18-core CPUs.. those do not make sense at all.. especially if any bottleneck exists that prevents utilising those extra cores, not to mention poor single-core performance.

Hello Glimpse,

Seems like no truer guesses than yours and Smicha's have been made. I just ran the compile file on one of my two 32-core (4x E5-4650 Sandy Bridge, 2.7 GHz base / turbo up to 3.5 GHz) Supermicro servers (no overclocking possible there). The first compile (using Octane Alpha 3.04) took 2 min 10 sec without hyper-threading, and the second try took 2 min 11 sec with hyper-threading. Significantly, at least to me, this system achieved the highest Cinebench 15 score in 2013 and it's still the second highest listed Cinebench score [ http://cbscores.com ]. So, quite unlike Cinema 4D, which craves massive numbers of cores for CPU rendering, distributing compilation in Octane Alpha 3.04 over many CPU cores using the test scene currently just trashes Octane compilation times. However, those server systems can handle the IO space requirements for large numbers of GPUs. What a contrasting/conflicting shame! The more cores you throw at compiling, the longer it takes. Ironically, if the scene could be easily converted to one for a CPU renderer, I could probably render it faster on that server than I can compile it for GPU rendering.