Yambo wrote:I'm currently running my 2X Titan X at 16x PCIe speed on my X99 mobo, and I was wondering how much the scene compile time would increase at 8x speed, and whether the scaling is linear (e.g., double the time).
So I prepared a sample "heavy" scene; on my machine (specs below) it took 1:07 to compile the geometry (from the moment I hit the render target button until the moment the image appears).
I sent the scene over to Sebastian (smicha); we both expected the compile time to nearly double since he runs at 8x speed, but no! It was faster! Sebastian's quote:
So opening the scene takes 7.5 seconds. When I hit render on render target it takes 54 seconds for the image to appear. I have a 4-core 2600k at 4.5 GHz, a 5-year-old CPU. What I noticed in Task Manager is that around the 25th second of compilation the CPU is stressed at 100% on all cores. So the CPU speed may affect the overall time. But your 5930k is 6-core and is faster at stock than my 4-core at 4.5 GHz. I expected longer times on my rig, not 15 seconds faster.
Some notes:
• A video recording of my test is attached
• I ran the test on Octane 3.00 and on Octane 3.04: same speed
• The ORBX for the test can be found here:
https://goo.gl/nrWqi0 so please feel free to run the same test on your machine
• A GPU-Z screencap is attached showing my PCIe 16x speed
• My machine didn't stress the CPU to 100% during the test, while smicha's machine did
Tutor (or any other member running dual Xeons), would you mind running this test and posting your compile times here? (If you confirm that the CPU matters in loading geometry, it would be a great argument for going dual-CPU on my upcoming build.)
Any thoughts?
It may take me a couple of days to run the scene-compile test file because I haven't yet installed or tried the V3 alpha, and I'll be running the test file between my current projects. I do look forward to testing the V3 alpha with that complex scene file. Here are my initial thoughts:
WHY THE LATEST ISN’T ALWAYS THE GREATEST PERFORMER - OR THE RAMBLINGS OF AN OLD MAN
However, before I have the time to run the tests, let me point out some things. Scene compilation is a CPU/system-memory-bound function that is also affected by PCIe speed. I’ve been a CPU tweaker/system speed hacker since the mid-1980s. I’ve tweaked Intel, AMD, IBM and Motorola CPUs, modding and substituting CPUs for Atari, Amiga, Apple [
http://www.computerworld.com/article/24 ... -pro-.html ], Windows and Linux systems. I’ve been a Hackintosher [
http://www.insanelymac.com/forum/topic/ ... ch-scores/ ]
I've underclocked hackintoshes and manipulated turboboost stages to achieve the fastest system running OSX [
http://forums.macrumors.com/threads/all ... e.1333421/ ]. I also tweak and mod GPUs. It was and is all done in my pursuit to have the fastest systems that my little duckets could/can buy for my business.
1) A WALK DOWN MEMORY LANE
So with that background, I give you the following little historical tour along Intel’s CPU roadmap and its impact on CPU tweaking:
Once upon a time one could overclock Nehalem (Xeon 3500/5500s, the tock) and Westmere (Xeon 3600/5600s, the tick) CPUs and affect only the CPU core and memory speeds. Back then, overclocking was limited only by the overclocker’s skill, the quality of the CPU and the capacity of the memory to operate at the overclocked speed. The same applied to non-Xeons.
If you overclocked the CPU, the memory was overclocked by the same percentage, so you had to ensure that your system memory was up to the task, or you had to designate in the BIOS that the memory was of a lesser speed so that when the CPU overclock was applied it would not automatically push the memory speed beyond its rated limit. Well, Intel didn’t like users having that flexibility. What Intel did, besides locking the Sandy Bridge (and later) Xeons, was to put a whole host of functions that had been separately configurable via some BIOSes on some Nehalem and Westmere motherboards under the control of DMI. So if you could overclock or underclock those Xeons, it also affected a whole host of other things (I call them "the whole family"), such as your system's PCIe frequency, HDs, SSDs, USB, video, etc. It is not a pretty sight when all of those functions get overclocked to the same extent (anything beyond a very minimal overclock, say .1 to .4, unless you’re very lucky; and even if you’re very lucky, you have to contend with .75 or higher overclocks being guaranteed to bring your very best system to a halt).

On Intel's enthusiast CPUs, you could pay a toll by getting a CPU with a "K" in its model name, and for that toll you got "straps" to hold everything else down so you could tweak the CPU and the memory a little more liberally. But Xeons couldn't be strapped down. Ask a Sandy Bee Xeon (or any post-Westmere Xeon) out for a date and you have to take the whole family out too, for every one of them gets treated the same. Few motherboards designed for Xeons allow any degree of overclocking. While enthusiast motherboards can handle Xeons, subject to “taking the whole family out for a date,” few professional motherboard makers provide for any overclocking, an important exception being Supermicro’s DAX lineup (otherwise called “Hyperspeed”) built especially for day traders [
http://www.supermicro.com/SearchToolkit ... t_CSE.aspx ].

Remember also that some snapshots you may have seen of CPU usage were not of discrete CPUs, but rather depict cores within one or more CPUs. Not all cores are subject to the very same stressors. Inner cores get hotter than outer cores, and as cores get hotter and reach certain thresholds their operating speed slows down. So whether a CPU is itself water-cooled matters a great deal. If the load on the CPU is sufficient to tax all of the cores, a snapshot of core usage can give you an indication of which are most likely the innermost and outermost cores in the package.
Moreover, consider that not all tasks need, or can even take advantage of, all of a CPU’s cores. Could scene compilation be one such task? To be sure, more CPUs with sufficient RAM provide more IO space to support more GPUs (and my observation is that newer, more powerful GPUs require more IO space than older GPUs do), but is there a sweet spot? I do not know where that spot is right now, but I strongly believe it will continue to move as GPUs become more formidable and complex.
2) SOMETIMES, THE QUICK AND DIRTY ANSWER MIGHT BE BOUND TO THE TINIEST STALK IN THE BALE OF ROTTING HAY
For those of you who have read this far and might be asking, “Tutor, why drag us down Memory Lane?” - I respond: knowing what has happened on Memory Lane might give you better clues about what we’re observing. So what else might be happening when a shiny new 5930k 6-core
[
http://www.cpu-world.com/CPUs/Core_i7/I ... 5930K.html -
Standard Frequency 3,500 MHz;
Turbo Stage 1 = 3600 MHz (3 cores or more);
Turbo Stage 2 = 3700 MHz (1 or 2 cores)]
which may appear, at first blush, to be faster at stock than someone else’s "old" 4-core 2600k
[
http://www.cpu-world.com/CPUs/Core_i7/I ... 33908.html -
Standard Frequency 3,400 MHz;
Turbo Stage 1 = 3500 MHz (4 cores);
Turbo Stage 2 = 3600 MHz (3 cores);
Turbo Stage 3 = 3700 MHz (2 cores);
Turbo Stage 4 = 3800 MHz (1 core)]
overclocked to run at 4.5 GHz (and, unless Turbo gets turned off, all of those stages get bumped up accordingly with overclocking, i.e.,
Turbo Stage 1 = 4600 MHz (4 cores);
Turbo Stage 2 = 4700 MHz (3 cores);
Turbo Stage 3 = 4800 MHz (2 cores);
Turbo Stage 4 = 4900 MHz (1 core))
gets beat by that "old" 2600k?
It might be a symptom or outcome of the breadth of effects that overclocking can bring about. In sum, the question is: at what speed are all of the other variables that affect the desired outcome (i.e., a fast/short compile time) being run?
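To make that comparison concrete, here is a minimal Python sketch (variable and function names are my own) of the two turbo tables quoted above. It assumes, as I do above, that the 4.5 GHz overclock lifts every 2600k turbo stage by the same amount:

```python
# Turbo frequency (MHz) by number of active cores, per the figures above.
# Assumption: the 4.5 GHz overclock shifts each 2600k turbo stage up equally.
TURBO_5930K_STOCK = {1: 3700, 2: 3700, 3: 3600, 4: 3600, 5: 3600, 6: 3600}
TURBO_2600K_OC = {1: 4900, 2: 4800, 3: 4700, 4: 4600}

def effective_mhz(turbo_table: dict, active_cores: int) -> int:
    """Turbo frequency for a given number of active cores, capped at the
    CPU's core count (the table's largest key)."""
    return turbo_table[min(active_cores, max(turbo_table))]

# Even with every core loaded, the overclocked 4-core still clocks higher
# than the stock 6-core: 4600 MHz vs 3600 MHz.
```

So even if scene compilation loads every core, the overclocked 2600k never runs at a lower clock than the stock 5930k for any number of active cores.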
P.S. Just for the heck of it, compare the CPU speed ratio between the two systems' CPUs to the compile-time delta. Or, better yet, overclock that newer CPU to the same extent as the older CPU and let us know the outcome.
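As a back-of-the-envelope illustration of that P.S., using only the numbers quoted in this thread and assuming both chips sat at their all-core turbo during the CPU-stressed phase:

```python
# Compile times from the posts above: 1:07 on the stock 5930k at PCIe 16x,
# 54 s on the 2600k overclocked to 4.5 GHz at PCIe 8x.
compile_5930k_s = 67.0
compile_2600k_s = 54.0
time_ratio = compile_5930k_s / compile_2600k_s  # ~1.24

# All-core turbo clocks quoted above: 3600 MHz stock vs 4600 MHz overclocked.
clock_ratio = 4600 / 3600  # ~1.28

print(f"compile-time ratio:   {time_ratio:.2f}")
print(f"all-core clock ratio: {clock_ratio:.2f}")
```

The two ratios land in the same ballpark, which is at least consistent with a compile phase bound more by CPU clock than by PCIe link width; only the experiment suggested above (overclocking the 5930k to match) would settle it.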
Because I have 180+ GPU processors in 16 tweaked/multi-OS systems - the character limit prevents detailed stats.