Forum: Vue


Subject: The renderfarm experiment, part II (animation)

louguet opened this issue on Sep 16, 2005 · 16 posts


louguet posted Fri, 16 September 2005 at 7:58 AM

This is part II of my renderfarm experiment, dedicated to animation. The version of V5i used is build 278257. The protocol is the same as in part I, with the same renderfarm, but the goal this time is to evaluate efficiency in animation, which is after all the main reason to use a renderfarm.

This time I used the Windy Hill scene, which is available on the V5i CD. The parameters are the defaults: final rendering mode, 320 x 180, 250 frames rendered for a ten-second shot @ 25 frames per second. The time needed to save the individual frames at the end is included (about 17 seconds).

PC 1 added (Dual Opteron 275 @ 2442 MHz, rendered as a rendercow node) : 23 mn 06 s
PC 2 added (Dual Xeon @ 2800 MHz) : 16 mn 03 s
PC 3 added (Pentium 4C @ 3200 MHz) : 13 mn 21 s
PC 4 added (Pentium 4C @ 3400 MHz) : 11 mn 31 s
PC 5 added (Pentium 4C @ 3300 MHz) : 10 mn 03 s
PC 6 added (Pentium 4C @ 3300 MHz) : 8 mn 50 s
PC 7 added (Athlon 64 @ 2450 MHz) : 7 mn 49 s
PC 8 added (Athlon 64 @ 2400 MHz) : 7 mn 02 s
PC 9 added (Athlon 64 @ 2400 MHz) : 6 mn 22 s
PC 10 added (Athlon XP @ 2250 MHz) : 6 mn 03 s

Details of the number of frames rendered and average frame time:

PC1 : 61 f - 03 s
PC2 : 28 f - 06 s
PC3 : 18 f - 14 s
PC4 : 21 f - 10 s
PC5 : 20 f - 10 s
PC6 : 22 f - 09 s
PC7 : 24 f - 09 s
PC8 : 24 f - 08 s
PC9 : 23 f - 09 s
PC10 : 17 f - 15 s

You can see that the total number of frames rendered (258) is greater than 250. It seems a few frames were rendered more than once.

As you can see, the renderfarm rendered the animation 3.82 times faster than the fastest machine. This ratio is better than with a still image (see part I), but that was expected. Using the same ratio, we can deduce that an animation taking 24 hours on the Dual Opteron 275 alone (at 5 mn 45 s per frame and 250 frames) would be rendered in 6 h 17 mn by the entire renderfarm. The same animation would take about 4 days on a single Athlon XP 2250 MHz machine.

Interestingly, I did not experience a single crash and did not have to quit and reload a single time, even when adding the rendercows one after the other and rendering again. HyperVue seems much more stable in animation than in single-image tile rendering.
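For anyone who wants to check the arithmetic, here is a quick Python sketch of how the headline figures are derived. Only the timings above are used; the helper function is just for readability:

```python
# Rough sanity check of the figures quoted above (times from the post).
def to_seconds(minutes, seconds):
    return minutes * 60 + seconds

opteron_alone = to_seconds(23, 6)   # Dual Opteron 275 rendering alone: 23 mn 06 s
full_farm = to_seconds(6, 3)        # all ten machines: 6 mn 03 s

speedup = opteron_alone / full_farm
print(f"Speedup vs. fastest machine: {speedup:.2f}x")          # ~3.82x

# Extrapolating to a job that takes 24 h on the Opteron alone
# (5 mn 45 s per frame, 250 frames):
per_frame = to_seconds(5, 45)
solo_hours = per_frame * 250 / 3600
farm_hours = solo_hours / speedup
print(f"Solo: {solo_hours:.1f} h, farm: {farm_hours:.1f} h")   # ~24.0 h -> ~6.3 h
```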


DMM posted Fri, 16 September 2005 at 10:23 AM

More useful info, thanks for taking the trouble :) I guess the final ratio figure - 3.82 - sounds at first a little disappointing, until you realise that it is from the POV of the fastest machine. If the ratio were taken from the POV of a medium machine, the result might look a little more balanced. Of course, the ideal test would use an array of identical machines, but obviously that can't be done here, which is why I would suggest measuring the ratio from a medium machine.
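To put a rough number on that suggestion: taking the per-frame averages from the first post as a proxy for each machine's standalone speed (an assumption, since only the Opteron was actually timed solo), the ratio from a mid-pack machine would look something like this:

```python
# Illustrative only: per-frame averages above used as a proxy for standalone
# speed; only the Opteron's solo time was actually measured.
frames_total = 250
farm_time = 6 * 60 + 3                 # full farm: 6 mn 03 s

median_frame_time = 9                  # a mid-pack machine averaged ~9 s/frame
estimated_solo = median_frame_time * frames_total   # ~37.5 mn estimated
print(f"Speedup vs. a median machine: {estimated_solo / farm_time:.1f}x")  # ~6.2x
```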


JavaJones posted Fri, 16 September 2005 at 11:03 PM

Great info and tests. One comment, however: with such low average render times on most machines, particularly the Opteron, the overhead of simple file I/O, frame switching, etc. may be skewing the results. In other words, if rendering a frame takes an average of 3 seconds, and the combination of file I/O, frame switching, etc. adds an overhead of even 1 second (which is more than likely), a full 1/4 of your "render time" is composed of non-render operations! The nice thing is that, while render time scales with frame resolution and scene complexity, overhead will tend to remain roughly the same (probably increasing slightly with complex scenes). So basically what you want is to increase the ratio of render time to overhead, so that overhead makes up a much smaller percentage of your total render time and you get a more meaningful idea of how much extra machines will really help performance. I suggest individual frame render times of 30 seconds to 1 minute on your median machine for best results. The results may not turn out significantly different, but it's worth a try. - Oshyan
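A small sketch of the arithmetic behind that point, using the 3 s render / 1 s overhead figures above as assumed values:

```python
# How a fixed per-frame overhead distorts short benchmarks (assumed numbers).
frames = 250
overhead = 1.0                        # assumed fixed cost per frame (I/O etc.)

for render_s in (3.0, 60.0):          # short frames vs. the suggested 1 minute
    total = frames * (render_s + overhead)
    share = frames * overhead / total
    print(f"{render_s:>4.0f} s frames: overhead is {share:.0%} of wall time")
# 3 s frames -> 25% overhead; 60 s frames -> ~2%
```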


louguet posted Sat, 17 September 2005 at 1:49 AM

DMM,
It depends on whether you have an optimistic or a pessimistic point of view :) Personally I think seeing it from the POV of the fastest computer is more realistic. After all, if I could only render on one computer, it would be the one I'd choose. So all additional gain should be compared against it.

JavaJones,
The overhead here is negligible. The scene is copied once to each cow at the beginning of the render. The image resolution of the frames is small, so sending information over the gigabit network is not an issue. And the HyperVue manager machine is a quad-CPU. At the beginning of the render, sending the scene and all the textures takes only a few seconds, and the individual frames are saved only at the end.
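For a sense of scale, here is a rough estimate of that one-time copy cost over gigabit (the scene size is an assumption; the post gives no figure):

```python
# Rough one-time copy cost over gigabit (scene size is an assumption).
scene_mb = 100                    # assumed size of scene + textures
link_mb_per_s = 1000 / 8          # gigabit: ~125 MB/s theoretical
usable = link_mb_per_s * 0.6      # pessimistic real-world throughput

print(f"Copy to one cow: ~{scene_mb / usable:.1f} s")   # ~1-2 s per node
```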


DMM posted Sat, 17 September 2005 at 11:34 AM

Yeah, I see your point louguet, but as it stands the results are only meaningful for your specific setup. Which is fine, you're posting information based on your setup and I have absolutely no quibble with it :) An accurate test which is useful to anyone would of course be a test with 10 identical PCs, which is why I mentioned basing the result on the most "average" PC you had. But in any case, it was a very nice little test you did there, and it made for an interesting read. Cheers :D


louguet posted Sat, 17 September 2005 at 11:49 AM

For animation, I think it's quite simple : with 10 identical PCs, it would be about 10 times faster than with one PC :)


JavaJones posted Sat, 17 September 2005 at 4:22 PM

There is always overhead. You may be right that it would have a minimal impact, but with render times that low I have my doubts. Only one way to find out of course... ;) - Oshyan


louguet posted Sun, 18 September 2005 at 2:20 AM

JavaJones, Look, this was made as a simple demonstration, with low-resolution frames in order to do the test fairly quickly. If there is a non-measurable overhead under those conditions, I'm willing to live with it :) At real resolutions like 768 x 576 it would be even more negligible. Now, that was for my network; it is perfectly possible that the overhead is not negligible with other configurations. If the HyperVue render manager is slow, it might have trouble computing as a rendercow while sending all the information to all the cows. That's why having an SMP PC as the manager is a plus. If anyone does the same kind of test, I am of course interested in seeing the results.


JavaJones posted Sun, 18 September 2005 at 6:20 PM

louguet, I'm not trying to knock what you've done in any way. I think you're providing some great info and I'm certainly appreciative. I'm only trying to offer advice from my own experience that might be helpful in ensuring accurate results.

I have some personal experience in benchmarking, and it's always been a subject of interest to me, so I've done a good deal of research on it. One of the major consistent issues is minimizing unrelated factors in your benchmark. For example, if you're testing a game and want to find out what the maximum CPU-bound framerate is, you want to ensure that the graphics card will not be a bottleneck: first you use the fastest graphics card you have available, and second you run the benchmark at extremely low resolution, since higher graphics resolutions depend almost entirely on the video card. The CPU will be stressed fully this way. The same principles apply to your situation, of course. I'm sure I am not telling you anything you don't know, but perhaps this will explain to others what I am talking about.

In any case, it's all well and good to say that the overhead has no noticeable effect, but the real question is: have you measured it? If so, then I'm sure you're correct, and your results may reasonably be used to extrapolate to other potential situations. But if not, and you are simply making a judgment call based on your experience, I'm sorry but I think that calls the results into some amount of question. The overhead is an unknown quantity - low, according to your experience, which I'm certainly willing to accept, but not necessarily immeasurably low - and when you're dealing with such short render times, as I said, the influence of even a small overhead could be significant.

Of course it may not be your intention that others extrapolate from your data to other potential situations, but people are likely to do so regardless. It is no duty of yours, but it would certainly be appreciated - if you're going to publicize your results - if you would practice due diligence in ensuring their accuracy. It would be of benefit to all, including yourself.

All that being said, it's likely that if you retested with a longer-rendering scene, the results would be similar and I might look a fool. But the principle of what I'm saying holds true. You simply can't safely make assumptions when benchmarking; otherwise your results are largely useless.

Last but not least, I'd like to reiterate: please don't take offense at any of this. I am not saying I know better than you. I am largely passing on what I have learned in my own years of benchmarking and, especially, from others who know more and have greater experience than I do. Take that as you will, but above all please don't be upset by it. Life's too short. ;) - Oshyan


louguet posted Mon, 19 September 2005 at 3:18 AM

I am not upset at all :) Your concerns are certainly valid. As I test hardware and software for a living, I have no problem agreeing with you on the subject of benchmarking. But this experiment was an informal one, done to provide some information to the community; it was not designed to be extrapolated to every setup or treated as scientific. The 'negligible overhead' means that, in my case, I have never been bothered by it, whether with small images or with high-resolution frames at 30 minutes per frame. That does not mean the overhead doesn't exist, of course, only that I do not find it annoying. I do see 'annoying overhead' when experimenting with distant network rendering: sending large scenes across the internet to distant computers takes a looong time. But sending scenes and textures and getting back the data across a gigabit network takes negligible time compared to rendering those large scenes. That, I think, is the most important point :)


JavaJones posted Mon, 19 September 2005 at 3:56 AM

Ah, I see where you're coming from now. Well, from the perspective of practical use, the overhead is undeniably minimal (not annoying, in other words). But from a benchmarking perspective, with such small sample times, you'd need to at least attempt to measure the overhead and subtract it (if it were significant enough) in order to get a more accurate benchmark - one from which the results might safely be extrapolated to other hardware. Personally, I would really appreciate seeing a slightly longer test for that reason - as I said, 30 seconds to 1 minute per frame. It need not even be as many frames. As long as each machine has a chance to render 5 frames, it's probably a good test. So that's 50 frames, or less than an hour of rendering. Up to you, but that's my humble request. Thanks for listening, and for sharing your results with us. :) - Oshyan
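If one did want to measure the overhead rather than guess, a simple approach would be to time the same frames at two resolutions and fit a linear model, something like this (the timings here are purely hypothetical):

```python
# Two-point estimate of fixed per-frame overhead (hypothetical timings).
# Model: frame_time = overhead + k * pixels; two resolutions pin both unknowns.
res_a, t_a = 320 * 180, 4.0       # pixels, seconds per frame (assumed)
res_b, t_b = 720 * 405, 19.0

k = (t_b - t_a) / (res_b - res_a)             # seconds per pixel
overhead = t_a - k * res_a                    # fixed cost, resolution-independent
print(f"Estimated fixed overhead: {overhead:.2f} s per frame")   # ~0.31 s
```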


louguet posted Mon, 19 September 2005 at 5:52 AM

I just re-ran the first 100 frames of the Windy Hill animation in 720 x 405 resolution. The Opteron alone did the job in 50 mn 06 s, that is roughly 30 seconds per frame. The entire renderfarm did the job in 14 mn 02 s. So the ratio in this case was 3.57.


JavaJones posted Mon, 19 September 2005 at 8:35 PM

Interesting! So the efficiency was actually reduced. Any theories as to why that is? - Oshyan


louguet posted Tue, 20 September 2005 at 2:28 AM

Not the same frames sent to the same machines... problems with a few frames during rendering, so that they had to be recomputed at the end... PCs doing other things on the network during rendering (updating antivirus, receiving big emails, defragmenting). As the rendercows run at low priority, computing time can easily be eaten by other apps. So there are a lot of possible reasons, and as this is a practical experiment done in real conditions, I did not want to disable all the background programs.


JavaJones posted Tue, 20 September 2005 at 3:33 PM

Ah, ok. I thought you were doing all tests "clean". Otherwise, while they may be "practical" and "real world", they are unfortunately not reproducible, and thus not comparable to previous results. Simulating "real world conditions" for non-dedicated use situations is a whole art in itself. - Oshyan


kimaldis posted Sat, 01 October 2005 at 9:12 AM

" For animation, I think it's quite simple : with 10 identical PCs, it would be about 10 times faster than with one PC :)" I doubt it. I think your figures reflect the true nature of distributed rendering efficienty. I've had a deal of experience working with mental ray on farms of up to 1500 nodes. In the early days we looked at distributed rendering - which is what Vue/Rendercow does. The ratio of rendertime to number of machines is never linear when you're rendering in this way. hard truth, this kind of rendering is nowhere near as efficient as rendering one complete frame per machine. If you start scaling this up to movie size then the fun really starts. I'd be willing to bet once you get up into the hundreds of CPUs the time per frame will actually drop below the time it takes to render one frame on a single machine. Rendercow rendering is attractive but it's not effiicient.