DMFW opened this issue on Oct 17, 2005 · 15 posts
JavaJones posted Wed, 19 October 2005 at 4:27 PM
HT is really just a way of taking advantage of CPU cycles that would otherwise be "lost" to pipeline bubbles and stalls. These are more prevalent on the P4 architecture because its extremely long instruction pipeline makes branch mispredictions and similar stalls more costly. People often wonder why AMD hasn't implemented an HT unit, and the answer is that it wouldn't do much good: AMD CPUs tend to be running "flat out" more of the time because they're more efficiently architected. Generally that efficiency is a plus, but when you're multitasking the HT approach is a boon. Now that we have dual core I think HT will be phased out, especially as Intel moves to CPUs based on their Pentium M mobile core, which itself is much more efficient (and shorter pipelined) than the P4.
In any case, because HT is just "taking up the slack" and is nowhere near the equal of a full 2nd CPU, it can never deliver much more than 20% of additional performance, unless the CPU was severely underutilized by the application in question without HT. This is the case in Terragen for example, where 2 TG threads on an HT P4 can net up to a 60% performance increase - but this is the exception to the rule by far. And the trade-off in that case anyway is that a single Terragen thread runs about 25% slower than on an equivalent Athlon system.
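To put rough numbers on that Terragen trade-off, here is a minimal sketch. It assumes "about 25% slower" means a single thread on the P4 delivers 0.75 of the Athlon's throughput, and it takes the "up to 60%" HT gain at face value as the best case. The 1.0 baseline and the variable names are only for illustration.

```python
# Relative Terragen throughput, normalised so a single-core Athlon = 1.0.
# Assumption: "25% slower" is read as 0.75x the Athlon's throughput.
ATHLON_SINGLE = 1.0
P4_SINGLE = ATHLON_SINGLE * 0.75

# Best case quoted above: two TG threads on an HT P4 gain up to 60%.
HT_BEST_CASE_GAIN = 0.60
p4_ht_best = P4_SINGLE * (1 + HT_BEST_CASE_GAIN)

print(f"Athlon, one thread:      {ATHLON_SINGLE:.2f}")
print(f"P4, one thread:          {P4_SINGLE:.2f}")
print(f"P4 + HT, two TG threads: {p4_ht_best:.2f}")
```

On this reading, even the best-case HT result lands only about 20% ahead of a single-core Athlon, because much of the gain is just making up for the slower single thread.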
It is easier to write local multithreaded code than it is to write networked multithreaded code. This is for the obvious reason that local multithreading does not need to deal with the overhead and unreliability of networking and its other issues. In other words, utilizing a local dual core or dual processor machine will be easier and more efficient than utilizing a networked farm of multiple machines of equivalent power. That being said, the farm solution would generally be cheaper, as has been said.
As far as memory limitations go, most rendering is not particularly memory bound. You can see in rendering benchmarks like http://blanos.com/benchmark/, http://www.tabsnet.com/ and http://tgbench.kk3d.de that different memory speeds and sizes do not tend to make a significant difference between otherwise similar machines. Rendering spends far more time in CPU execution than it does in transferring large amounts of data around (this is unlike gaming, which tends to require a lot more data transfer).
You're also contending with the memory limits of the operating system, especially on the PC side. If you have 3GB of RAM, you can assume 1GB is dedicated to the OS and the other 2 to your application. Having more memory wouldn't help because the application can only use 2GB max on a non-64 bit Windows system. And let's not forget Vue isn't 64 bit yet anyway.
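For anyone wondering where the 2GB figure comes from, the arithmetic is short. This is a sketch assuming the default 32-bit Windows configuration, where the address space is split evenly between the OS and the application; the variable names are just illustrative.

```python
GIB = 2**30  # bytes in one gigabyte (binary)

# A 32-bit pointer can address 2**32 distinct bytes.
address_space = 2**32

# Assumption: default 32-bit Windows split, half of the address space
# reserved for the OS and half available to each application.
per_application = address_space // 2

print(f"Total 32-bit address space: {address_space // GIB} GB")
print(f"Usable by one application:  {per_application // GIB} GB")
```

Note this is a limit on address space per application, so it applies no matter how much physical RAM is installed.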
Also keep in mind that running a multithreaded application locally is more efficient memory-wise because the scene is already in memory - both threads can access the same memory pool. It's not like each thread has to separately load the entire scene or anything. So let's say you have a scene that requires a full 2GB of RAM - would it be more cost-effective to buy 3GB of RAM for a dual core system and multithread locally, or buy 3GB of RAM for each of two separate machines and do a network render? The answer, of course, is the single dual core machine. This starts getting less attractive as an option when you start looking at Opterons or other high end CPUs of course, but now that there are mainstream dual core CPUs this isn't really necessary.
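The shared memory point can be shown with a small sketch. Everything here is hypothetical: the `scene` dictionary stands in for a loaded scene and `render_rows` for a render worker, and none of it reflects how Vue or Terragen are actually written. It only demonstrates that two threads in one process read the same object without loading a second copy.

```python
import threading

N = 100_000

# Hypothetical stand-in for a scene that has been loaded into memory once.
scene = {"heightfield": list(range(N))}

def render_rows(scene, rows, out, slot):
    # Each worker reads the same scene object; nothing is copied or reloaded.
    data = scene["heightfield"]
    out[slot] = (id(data), sum(data[r] for r in rows))

results = [None, None]
workers = [
    threading.Thread(target=render_rows, args=(scene, range(0, N // 2), results, 0)),
    threading.Thread(target=render_rows, args=(scene, range(N // 2, N), results, 1)),
]
for w in workers:
    w.start()
for w in workers:
    w.join()

# Both threads saw the very same list object in memory.
same_memory = results[0][0] == results[1][0]
total = results[0][1] + results[1][1]
print("Shared one copy of the scene:", same_memory)
```

One caveat: in standard CPython, threads do not execute Python code in parallel, so this illustrates the memory sharing only, not the speed-up you would get from a natively multithreaded renderer. On a network render, by contrast, each machine would need its own full copy of `scene`.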
Also don't forget the power and space needs of a renderfarm solution. A single dual core machine will still have about half the power needs of two single core machines, and of course exactly half the space needs. :D
So to get back around to the original point, I'd personally recommend an Athlon X2 system right now. I'd recommend going for the highest clocked version you can, but don't worry too much about the cache size. For example you can get the Athlon 64 X2 4200+ at 2.2GHz with 2x512KB cache (1 for each core) for $470 and the 4400+ with 2x1MB cache for $530 - $60 more. The 4400+ has larger caches, but in most applications that will not translate into a significant performance increase, so go for the 4200+. It's probably the best price/performance ratio for the X2s right now anyway, although the 4600+ at 2.4GHz is tempting. Also note the 3800+ at 2.0GHz and $347 is actually less than 2x the cost of a normal 2.0GHz Athlon 64, the 3200+ at $190. That means you're no longer paying a premium for dual core, in fact it's more cost-effective (just as it should be). That is not counting the fact that you don't need to buy a 2nd case, CD/DVD-ROM, hard drive, memory, etc. Shared memory is only an issue if you're running 2 different applications, or the application you're running is not properly multithreaded. But if you do plan on multitasking with memory-intensive applications a lot, definitely go for as much memory as you can.
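The price comparison is easy to check. This sketch uses only the prices quoted above (US dollars, as of this post); the dictionary and variable names are just for illustration.

```python
# Prices quoted in the post, in US dollars.
prices = {
    "Athlon 64 3200+ (single core, 2.0GHz)": 190,
    "Athlon 64 X2 3800+ (2.0GHz)": 347,
    "Athlon 64 X2 4200+ (2.2GHz, 2x512KB)": 470,
    "Athlon 64 X2 4400+ (2x1MB)": 530,
}

# Two single-core CPUs versus one dual core at the same clock speed.
two_singles = 2 * prices["Athlon 64 3200+ (single core, 2.0GHz)"]
dual_core_saving = two_singles - prices["Athlon 64 X2 3800+ (2.0GHz)"]

# What the larger caches on the 4400+ cost over the 4200+.
cache_premium = (prices["Athlon 64 X2 4400+ (2x1MB)"]
                 - prices["Athlon 64 X2 4200+ (2.2GHz, 2x512KB)"])

print(f"Two 3200+ CPUs:           ${two_singles}")
print(f"Saving with one X2 3800+: ${dual_core_saving}")
print(f"Premium for 4400+ caches: ${cache_premium}")
```

That is $380 for two 3200+ chips against $347 for the 3800+, so the dual core comes out $33 cheaper on the CPU alone, before the second case, drives and memory are counted.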
So the short answer: get an Athlon 64 X2 system with 3+GB of RAM. You'll be king of the block. :D If you don't want to build it yourself (the cheapest and most versatile approach), these guys will build you a sweet machine on the cheap: http://www.monarchcomputer.com/ Save some money over Alienware, etc. ;)
Message edited on: 10/19/2005 16:29
Message edited on: 10/19/2005 16:30