Introduction
AMD has dominated the budget end of the processor market for some time, for reasons including price/performance, low motherboard prices and platform longevity (it doesn't change sockets at the drop of a hat). The only downside has been ceding the high performance market to Intel, whose processors come at a much higher price. Recently Intel launched its 6-core processor, the i7-980X, at the usual "Extreme Edition" price of around $1000 (or £1000 if you happen to live in the UK, due to sales tax and other historical factors), putting it out of reach of all but a few enthusiasts and professionals in specialized fields such as video editing.
Today AMD is launching its own 6-core processor, code named Thuban. Two models launch today: the Phenom II X6 1090T (3.2GHz stock and up to 3.6GHz with Turbo Core) and the Phenom II X6 1055T (2.8GHz stock and up to 3.3GHz with Turbo Core). Not only are these launching at aggressive clock speeds and with a boosting technology to rival Intel's Turbo Mode, the estimated street price for the flagship model is under the $300 mark. We have tested the 1090T and it promises to shake up the status quo, with performance that in some cases beats the best Intel CPUs available.
By repeating our tests six times, once for each number of active cores, we can see how various applications perform as cores are added, which lets us establish the multi-core efficiency of games such as Far Cry 2 and benchmarking tools like 3DMark Vantage. The testing is by no means comprehensive: with 2-3 weeks to spare we could have tested every recent game and application for completeness, so our apologies in advance if your favourite application is not included in our representative sampling.
Of more universal interest is comparing the multi-core efficiency of the latest Intel and AMD architectures, which helps predict how future trends and architectures will affect performance.
To go with the Thuban launch we are getting a new chipset, the 890FX, which promises better performance and greater headroom for overclocking. The board we tested with was the ASUS Crosshair IV Formula.
SATA-3 is now standard although USB 3.0 still has to be provided by 3rd party hardware (NEC in our case). Now on to the die that has been the subject of intense speculation these last few months:
Each core has 64KB of L1 data cache, 64KB of L1 instruction cache and 512KB of L2 cache. 6MB of L3 cache is shared between the cores. A 45nm process and manufacturing optimizations keep the processor within a thermal envelope of 125W despite the addition of two extra cores. This TDP will be of key importance when we discuss the Turbo Core feature.
The X6 range will fit into a standard AM3 socket (a BIOS update may be required for current/recent motherboards), showing AMD's commitment to platform longevity and ease of upgrading.
We received final shipping product for our testing and not an engineering sample so we are confident that our tests will reflect the actual performance that consumers will experience.
Intel has been using its Turbo Mode for some time now with i5/i7 processors to boost the speed of one or two cores by a few steps when thermal envelopes allow. The greatest benefit is gained in applications that are not highly threaded and so cannot otherwise fully utilize all available cores. AMD now also have this feature built into their latest range in the form of Turbo Core, which allows 3 cores to be boosted by up to 500MHz when the other 3 are at low utilization. This is more than was originally expected and is done by reducing the speed of unused cores to 800MHz and lowering their voltage correspondingly, while increasing the voltage to the boosted cores. This is all done automatically by the processor, although some motherboards (such as the ASUS one we tested with) allow the Turbo Core feature to be tweaked independently of the usual CPU adjustments. The net effect is to maximise processor performance with any type of application while staying within the 125W thermal envelope.
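The behaviour described above can be sketched in a few lines of Python. This is a simplified illustration only and not AMD's actual algorithm: the function name `turbo_core_clocks` and the 30% utilization threshold are our assumptions, and the real hardware also weighs power and voltage. The 3.2GHz stock speed, 3.6GHz boost and 800MHz idle speed are the 1090T figures quoted in this review.

```python
def turbo_core_clocks(utilization, base_mhz=3200, boost_mhz=3600,
                      idle_mhz=800, idle_threshold=0.30):
    """Return a per-core clock (MHz) for a list of per-core utilizations (0.0-1.0).

    Simplified model: if at least half of the cores are below the
    (assumed) idle threshold, those cores drop to the idle clock and the
    busy ones are boosted. Otherwise every core runs at the stock clock.
    """
    idle = [u < idle_threshold for u in utilization]
    if sum(idle) >= len(utilization) // 2:
        return [idle_mhz if is_idle else boost_mhz for is_idle in idle]
    return [base_mhz] * len(utilization)


if __name__ == "__main__":
    # Three busy cores and three idle ones: the busy cores are boosted.
    print(turbo_core_clocks([0.9, 0.9, 0.9, 0.05, 0.05, 0.05]))
    # All six cores busy: no headroom for a boost, so stock speed throughout.
    print(turbo_core_clocks([0.9] * 6))
```

For the 1055T the same sketch would be called with `base_mhz=2800` and `boost_mhz=3300`.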
Traditionally, AMD processors have been more difficult to overclock than their Intel rivals, with most users managing only modest overclocks without exotic cooling.
We used a Corsair H50, which gives the benefits of water cooling with the ease of installation of an air-cooled HSF. In terms of cost and performance it is similar to a high end air cooling solution but without the bulky heatsink or noisy CPU fan. Please note that, due to the small reservoir on these sealed budget water block and radiator combo systems, they should not be used for extreme overclocking. If the processor temperature rises above 70 degrees Celsius it should be brought back down immediately to prevent the water turning to steam and permanently “unsealing” the system.
The CPU-Z screens show all the relevant information. Here the processor is running under load at stock speeds but will throttle back to 800MHz when idle or at low utilization.
The screen shots above are quite real - the Phenom II X6 1090T booted straight into Windows at over 4GHz with our motherboard taking care of all adjustments. That was our first attempt at overclocking, and had we the time before the launch deadline we would have seen just how far we could go. Given the time constraints we were only able to run some benchmarks in Everest Ultimate Edition and these are shown later on. The system was stable at 4GHz under stress testing for the 1 hour we could spare for that purpose.
AMD's new manufacturing process should have overclockers rubbing their hands with glee especially given the price. We estimate that an entire system based around the Phenom II X6 1090T including monitor and budget SSD can be purchased for the price of an Intel i7-980X processor alone.
The Problem with Multi-Tasking
Since this review is primarily about multi-core efficiency, it is worth explaining the inherent problems with multi-tasking. These may surprise some readers, as we already have supercomputers made up of thousands of Intel or AMD processors, and if they did not scale well then research institutions would not buy them to predict climate change, locate buried minerals and so on. The reason they work so well is that it is easy to split millions of independent operations among thousands of cores. Splitting one thread across multiple cores is actually quite difficult.
The problem involves concurrency, monitors and semaphores and is too involved to go into here, although interested readers are encouraged to read the Wikipedia article on the “Dining Philosophers” problem, which explains the whole issue in easy to visualize terms. Until quantum computing is viable we will have to rely on programmers making allowances for multiple cores and programming accordingly. Some games and applications are already optimized to a limited degree for multiple cores, and theoretically every application will get a boost from a second core, even if just by offloading the usual Windows background processes to the otherwise unused core.
It has been clear for some years that frequencies cannot continue to increase due to manufacturing limits; they have remained roughly constant around the 3GHz mark for about 6 years. Instead it seems that future gains will come from increasing the number of cores in a CPU, whether physical or also virtual (as with HyperThreading). Our tests aim to show which architectures are most suited to getting the best out of extra cores, where the bottlenecks are and, hopefully, how the architectures will scale in the future as the number of cores increases.
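For readers who prefer code to prose, here is a minimal Python sketch of the Dining Philosophers problem. It is our own illustration rather than anything taken from the applications we tested: the function name `dine` and its parameters are hypothetical, and lock ordering is only one of several known ways to avoid the deadlock.

```python
import threading


def dine(philosophers=5, meals=100):
    """Run the Dining Philosophers simulation and return meals eaten per seat."""
    if philosophers < 2:
        raise ValueError("need at least two philosophers to share forks")

    # One lock per fork; each philosopher needs both neighbouring forks to eat.
    forks = [threading.Lock() for _ in range(philosophers)]
    eaten = [0] * philosophers

    def philosopher(seat):
        left, right = seat, (seat + 1) % philosophers
        # Always pick up the lower-numbered fork first. If everyone simply
        # grabbed their left fork first, all of them could end up holding one
        # fork while waiting forever for the other (a deadlock).
        first, second = min(left, right), max(left, right)
        for _ in range(meals):
            with forks[first]:
                with forks[second]:
                    eaten[seat] += 1  # only this thread writes to this slot

    threads = [threading.Thread(target=philosopher, args=(seat,))
               for seat in range(philosophers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return eaten


if __name__ == "__main__":
    print(dine())
```

Every philosopher finishes all of their meals because the fixed fork order makes a circular wait impossible. The price is that neighbours regularly block each other, which is exactly the kind of overhead that stops threaded software from scaling perfectly with core count.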
Test Configuration |
System Hardware |
CPU | Intel Core i7-870 (2.93 GHz, 8MB Cache) | AMD Phenom II X6 1090T (3.2 GHz, 6MB Cache) |
Motherboard | | ASUS Crosshair IV Formula |
CPU Cooler | Corsair H50 | Corsair H50 |
RAM | Kingston KHX2133C8D3T1K2/4GX 4GB 2133MHz DDR3 Non-ECC CL8 (Kit of 2) Intel XMP Tall HS CAS 8-8-8-24 | Kingston KHX1600C8D3T1K2/4GX 4GB 1600MHz DDR3 T1 Series Non-ECC CL8 DIMM (Kit of 2) XMP CAS 8-8-8-24 |
Graphics | | |
Hard Drive | Maxtor 300GB SATA-2 | Maxtor 300GB SATA-2 |
Sound | SupremeFX X-Fi built-in | Realtek® 1200 8-Channel High Definition Audio CODEC |
Network | Gigabit LAN controller | Realtek® 8112 Gigabit LAN controller |
Chassis | Antec 902 Midi Tower Case | Antec P183 Ultra Quiet Case |
Power | | |
Software |
Operating System | Windows 7 Professional | Windows 7 Professional |
Graphics | ATI Catalyst 10.3 | ATI Catalyst 10.3 |
Chipset | Intel P55 | AMD 890FX |
Applications | SiSoft Sandra 2009, 3DMark Vantage Pro, PCMark Vantage Pro, Everest Ultimate, CPU-Z, Far Cry 2, HAWX, Resident Evil 5 | SiSoft Sandra 2009, 3DMark Vantage Pro, PCMark Vantage Pro, Everest Ultimate, CPU-Z, Far Cry 2, HAWX, Resident Evil 5 |
All games are tested at the maximum available settings and initially at 1280x1024 so we can be sure of hitting CPU limitations before bandwidth or fill rate ones related to the GPU. We selected Far Cry 2 (first person shooter), HAWX (air combat) and Resident Evil 5 (horror) for our tests as they are newish titles that are suited to benchmarking and make most systems struggle.
Test Results - SiSoft Sandra
The results show fairly linear scaling as we go up in cores. It should be noted that synthetic tests such as SiSoft Sandra will scale quite well and are mainly useful as an indication of bottlenecks and to see what programmers can achieve if they overcome the hurdles they face. The Thuban processor is able to match its costlier Intel rival.
The processor multimedia results also scale well although real-life differences will not be as pronounced as this chart indicates. Here the newest AMD processor takes a clear lead by virtue of extra cores.
Interestingly, the memory bandwidth results show that a single core cannot make full use of the available capacity, and this is particularly the case for the AMD Phenom II architecture. Dual core or higher is required to overcome this limitation. Ultimately, the 2000MHz DDR3 of the Intel platform makes all the difference over the 1600MHz DDR3 the AMD system has.
Test Results - Everest Ultimate Edition
Everest is a very comprehensive benchmark suite that is set to take the synthetic crown from SiSoft Sandra. We limited our testing to the CPU and FPU benchmarks provided.
CPU Queen is a simple integer benchmark which focuses on the branch prediction capabilities and the misprediction penalties of the CPU. It finds the solutions for the classic "Queens problem" on a 10 by 10 sized chessboard.
CPU Photoworx is an integer benchmark that performs different common tasks used during digital photo processing.
CPU ZLib is an integer benchmark that measures combined CPU and memory subsystem performance through the public ZLib compression library. The CPU ZLib test uses only the basic x86 instructions, and it is HyperThreading, multi-processor (SMP) and multi-core (CMP) aware.
CPU AES is an integer benchmark that measures CPU performance using AES (a.k.a. Rijndael) data encryption. It utilizes Vincent Rijmen, Antoon Bosselaers and Paulo Barreto's public domain C code in ECB mode.
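To give a feel for the kind of work the CPU Queen test performs, the sketch below counts the solutions to the Queens problem with a plain backtracking search in Python. We do not know how Everest implements its test, so treat this as a generic illustration; the function name `count_queens_solutions` is ours. The many "is this square attacked?" checks are the sort of data-dependent branches that stress a branch predictor.

```python
def count_queens_solutions(n):
    """Count the ways to place n queens on an n by n board with none attacking."""

    def place(row, cols, diag1, diag2):
        if row == n:
            return 1  # every row holds a queen: one complete solution
        total = 0
        for col in range(n):
            d1, d2 = row - col, row + col  # identifiers for the two diagonals
            # Skip squares attacked by a queen placed in an earlier row.
            if col in cols or d1 in diag1 or d2 in diag2:
                continue
            total += place(row + 1, cols | {col}, diag1 | {d1}, diag2 | {d2})
        return total

    return place(0, frozenset(), frozenset(), frozenset())


if __name__ == "__main__":
    print(count_queens_solutions(10))  # the 10 by 10 board used by CPU Queen
```

On the classic 8 by 8 board this returns 92, and on the 10 by 10 board used by the benchmark it returns 724.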
Since all of these tests are fully threaded we see a linear increase in performance as the number of cores increases. It is worth noting the effect of overclocking to 4GHz - a massive 25% overclock with a commensurate increase in performance.
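The 25% figure follows directly from the stock and overclocked frequencies quoted in this review. As a quick check in Python (the function name `overclock_percent` is ours and purely illustrative):

```python
def overclock_percent(stock_mhz, overclocked_mhz):
    """Return the overclock as a percentage increase over the stock clock."""
    return (overclocked_mhz - stock_mhz) / stock_mhz * 100


if __name__ == "__main__":
    # Phenom II X6 1090T: 3.2GHz stock, 4GHz overclocked.
    print(overclock_percent(3200, 4000))  # 25.0
```

A fully threaded, CPU-bound test would be expected to gain roughly the same percentage, which is what "commensurate" means here.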
The FPU Julia benchmark measures the single precision (also known as 32-bit) floating-point performance through the computation of several frames of the popular "Julia" fractal. The code behind this benchmark method is written in Assembly, and it is extremely optimized for every popular AMD and Intel processor core variant by utilizing the appropriate x87, 3DNow!, 3DNow!+ or SSE instruction set extension.
The FPU Mandel benchmark measures the double precision (also known as 64-bit) floating-point performance through the computation of several frames of the popular "Mandelbrot" fractal. The code behind this benchmark method is written in Assembly, and it is extremely optimized for every popular AMD and Intel processor core variant by utilizing the appropriate x87 or SSE2 instruction set extension.
The FPU SinJulia benchmark measures the extended precision (also known as 80-bit) floating-point performance through the computation of a single frame of a modified "Julia" fractal. The code behind this benchmark method is written in Assembly, and it is extremely optimized for every popular AMD and Intel processor core variant by utilizing trigonometric and exponential x87 instructions.
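The fractals behind these tests are built from one very simple floating-point loop, repeated for every pixel of a frame. The Python sketch below is our own illustration and bears no relation to Everest's hand-tuned Assembly; the function names, the viewing window and the Julia constant are arbitrary assumptions. Python's complex numbers are double precision, so this is closest in spirit to FPU Mandel.

```python
def escape_time(z, c, max_iter=50):
    """Iterate z = z*z + c and return how many steps it takes |z| to exceed 2."""
    for i in range(max_iter):
        if abs(z) > 2.0:
            return i
        z = z * z + c
    return max_iter  # never escaped: treat the point as inside the set


def mandelbrot_frame(width=60, height=24, max_iter=50):
    """Mandelbrot: start from z = 0 and let the pixel position supply c."""
    frame = []
    for y in range(height):
        im = -1.2 + 2.4 * y / (height - 1)
        row = []
        for x in range(width):
            re = -2.0 + 3.0 * x / (width - 1)
            row.append(escape_time(0j, complex(re, im), max_iter))
        frame.append(row)
    return frame


def julia_point(z, c=-0.8 + 0.156j, max_iter=50):
    """Julia: the pixel position supplies the starting z and c stays fixed."""
    return escape_time(z, c, max_iter)


if __name__ == "__main__":
    # Crude text rendering: '#' marks points that never escaped.
    for row in mandelbrot_frame():
        print("".join("#" if n == 50 else "." for n in row))
```

Each pixel is independent of every other, which is why this kind of workload spreads so easily across however many cores are available.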
As with the CPU tests, the FPU benchmarks are highly threaded and we can see a linear performance increase with the number of cores.
Test Results - PCMark Vantage Pro
PCMark Vantage tests a whole range of activities from web browsing to photo manipulation and music conversion.
Performance is fairly consistent across all resolutions, so the choice makes little difference; common sense suggests using the highest resolution for this type of activity for ease of use.
Test Results - 3DMark Vantage Pro
Of much more interest to gamers is 3DMark Vantage, the de facto standard for synthetic 3D graphics benchmarks covering a wide variety of gaming types.
Performance scales well except for single cores which just don't have the raw power to get the job done. There is just a hint of a leveling out at high numbers of cores but we will need to wait for 8+ core processors to confirm this.
The CPU score is of most interest to us and here we can see something quite interesting. While the i7-870 seems to be hitting some kind of bottleneck at around 3 cores, the AMD processors are scaling linearly all the way up to 6 cores. This bodes extremely well for AMD's architecture and for future systems with even more cores.
The only purpose of the GPU benchmark component is to show that even a single core can make full use of our Radeon 5850 card. We would really like to return to this benchmark when we can get our hands on dual Radeon 5970 cards to see just how many cores are needed to feed high end Crossfire / SLI setups.
Test Results - MultiCore Analysis
So far we have looked at synthetic benchmarks and these tend to be well threaded to make full use of all available cores. This is not always the case in the real world and now we look at some recent 3D games with the emphasis on core scaling. Tests are run at 1280x1024 to avoid any GPU limitations at high resolutions.
In Far Cry 2 it seems that a dual core processor is just as good as a quad or hex core one. Although fairly recent, this game was designed some time ago and we have learned that future games from the developers will be fully threaded to take account of many (they declined to say how many) cores.
The situation is similar for Tom Clancy's HAWX, with even a budget AMD processor beating the i7-870 and the new Thuban ahead by quite a margin. All seem to hit a bottleneck at 4 cores, with 2 cores being the "sweet spot".
Resident Evil 5 appears to show the Phenom II X4 635 scaling well, but in fact the Phenom II X6 1090T shows that 2 cores are adequate, which is consistent with the Intel CPU performance. The real reason for this discrepancy appears to be the lack of L3 cache on the budget processor, and this is one of the few real world cases where we see it having such a profound impact in gaming.
Test Results - Overall Gaming Performance
Now we have compared differing numbers of cores, it’s worth showing the performance of the above games with all cores active but at varying resolutions to show the maximum performance that can be expected. After all, no consumer is going to purchase a CPU and then disable one or more of its cores to see how much it slows down.
All processors can run at good speeds at all resolutions. If we had not tested with different numbers of cores we would not be able to tell from the above results that a 2-core Lynnfield runs this game just as well as a 4-core one and that the AMD Phenom II X4 635 processor needs at least 3 cores to keep up. The Phenom II X6 1090T takes the performance crown again from its more expensive Intel quad core rival.
Performance is virtually identical across differing resolutions hiding the issue with a single AMD core. This is a game that will not tax even basic systems and anything more than 2 cores is wasted here.
Here the i7-870 manages to pull ahead due to some kind of bandwidth limitation. We would like to do a quad CrossFire test to really strain each CPU but that will have to wait until we have more graphics cards in our test lab. This game is playable at all resolutions with any of the three processors.
We’ve done something not seen in other reviews: rather than just running benchmarks at default (and sometimes overclocked) speeds, we have examined the multi-core efficiency of the latest architectures from Intel and AMD.
By using the motherboard BIOS to selectively disable cores we can look at the per-core performance which gives us a much greater insight into the architecture’s potential than just interpreting the results from the more traditional benchmarks.
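The arithmetic behind this kind of per-core analysis is straightforward and is sketched below in Python. The function name `scaling_efficiency` is our own, and the scores in the example are made-up placeholders chosen to show the calculation; they are not results from this review.

```python
def scaling_efficiency(scores):
    """Map each active core count to (speedup, efficiency).

    scores: dict of {active cores: benchmark score}, which must include
    an entry for 1 core to act as the baseline. Speedup is measured
    against the single-core score; efficiency is speedup divided by the
    number of cores, so 1.0 means perfectly linear scaling.
    """
    baseline = scores[1]
    result = {}
    for cores in sorted(scores):
        speedup = scores[cores] / baseline
        result[cores] = (speedup, speedup / cores)
    return result


if __name__ == "__main__":
    # Hypothetical scores for illustration only, NOT measured data.
    example = {1: 100, 2: 200, 4: 380, 6: 540}
    for cores, (speedup, efficiency) in scaling_efficiency(example).items():
        print(f"{cores} cores: {speedup:.2f}x speedup, {efficiency:.0%} efficiency")
```

An efficiency that stays close to 100% as cores are added indicates a well threaded workload, while a sharp drop points to a bottleneck elsewhere, such as the GPU, memory bandwidth or the software itself.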
The release of AMD's 6-core Thuban processors marks an exciting time for PC enthusiasts. In the past the fastest AMD processor has been significantly slower than the fastest Intel processor, and the only counterweight has been that it also cost a lot less. The equation has changed: the performance of the top end processors from AMD and Intel is now effectively tied, and AMD's lower pricing is likely to play a big part in consumer purchasing decisions. This may change if AMD think the Thuban processors are too cheaply priced, and if shortages are encountered we may find retailers increasing prices, as was the case when the Radeon 5800 series was first released. More likely, Intel will cut into their ample margins and lower prices now that they have a fight on their hands in their flagship categories.
AMD have made a strong play for the high end of the processor market with the release of the Phenom II X6 1090T and 1055T processors. Importantly, they have done this without charging a premium as Intel have been content to do with their "Extreme Edition" price point. The strategy of dominating the low end / mainstream market and using that as a springboard for the high end as they have done in the GPU arena with ATI may be putting them back on an equal competitive footing with Intel - something which can only be good for the consumer.
As for the Phenom II X6 1090T? It's a great product at a great price and we know from speaking with game developers that several titles due out this year will make full use of all 6 cores. If you're in the market for a new CPU then 6 cores is the way to go. For those who are unable or unwilling to spend $1000 on Intel's i7-980X but would like similar performance at a fraction of the price, AMD's Thuban is the only option and, as can be seen from the benchmark results in this review, represents tremendous value for money.