Last year AMD introduced the second generation of their A-series APU codenamed Trinity. While the mobile version was launched about a year ago, the desktop version was delayed until fall. This year Trinity is getting an incremental update called Richland. While Richland was already launched for notebooks in March, the desktop version is only debuting now.
We won’t delve into the details of the architecture behind Richland in this article, because it remains basically the same as Trinity. We covered it extensively in our preview of mobile Trinity last year. However, we will look at the specifications in detail and discuss some of its peculiarities that result from its architecture.
This slide recaps the technical characteristics of Trinity, which for the most part still apply to Richland.
The two major new features of Richland are called Hybrid Boost and Configurable TDP. Hybrid Boost aims to improve the decisions the integrated microcontroller makes when selecting the appropriate turbo frequency. It incorporates the current temperature of certain regions of the chip into the decision, which according to AMD allows for more consistent performance. Previously the controller would only use internal estimates of power draw, which represented a worst case and resulted in Turbo Core backing off even when the cooling solution would have allowed for longer boosted operation.
While Configurable TDP is also a new feature of Richland, it was effectively disabled on the chips we received for testing. It requires platform support as well, which might be the reason it wasn’t available in our test setup. It allows you to configure a TDP point between a minimum and a maximum TDP that are set by the platform and by the specific chip used. In our case, minimum, current and maximum TDP were all set to the nominal TDP of the APU, thus not allowing any changes. This functionality is interesting for OEMs who want to tune their products for specific operating points.
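The selection rule described above amounts to clamping a requested value into a window. Here is a minimal sketch in Python; the helper name `select_tdp` and the wattages are assumptions for illustration and do not come from AMD documentation:

```python
def select_tdp(requested_w: float, min_w: float, max_w: float) -> float:
    """Clamp a requested TDP (in watts) into the [min_w, max_w] window.

    Hypothetical helper for illustration only. The real limits are set by
    the platform and the specific chip; this is not an AMD API.
    """
    if min_w > max_w:
        raise ValueError("minimum TDP must not exceed maximum TDP")
    return max(min_w, min(requested_w, max_w))


# Assumed window of 45 W to 100 W: a request inside the window is honoured.
print(select_tdp(65, 45, 100))

# Situation on our test setup: minimum == maximum == nominal TDP,
# so every request collapses to the nominal value (100 W assumed here).
print(select_tdp(65, 100, 100))
```

With the window collapsed to a single point, as on our test platform, the feature has no effect regardless of what is requested.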
The A10-6800K APU AMD supplied us with has a nominal CPU frequency of 4.1GHz, while the GPU is clocked at 844MHz. That is only half of the story though, as the CPU has a variety of performance states that are activated based on workload and thermal conditions. The following table lists all available P-States and their default voltages. The states above the line are turbo states: the top frequency is used if only one core is loaded, while the second and third are all-core turbo states that are activated as thermal levels permit. As a result, in many applications the 6800K will actually clock at 4.2 to 4.3GHz. Compared with Trinity, AMD added one turbo state; Trinity had only one all-core turbo state and one single-core turbo state. The intermediate P-States are also higher on Richland than on Trinity.
Back in 2011 we criticized that Llano was operated at rather high voltages for 32nm silicon. With Trinity in 2012, that didn’t change much. With Richland, however, AMD apparently introduced some aggressive optimizations that allowed them to reduce the operating voltages considerably. As we always do in our reviews, we did extensive undervolting tests of both Trinity and Richland, which provided us with some interesting results. We do that using the AmdMsrTweaker tool, which I started to maintain a while ago. The tool allows us to change the operating voltage of each individual P-State, which is more flexible than a fixed offset set in BIOS. To establish stability, once the voltage is changed the CPU has to pass a Prime95 stability test at the respective frequency. We repeated this process until the chip became unstable. For this experiment we only considered the P-State frequencies that were programmed into the chip. Theoretically we could have tested any frequency in 100MHz increments, but that would have been very time-consuming and would not have yielded significantly different results.
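The procedure above is essentially a step-down search per P-State. The following Python sketch shows the idea under stated assumptions: `is_stable` is a hypothetical callback standing in for "apply the voltage and run the stress test", the 12.5 mV step and the 0.8 V floor are assumed values, and none of this reflects the actual interface of AmdMsrTweaker or Prime95.

```python
from typing import Callable


def find_lowest_stable_voltage(
    stock_v: float,
    is_stable: Callable[[float], bool],
    step_v: float = 0.0125,  # assumed voltage step size
    floor_v: float = 0.8,    # assumed safety floor, never go below this
) -> float:
    """Lower the voltage one step at a time; return the last passing value.

    The stock voltage is assumed to be stable and is returned unchanged
    if the very first reduced step already fails.
    """
    last_good = stock_v
    n = 1
    while True:
        # Compute from the stock value each time to avoid accumulating
        # floating-point error over many subtractions.
        candidate = round(stock_v - n * step_v, 4)
        if candidate < floor_v or not is_stable(candidate):
            return last_good
        last_good = candidate
        n += 1


# Simulated chip with a made-up stability threshold, for demonstration only.
def fake_chip(voltage: float) -> bool:
    return voltage >= 1.195


print(find_lowest_stable_voltage(1.35, fake_chip))
```

In practice this search would be repeated once for each P-State frequency, since every state has its own stock voltage and its own stability limit.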
Similar to Llano, on Trinity we were able to reduce the operating voltages quite considerably. Remarkably, though, Richland ships with stock voltages that are even lower than some of our undervolting results with Trinity, a chip that launched a little over half a year ago and shares most of its design. On top of that, we were even able to undervolt Richland a bit. It should be noted that the potential for that is not as big as on Trinity, which shows that AMD clearly optimized the stock voltages for lower power operation.
We don’t know whether AMD simply uses more aggressive binning to come up with chips that hit lower operating voltages, or whether a manufacturing improvement or a design adaptation enables this improvement. However, what was accomplished here is nothing short of impressive.
We used an ASUS F2A85-M mainboard for our tests. It’s a midrange FM2 motherboard featuring the A85X chipset that was introduced alongside Trinity. Compared to the A75, which is still used for some FM2 motherboards, it provides two additional SATA 6Gb/s ports. The board was straightforward to use and caused no issues. After a BIOS update it was ready for prime time with Richland as well. The upgrade was really as smooth as it can get. We didn’t test what would happen if we tried to boot Richland with an older BIOS that doesn’t explicitly support it.
We used the same Kingston 2x2GB KHX2000C9AD3T1K2/4GX kit as we did in our Llano review. This time we had the advantage that the ASUS board allowed us to load XMP profiles, making memory configuration a bit easier. Again we ran the APU at DDR3-1333, DDR3-1600 and DDR3-1866 to see how it scales with memory speed.
Mainboard: Gigabyte GA-A75M-UD2H (BIOS F3)
Trinity / Richland System:
Mainboard: ASUS F2A85-M (BIOS 5202 for Trinity, 6102 for Richland)
Ivy Bridge System:
Mainboard: Intel DZ77BH-55K (BIOS 0070)
2x2GB KHX2000C9AD3T1K2/4GX Kit
750GB Seagate Barracuda 7200.11
SilverStone Strider ST60F 600W
Windows 7 x64 SP1
Since the original launch of Llano, AMD has come a long way in improving overclocking on the APU platform. The original Llano could only be overclocked by means of the reference clock, which opens a can of worms as it also raises the speed of various interfaces as well as the GPU. Later AMD introduced an unlocked K-model where you simply raise the multiplier. With Trinity and Richland AMD did the same. The 6800K we are reviewing here is unlocked, allowing easy overclocking.
As easy as it is to raise the multiplier in BIOS by a few steps, there is not much headroom for overclocking on this chip, at least if we restrict ourselves to air cooling. Even with a high-end cooler like the Noctua NH-D14 we are using, the APU tops out at around 4.6 – 4.7GHz, which is only a few hundred MHz above the maximum turbo frequency. It won’t net you a lot of performance though, because the CPU cores start to throttle more extensively than at stock speed.
Even at stock speed we observed some throttling when the CPU cores are loaded with a program like Prime95. The cores would start out at 4.3GHz, then back off to 4.2GHz, and after a few seconds start to clock down to 4.1GHz and even lower for brief moments. This can be observed with AMD Overdrive, which displays the clock of each core, or with CPU-Z (if you right-click anywhere on the main page you get a regularly updated clock rate display). Given that we used a high performance CPU cooler (Noctua NH-D14), we believe this is a real issue.
The throttling can also be observed by looking at a power meter at the wall outlet. As we will show in our power consumption measurements later, when Prime95 load is started the power draw will quickly reach some maximum value and then get lower in steps of a few watts every few seconds until it stabilizes.
It should be noted that Prime95 stresses the CPU cores in a way that no ordinary software does. Indeed we didn’t observe throttling when running other CPU intensive software like Cinebench or WinRAR, or during gaming. However, when you overclock, throttling also happens in regular applications, which in the end degrades performance, so you either have limited gains, no gain at all or even a decline. That is why when overclocking Richland (or really any Bulldozer or Piledriver CPU), one needs to pay close attention to throttling to ensure the overclock actually provides a meaningful benefit.
In contrast, we found that the GPU has quite some headroom for overclocking. The GPU can be overclocked either in BIOS or via AMD Overdrive. GPU overclocking is supposedly only unlocked on K-models, but we weren’t able to verify this claim. We were able to overclock the GPU from 844MHz up to 1013MHz, which is a 20% increase. In 3DMark11, which is known to scale very well with GPU performance, this netted us a 9% increase in performance (a score of 1827 with DDR3-1866 for comparison purposes with the benchmarks that follow). We would like to note that with Trinity we were able to hit the same 1013MHz on the GPU. Even though Overdrive allows you to set clocks at 1MHz granularity, internally the GPU clocks are only applied in steps. The next step would be 1085MHz, which wouldn’t run stable on either Trinity or Richland.
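To put the two percentages in relation, the short Python sketch below recomputes the clock gain from the 844MHz and 1013MHz figures and divides the reported 9% score gain by it. The helper name `percent_gain` is purely illustrative.

```python
def percent_gain(base: float, new: float) -> float:
    """Relative increase from base to new, in percent."""
    return (new / base - 1.0) * 100.0


clock_gain = percent_gain(844, 1013)  # stock vs. overclocked GPU clock in MHz
score_gain = 9.0                      # 3DMark11 gain from the text, in percent
scaling = score_gain / clock_gain     # share of the clock gain seen in the score

print(f"clock +{clock_gain:.1f}%, score +{score_gain:.1f}%, scaling {scaling:.2f}")
```

Less than half of the clock increase shows up in the benchmark score, which suggests that something other than the GPU clock is limiting performance.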
In general we don’t recommend overclocking AMD’s APUs. The reasoning behind this is that if you really need more graphics performance you won’t get that much out of it, as the integrated GPU in Richland is mostly limited by memory bandwidth. Even if you use expensive overclocker memory, you won’t be able to push memory bandwidth much above 40GB/s; in more sane configurations you will hover around 30GB/s. Even lower midrange discrete GPUs like AMD’s Radeon HD 7750 or NVIDIA’s GeForce GTX 650 will happily outperform the integrated GPU and are available starting around $100.
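The bandwidth figures follow directly from the memory configuration. Here is a small Python sketch of the theoretical peak, assuming the usual dual-channel setup with a 64-bit bus per channel and nominal transfer rates; real-world throughput is lower. The function name is illustrative.

```python
def ddr3_peak_bandwidth_gbs(
    transfer_rate_mts: float,  # e.g. 1866 for DDR3-1866, in MT/s
    channels: int = 2,         # assumed dual-channel configuration
    bus_width_bits: int = 64,  # width of one DDR3 channel
) -> float:
    """Theoretical peak memory bandwidth in GB/s (1 GB = 10^9 bytes)."""
    bytes_per_transfer = bus_width_bits // 8
    return transfer_rate_mts * 1e6 * bytes_per_transfer * channels / 1e9


for speed in (1333, 1600, 1866, 2133, 2400):
    print(f"DDR3-{speed}: {ddr3_peak_bandwidth_gbs(speed):.1f} GB/s")
```

DDR3-1866 works out to just under 30GB/s and DDR3-2400 to 38.4GB/s, which matches the ranges quoted above.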
Unless otherwise noted, all 3DMark benchmarks were run at the performance preset with default settings.
In the already dated 3DMark Vantage, Richland makes a good showing, delivering in excess of 5% extra performance over Trinity, which is in line with the GPU clock increase. It can be seen that on these chips performance scales very well with increased memory performance. While on Llano and Ivy Bridge the thirst for more bandwidth levels off at DDR3-1866, Trinity and Richland would hunger for more.
This test shows a result similar to the 3DMark Vantage one, albeit in a DirectX 11 setting. Intel’s last generation integrated graphics falls further behind than in the DirectX 10 workload.
We also included our scores for the latest 3DMark including the tiered tests Fire Strike, Cloud Gate and Ice Storm. We only have scores of the more recent platforms in these tests. The results again show that there is very good scaling with memory bandwidth which explains why AMD pushes DDR3-2133 memory with the 6800K.
The CPU test clearly shows AMD’s weak point: the performance of the x86 cores. With only a minor increase in clock speed Richland can only bring incremental improvements to the table, while Intel’s Ivy Bridge runs circles around it.
In the OpenGL test AMD takes the lead again thanks to the much more capable integrated GPU. Similar to the 3DMark tests, we can also see strong memory scaling in OpenGL.
WinRAR mostly stresses the x86 cores and the memory subsystem. Therefore the lead of Intel’s 3770K is not unexpected. AMD’s APUs show memory scaling, but for some reason the DDR3-1600 setting didn’t fly too well with them.
We also ran a number of tests in actual games. As we said about Llano, we continue to believe that APUs are best suited for less demanding games. The most recent blockbusters can be played as well, but most of the time a severe reduction in quality settings and/or resolution is necessary to obtain playable framerates.
Even though the APUs became more capable over the course of the last years, it is still true that in order to play at High quality settings, a 1280×720 (720p) resolution is optimal for this kind of product. When going to 1920×1080 (1080p), keep in mind that the number of pixels more than doubles, so the framerate can be expected to drop by roughly half. To get a smooth experience it would then be necessary to reduce quality settings to Medium or Low, depending on the game. Instead of crippling the visual fidelity of games we would recommend going with a $100-120 discrete GPU that can deliver stable frame rates at high quality settings.
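The resolution argument can be checked with a few lines of Python. The estimate assumes the framerate scales inversely with pixel count, which is a simplification, and the 45 fps starting point is a made-up value rather than a measurement.

```python
def pixel_ratio(base_res: tuple[int, int], target_res: tuple[int, int]) -> float:
    """How many times more pixels target_res has than base_res."""
    base_w, base_h = base_res
    target_w, target_h = target_res
    return (target_w * target_h) / (base_w * base_h)


def estimate_fps(base_fps: float, base_res: tuple[int, int],
                 target_res: tuple[int, int]) -> float:
    """Rough estimate assuming fps is inversely proportional to pixel count."""
    return base_fps / pixel_ratio(base_res, target_res)


ratio = pixel_ratio((1280, 720), (1920, 1080))
print(f"1080p has {ratio:.2f}x the pixels of 720p")

# Hypothetical example: a game running at 45 fps in 720p.
print(f"estimated 1080p framerate: {estimate_fps(45, (1280, 720), (1920, 1080)):.1f} fps")
```

Real games rarely scale perfectly with pixel count, so this is best read as a rough lower-bound estimate for GPU-limited scenes.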
World of Warcraft
We set the resolution to 1280×720, all the settings to the highest possible setting (High or Ultra respectively), except Shadows, which was set to ‘Good’. We also enabled 4xAA and 8xAF. We used the Troll intro scene displayed after creating a fresh troll character as a benchmark sequence and used Fraps to measure the frames per second.
Regarding this graph a few things should be noted. As can be seen from the results, Richland was actually slower than Trinity, which was not what we expected. The reason for this is not a real performance regression in Richland, but a performance regression in WoW. As an MMO, it is constantly updated and thus we have to test with different versions. Remarkably, the performance in the troll intro scene stayed consistent from 2011 (Patch 4.2) until a few months ago when we ran the Trinity tests. Richland was tested under the latest Patch 5.3, where we observed this regression. We opted to include the results because it can still be seen that Richland performance scales fine with memory. Just don’t make the mistake of comparing it directly to the other scores.
We set the resolution to 1280×720 and all the detail settings to the High preset. 4xAA was enabled as well. We used the built-in benchmark test as a performance measure.
In Dirt 3 we can observe nice generational scaling between the AMD APUs. Additional memory bandwidth also provides a nice boost in performance on the latest generation chips.
As usual we provide power consumption measurements for different operating points where different subsystems of the APUs are stressed to their maximum. This is done using Prime95 for the CPU cores and Kombustor for the GPU. The results are worst case power consumption figures that won’t be reached in most real world situations.
As we discussed earlier in this review, there is some throttling going on when stressing AMD’s Trinity and Richland APUs to their maximum. This is why we provide two figures. One is the absolute maximum that is reached right after starting the Prime95 in-place large FFT stress test. The second is the throttled power measurement, which is the value the system stabilizes at after about a minute of high load.
As we noted in earlier reviews, the SilverStone Strider ST60F we used for the system is not an optimal choice because it is already a bit dated and lacks the efficiency of more current power supplies. We still did the main measurements with it to be able to compare with older measurements, but in response to criticism we also did a few measurements on a more recent Corsair TX650. The results were around 20W lower power consumption across the board.
Gaming Power Consumption
To give a few figures that are more in line with real world power use, we also looked at power consumption during the game tests we presented earlier. In WoW we measured power draw between 133W and 144W, while in Dirt 3 we saw 158W to 162W. That is a good chunk lower than the worst case measurements we have seen before.
We will close this review with a slide that shows the Richland product lineup that launches today. Due to time constraints we only reviewed the top of the line model A10-6800K which comes with the highest clock frequencies and is unlocked to boot. AMD was able to optimize the chip in some ways that allowed them to considerably lower operating voltages and allowed them to push the clock speed up compared to Trinity. This is no small feat and we would like to see AMD do similar things with their other Bulldozer-derivatives.
AMD positions Richland products in a cost-conscious segment: price-wise it competes with Intel’s Core i3 and slower i5 processors. Whenever the GPU can contribute, AMD has a considerable advantage, even when compared against the new Haswell lineup from Intel. Most Haswell products will only come with the slower GT2 graphics bin, which we will detail in another review. In pure x86 performance, AMD is behind Intel, but this is nothing new. Since the underlying architecture didn’t change, nobody really expected leaps in x86 performance.
There is one recurring theme in the tests that stress the GPU: memory bandwidth scaling. As we noted on numerous occasions, Richland could profit from additional memory bandwidth by means of faster memory. However, the options are limited. Officially only the A10-6800K supports DDR3-2133, which AMD themselves offer as a Radeon Memory branded product; the rest of the lineup has to make do with DDR3-1866. The architecture also contains a multiplier for DDR3-2400, which some mainboards allow you to set. A quick glance at market prices tells us that DDR3-2400 modules are not much more expensive than DDR3-2133. The rising DRAM prices of the last months had the interesting effect that while mainstream modules got a lot more expensive, the same can’t be said about the more exotic ones, unless we enter extreme overclocking territory. The alternative to feeding Richland with faster RAM is getting a faster discrete graphics card and sticking with standard DDR3-1600 memory.
It will be interesting to see what kind of approach AMD will take to feed their APUs with ever increasing memory bandwidth. With Crystalwell (the on-package 128MB cache on select Haswell SKUs), Intel presented a solution that is rather expensive in terms of manufacturing, and we doubt AMD will take the same approach. But going forward, if the GPU units of APUs get even faster, they will need additional memory bandwidth to deliver further performance improvements.
AMD is clearly not sleeping and is working hard to deliver the next iteration of their mainstream APUs: Kaveri. As we detailed in a recent roadmap walkthrough, Kaveri is scheduled to hit the market at the end of the year in both notebook and desktop variants. Kaveri is the first incarnation of AMD’s Heterogeneous System Architecture (HSA), where the CPU and GPU cores share the same address space, avoiding costly copy operations that are otherwise needed. Kaveri also includes the next generation cores codenamed Steamroller, which are supposed to improve IPC considerably and, hopefully for AMD, close the gap in x86 performance with Intel.
From Computex we already heard that the upcoming socket FM2+ for Kaveri will be backwards compatible with Richland and Trinity. However, due to additional pins, Kaveri will not fit into existing FM2 motherboards. Considering how Llano’s FM1 was obsolete after only one generation, this is a nice improvement, but not as user friendly as the AM2 to AM2+ to AM3 transition.