As the Fusion 11 conference approaches, more and more Llano scores are being leaked online. In a very detailed review, Chinese website Coolaler unveiled what people can expect from the Fusion APU, a project on which AMD has staked its future.
How AMD is positioning Fusion against Intel’s Sandy Bridge architecture: Quad-Core A8 goes against Dual-Core i3-2000 Series
Some suspicion naturally comes from the fact that Llano APU processors combine two, three or four processing cores derived from "Shanghai", known to our readers as K10.5. These cores were blown away by the IPC efficiency shown by the Core i5 and i7 generations of processors from Intel (45nm Nehalem, 32nm Westmere, and 32nm Sandy Bridge). However, AMD has a card up its sleeve, and that is the Radeon GPU architecture. As we all know, Intel’s integrated graphics has its fair share of issues, from DirectX compliance to the fact that Intel outsources driver development to a third-party team.
AMD’s GPU inside the APU packs only "the latest and greatest", i.e. DirectX 11.0, OpenGL 4.1 and OpenCL 1.0 compliant graphics processing cores that are capable of running all the bells and whistles that applications might throw at them without "shimmering" textures and other artifacts.
AMD Phenom on the left, Fusion A8 on the right – As you can see, AM3 and FM1 processors are identical in size
Unlike previously leaked results, Coolaler got their hands on allegedly shipping hardware, i.e. a Fusion A8-3800 processor ticking at 2.4GHz and featuring four "K10.6" CPU cores and four hundred "Evergreen" GPU cores (Radeon HD 6550D). We’re writing K10.6 because these CPU cores are not exactly Shanghai (K10.5): they’re manufactured in 32nm, instead of the 45nm used for Shanghai processors. Also, a Shanghai core carries 128KB of L1 and 512KB of L2 cache, while the Fusion APU features 128KB of L1 and 1MB of L2 cache. Do note that Fusion CPU cores cannot rely on L3 cache, since AMD decided to remove it in order to make room for the GPU cores.
Fusion A8-3800 is the baseline model of the series, and ticks at 2.4GHz, reaching 2.7GHz when Turbo mode is activated. According to unofficial information from AMD, Turbo 2.0 will overclock either the CPU or the GPU cores, depending on the workload. The GPU cores operate at 600MHz by default, but you have a fair amount of flexibility when overclocking.
Gigabyte GA-A75-UD4H motherboard features Socket FM1, home for 32nm "Llano" and future APUs
Coolaler paired this A8-3800 with Gigabyte’s A75-UD4H motherboard, only 4GB of DDR3-1600 memory (operating at 1333MHz), a 1TB Seagate Barracuda 7200.12 hard drive and Thermaltake’s 1.35kW power supply. Those 1.35 kilowatts of juice came in handy when the reviewer pushed the GPU to the maximum stable clock.
The results? It took 54.38 seconds to calculate SuperPI 1M, while the Phenom II X4 830 needs only 24.25 seconds. Truth be told, the AMD Phenom II X4 830 ticks at 2.8GHz. Even so, CPU performance is dauntingly low. We took a trip down memory lane and found our old performance charts. In them, an eight-year-old AMD Athlon 64 2.4GHz on the old Socket 754 (remember that one? Single-channel 64-bit DDR1-400 memory) needed only 35.91 seconds to complete the same benchmark! We don’t know what AMD did, but the SuperPI values are quite shocking. AMD says the A8-3800 will compete against the Core i3-2100T, which completes the one million digit (1M) test in only 12.45 seconds.
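The clock-speed caveat can be checked with a rough calculation. The sketch below assumes SuperPI run time scales inversely with CPU frequency, which is a simplification (memory and cache behaviour do not scale with the core clock), and the function name is purely illustrative.

```python
# Rough clock normalization of the SuperPI 1M times quoted above.
# Assumption: run time is inversely proportional to core clock.

def scale_time_to_clock(seconds: float, from_ghz: float, to_ghz: float) -> float:
    """Estimate the run time at to_ghz from a measurement taken at from_ghz."""
    return seconds * from_ghz / to_ghz

a8_3800_s = 54.38      # A8-3800 at 2.4GHz
phenom_830_s = 24.25   # Phenom II X4 830 at 2.8GHz

# What the Phenom II result would look like if it ran at the A8's 2.4GHz.
phenom_at_2_4_s = scale_time_to_clock(phenom_830_s, from_ghz=2.8, to_ghz=2.4)
remaining_gap = a8_3800_s / phenom_at_2_4_s

print(f"Phenom II X4 830 scaled to 2.4GHz: {phenom_at_2_4_s:.2f} s")
print(f"A8-3800 is still {remaining_gap:.2f}x slower")
```

Under that assumption the Phenom II would need roughly 28 seconds at 2.4GHz, so the 400MHz clock deficit explains only a small part of the gap.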
Cinebench R11.5 turned things upside down, though. The AMD Fusion A8-3800 scores 2.89 points in the CPU test (Phenom II X4 830: 3.38 points), with the GPU scoring a pretty good 28.94fps. Bear in mind that a top-of-the-line discrete GPU, the AMD Radeon HD 6970 2GB, scores 55.03fps. The competing Intel Core i3-2100 (2.5GHz) scores about 2.42 points, meaning four 2.4GHz AMD "K10.6" cores are faster than two 2.5GHz Sandy Bridge cores with Hyper-Threading enabled (four threads).
According to AMD, Llano supposedly brings 27GB/s of system memory bandwidth, with the GPU connecting using the 27GB/s bridge as well
In terms of CPU memory bandwidth, enthusiasts have criticized AMD ever since the company launched the K10 architecture. While K8 was beating the living daylights out of the Intel Pentium and, even later, the Intel Core 2 series of processors, K10 (Barcelona) introduced L3 cache, which AMD controlled through the memory controller. The results were quite catastrophic, with L3 cache bandwidth and response time ending up worse than what an Intel Core i7 gets from system memory. The L3 cache design also hurt system memory performance, with efficiency dropping from 91-93% on K8 to a lowly 60-70% on K10 and K10.5. In fact, we believe that data starvation is the key reason why contemporary CPU cores from AMD have fallen behind Intel, which has dozens of gigabytes per second available from L2 and L3 cache.
The Fusion APU should bring a change of fortunes for AMD, since there is no more L3 cache to worry about and the dedicated Northbridge controller is devoted to feeding the CPU and the GPU to the best of its abilities. In the slide above, you can see AMD’s claim that a dual-channel DDR3-1866 controller brings 27GB/s, and that the GPU is connected to the memory controller over a 27GB/s pipe. On the tested system, the memory operated at 1333MHz, meaning that the maximum achievable bandwidth is 20.82GB/s.
However, the BIOS dedicates one 64-bit memory channel to the GPU, and you can assign up to 1GB of system memory to the GPU at boot. This means the four CPU cores get a maximum of 10.41GB/s, and the 400 GPU cores get the other 10.41GB/s. Given that the 400-core Radeon HD 5560 has 512MB of GDDR5 memory and a bandwidth of 59.38GB/s, you can see why so many analysts have concerns about the performance of first-generation Fusion APUs.
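The bandwidth figures above follow from simple arithmetic: transfers per second, times 8 bytes per 64-bit channel, times the number of channels. The sketch below is a back-of-the-envelope check. The helper name is made up for illustration, and dividing by 1024 to convert MB/s to GB/s is an assumption, chosen because it reproduces the figures quoted above.

```python
# Theoretical peak DDR3 bandwidth for the tested configuration.
# Assumptions: 8 bytes per transfer per 64-bit channel, MB/s -> GB/s via 1024.

BYTES_PER_TRANSFER = 8  # one 64-bit channel moves 8 bytes per transfer

def peak_bandwidth_gbs(mega_transfers_per_s: float, channels: int) -> float:
    """Peak bandwidth in GB/s for a given transfer rate and channel count."""
    mb_per_s = mega_transfers_per_s * BYTES_PER_TRANSFER * channels
    return mb_per_s / 1024

dual_channel = peak_bandwidth_gbs(1333, channels=2)   # both channels combined
single_channel = peak_bandwidth_gbs(1333, channels=1) # CPU share or GPU share

discrete_card_gbs = 59.38  # discrete 400-core Radeon figure quoted above
gpu_deficit = discrete_card_gbs / single_channel

print(f"Dual channel:   {dual_channel:.2f} GB/s")
print(f"Single channel: {single_channel:.2f} GB/s")
print(f"Discrete card has {gpu_deficit:.1f}x the bandwidth of the APU's GPU share")
```

With one channel reserved for each side, the integrated GPU works with less than a fifth of the bandwidth available to the discrete card.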
We have contacted AMD representatives to discuss the Fusion APU architecture and to understand the reasons behind the design calls the company made, but we received no answer at press time. At Bright Side of News*, we’re completely open to cooperation with companies in order to provide our readers with the best possible information, and we do not have a bias against either side. Our primary interest is providing proper information to consumers.
AMD Fusion A8-3800 APU AIDA64 Results
So, what were the results? Against a maximum of 10.41GB/s for the CPU, AIDA64 reported 8.40GB/s read, 6.01GB/s write and 8.95GB/s copy. While this is quite good efficiency in percentage terms, it is a far cry from 20GB/s. Bear in mind that AMD places this APU against the Intel Core i3-2100 series, which scores 17.49GB/s read, 12.98GB/s write and 18.56GB/s in copy tests.
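To put the "efficiency on a percentage basis" remark into numbers, the sketch below divides each AIDA64 result by the 10.41GB/s single-channel ceiling and compares it with the Core i3-2100 figures. The third A8-3800 value (8.95GB/s) is taken to be the copy result, and the variable names are illustrative only.

```python
# Memory efficiency of the A8-3800 against its single-channel ceiling,
# plus the gap to the Core i3-2100 results quoted above.

CPU_PEAK_GBS = 10.41  # one 64-bit DDR3-1333 channel, as stated in the text

a8_3800 = {"read": 8.40, "write": 6.01, "copy": 8.95}    # GB/s
core_i3 = {"read": 17.49, "write": 12.98, "copy": 18.56}  # GB/s

# Share of the theoretical ceiling actually achieved, in percent.
efficiency_pct = {test: gbs / CPU_PEAK_GBS * 100 for test, gbs in a8_3800.items()}

# How many times more bandwidth the Intel part delivers in each test.
intel_advantage = {test: core_i3[test] / a8_3800[test] for test in a8_3800}

for test in a8_3800:
    print(f"{test:>5}: {efficiency_pct[test]:.1f}% of peak, "
          f"Core i3-2100 is {intel_advantage[test]:.2f}x faster")
```

Read and copy land above 80% of the single-channel peak while write is under 60%, and the Intel chip delivers roughly twice the bandwidth in all three tests.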
3D Tests: Fusion’s Brilliant little GPU that could
However, as we said while looking at Cinebench R11.5, AMD built in a brilliant GPU, and even the memory bandwidth starvation could not hold the 400 cores back from achieving P4019 in 3DMark Vantage (DirectX 10) and E1764 in 3DMark 11 (DirectX 11). For comparison, the Intel Core i3-2100 scores P1319 in 3DMark Vantage and is not able to run 3DMark 11 due to its lack of support for DirectX 11.
Therefore, in 3D testing, AMD Fusion A8-3800 wipes the floor with Core i3-2100. If Cinebench and 3DMark were the only measurements, A8-3800 would level even the much more expensive Core i7-2600K. However, buying a 2600K CPU and using integrated graphics is considered sacrilege by BSN* team members.
In gaming tests, the A8-3800 achieved an impressive 34fps in the H.A.W.X. 2 DirectX 11 test. If you want older APIs, DirectX 9 mode gives a very good 46 frames per second.
All in all, we invite you to visit Coolaler’s excellent first review and see all the test results for yourself. As we don’t have the hardware in our hands, you’ll just have to go with Coolaler on this one. We’re working on acquiring the Fusion A4, A6 and A8 APUs ourselves in order to conduct our own independent review.