According to information we heard from multiple sources, NVIDIA recently started to open up and disclose design targets for its first "CPU", i.e. an SoC codenamed "Project Denver". As always, take these rumors with a grain of salt.
The information we have at hand is that the Project Denver CPU core looks to be closely aligned with T40, i.e. "Tegra 4", codenamed Wayne. According to the internal schedule, Wayne silicon is going to be taped out in the next couple of weeks, with developers getting their hands on prototype silicon in December 2011. Wayne silicon consists of four ARM cores (NVIDIA is not disclosing the core type, teasing that it might be either A15 or PD) and up to 64 GPU cores.
During the same month (December 2011), NVIDIA plans to tape out the first silicon based on Project Denver, which combines a custom NVIDIA-ARM 64-bit CPU with up to eight cores and a GeForce 600-class GPU. The company had a lot of issues in developing a CPU, and the general consensus is that NVIDIA is taking a conservative approach with a single 28nm PD CPU design and a 28nm Fermi-based GPU design, i.e. the rumored Fermi refresh in the form of notebook and lower-end desktop GeForce 600 Series cards (remember "GeForce 300"?). The interesting bit that we heard is that Project Denver is geared towards "full PhysX support", whatever that might be.
According to another source close to the subject, the target for the GPU part of the silicon is "at least 256 CUDA cores", which would put the product on par with AMD’s Trinity APU, which will pair a Bulldozer-Enhanced CPU core with the "Northern Islands" VLIW4 architecture and will be the key APU for AMD in 2012. Compute power-wise, NVIDIA doesn’t want to clock it sky-high, but rather to squeeze out as much IPC (Instructions Per Clock) as possible. Still, it is realistic to expect 2.0-2.5GHz for the CPU and a similar clock for the GPU part, with the memory controller and the rest of the silicon working at a lower rate to keep everything well fed.
Unlike AMD’s APU design, where the CPU and GPU parts connect to the memory controller at the speed of DDR3 memory, Project Denver is looking for more direct communication between CPU and GPU cores, i.e. relying on the best a GPU design can offer: a high-bandwidth connection. NVIDIA is not taking the conventional route with an L1, L2 and L3 cache design. Since GPUs have 1TB/s+ connections to their cache memory, a similar approach is rumored for the Denver core design as well. Just like in the GPUs, the memory controller takes up the larger portion of the die and connects the CUDA cores with the CPU cores, and the CPU will have priority access to the bandwidth it needs.
However, the CPU portion is expected to need only 10-20% of the total available bandwidth, leaving the rest to keep the GPU well fed from system memory.
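To put the rumored 10-20% share in perspective, here is a back-of-the-envelope sketch in Python. The total bandwidth used is purely a placeholder: it borrows the 1TB/s figure quoted for GPU cache links for illustration only and is not a confirmed Denver specification. The function name and constant are hypothetical.

```python
def split_bandwidth(total_gb_s, cpu_share):
    """Split a total bandwidth budget (GB/s) between CPU and GPU.

    cpu_share is the fraction (0..1) reserved for the CPU;
    the GPU gets whatever is left.
    """
    if not 0.0 <= cpu_share <= 1.0:
        raise ValueError("cpu_share must be between 0 and 1")
    cpu = total_gb_s * cpu_share
    return cpu, total_gb_s - cpu


# ASSUMPTION: hypothetical total budget, not a disclosed figure.
TOTAL_GB_S = 1000.0

# The rumored CPU share is 10-20% of the total.
for share in (0.10, 0.20):
    cpu, gpu = split_bandwidth(TOTAL_GB_S, share)
    print(f"CPU share {share:.0%}: CPU {cpu:.0f} GB/s, GPU {gpu:.0f} GB/s")
```

Whatever the real total turns out to be, the same split applies: even with priority access for the CPU, 80-90% of the bandwidth would remain available to the GPU.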
The design of motherboards for notebooks, desktops and servers is also under way, with PCIe 3.0 being touted as the main connectivity feature, alongside USB 3.0 and SATA 6Gbps. According to the information given, NVIDIA is going to attack all remaining fronts: light devices will be covered by lower-scaled Tegra 3 and 4 (smartphones, tablets, netbooks and thin-and-light notebooks), more powerful silicon will be used for lighter-use notebooks and desktops, while the server play is probably the most interesting strategy.
On one hand, within one generation of PD CPU designs, NVIDIA wants to enter the blade market and offer a powerful GPU. On the other, the company wants to get rid of the need for Intel or AMD x86 cores in its Tesla business (one Tesla GPGPU requires one x86 CPU, a Xeon or an Opteron, for "feeding"). Removing x86 from that equation will significantly boost the profit outlook, just as AMD and Intel enjoy charging an arm and a leg in the server market for the same silicon used in their desktop/notebook processors.