The most interesting parts deal with cache coherency, flow control, and inter-node SMP coherency. Servers built on these processors will let the CMT line reach beyond the T5440 into the mid-range and possibly the high-end server segments. With each processor having 16 cores and 8 threads per core, a 4-way server would have 64 cores and 512 threads. I would imagine that, with the split coherency plane and cross-bar switch, one may be able to glue together at least 4 UltraSPARC-RF processors at zero cost in specialized ASICs. This could significantly reduce the cost and complexity of platforms with up to 4 sockets. Beyond that, a specialized high-speed, low-latency interconnect would be required, as is the case on the T5440 today. It does raise the question of whether larger, modular SPARC servers could be built to scale out across some standardized interconnect, perhaps InfiniBand.
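The core and thread counts above are simple multiplication, sketched below for a few socket counts. The function name `smp_totals` is made up for illustration, and the defaults of 16 cores per socket and 8 threads per core are just the figures quoted above, not confirmed specifications.

```python
def smp_totals(sockets, cores_per_socket=16, threads_per_core=8):
    """Return (total cores, total hardware threads) for an SMP server.

    The defaults are the per-processor figures quoted in this post;
    treat them as assumptions rather than published specifications.
    """
    cores = sockets * cores_per_socket
    threads = cores * threads_per_core
    return cores, threads


if __name__ == "__main__":
    # Walk through the glueless configurations speculated about above.
    for sockets in (1, 2, 4):
        cores, threads = smp_totals(sockets)
        print(f"{sockets}-way: {cores} cores, {threads} threads")
```

For the 4-way case this gives the 64 cores and 512 threads mentioned above.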
Of course, the presentations leave out a lot of detail about the features of the Rainbow Falls processor. Some of the questions that come to mind are:
- The NCU is depicted in one slide. Is that the integrated 10Gb Ethernet controller?
- Is PCI-E v3.0 with PCI-IOV integrated? (The PCI-SIG is still working on this.)
- Given the distributed and even heat profile of the die, will the processor fit into a typical air-cooled envelope?
- What process is the processor implemented in? (32nm or smaller?)
- What is the power envelope?
- Will DDR3 or FB-DIMMs be used?
- How will the cache coherency affect cache thrashing in LDoms?
- How will the processor perform against T2 or SPARC64?
- How will it perform on single-threaded applications?
- What kind of clock speed would be deliverable with this architecture?
- Will this be manufactured by TI or TSMC?
