In recent years, data centers have become something of a ‘sweet spot’ for the technology sector, showing healthy revenue growth and attracting innovative new solutions in both hardware and software. As the use of cloud services continues to expand, and both Big Data analysis and the volume of unstructured data keep growing, data centers need to look at new ways of moving vast quantities of data more resource-efficiently, while making economical use of existing hardware assets.

One of the main principles guiding data center hardware architecture in recent years has been a reliance on commodity (rather than custom) hardware. Data centers have sought, where possible, to use moderately flexible, off-the-shelf hardware, customizing functionality in software to suit specific requirements.

Some degree of functional flexibility is obviously essential in data center acceleration. The real question is which tradeoffs can be justified: how much flexibility is needed, and how should it be implemented? How can data centers deliver that functionality at an acceptable price point, while continuing to maximize (and improve) performance, efficiency, and scalability?

Looking beyond the CPU

To date, many data centers have leaned on semi-generalist, CPU-based devices to fulfill these needs. After all, as well as being widely supported by software, x86-based products such as Intel’s Xeon series provide a considerable degree of functional flexibility.

The real drawback to this approach, however, has been the performance ceiling of commodity, CPU-oriented hardware. A CPU can handle control-plane functions and admin/protocol instructions, along with the word- and block-oriented processing found at the higher layers of the stack, with reasonable efficiency.

CPUs struggle, however, to process bit-intensive, packet-based workloads (OSI layer four and below) efficiently, and strain to deliver the kind of throughput needed at these lower layers. As data demands escalate, these limitations send power consumption through the roof while degrading the performance and scalability of the systems involved.
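
To make the scale of the problem concrete, here is a rough back-of-envelope calculation (the 100 Gbps link, minimum-size frames and 3 GHz core are illustrative assumptions, not figures for any particular product): at line rate, a single core has only a few tens of clock cycles available per packet.

/* Back-of-envelope sketch: per-packet CPU budget at 100 GbE line rate.
   All figures below are illustrative assumptions, not vendor data. */
#include <stdio.h>

int main(void) {
    const double link_bps   = 100e9;           /* 100 Gbps Ethernet link */
    const double frame_bits = (64 + 20) * 8.0; /* 64 B minimum frame + preamble/IFG overhead */
    const double cpu_hz     = 3e9;             /* a nominal 3 GHz CPU core */

    double pps            = link_bps / frame_bits; /* ~148.8 million packets per second */
    double ns_per_pkt     = 1e9 / pps;              /* ~6.7 ns per packet */
    double cycles_per_pkt = cpu_hz / pps;           /* ~20 cycles per packet on one core */

    printf("packet rate:  %.1f Mpps\n", pps / 1e6);
    printf("time budget:  %.1f ns per packet\n", ns_per_pkt);
    printf("cycle budget: %.0f cycles per packet\n", cycles_per_pkt);
    return 0;
}

Twenty-odd cycles is nowhere near enough for per-packet parsing, classification and tunneling work, which is why this class of processing tends to end up in dedicated or programmable hardware.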

Data centers need to enable flexibility while achieving the performance levels of dedicated network hardware. Of course, the option exists to create a custom, multicore CPU with built-in hardware accelerators. But this approach is likely to be expensive.

One potential solution is to build a Network Interface Card (NIC) around an FPGA, which could offer the required redefinition of logical functions without the massive overhead of CPU instruction sets. To support custom bit-intensive tasks, however, the FPGA in question would have to offer extremely high performance. Assuming that level of performance is available, such a device could handle these tasks in a redefinable, flexible way, while also delivering many of the advantages of a hard-wired, dedicated, custom-built solution.

Such a NIC could remove the need for CPUs to run packet-processing pipelines and the associated memory accesses (tasks to which they are not well suited). Instead, the FPGA could address system memory directly to handle protocol stack processing and physical layer translations. Compare this FPGA-based solution with most current NIC implementations, which shift RoCE processing onto the system software stack - a move that, unsurprisingly, has a detrimental effect on both performance and power.
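
To illustrate the CPU-bypass idea behind RoCE in general terms, the minimal sketch below uses the standard libibverbs API (it is not Achronix-specific, and the buffer size and access flags are arbitrary assumptions) to register an application buffer with an RDMA-capable NIC; once registered, the NIC can move that memory directly, without per-packet CPU copies. Connection setup and the RDMA operations themselves are omitted for brevity.

/* Minimal sketch, assuming an RDMA-capable NIC and rdma-core (libibverbs)
   are installed; build with: cc rdma_reg.c -libverbs */
#include <stdio.h>
#include <stdlib.h>
#include <infiniband/verbs.h>

int main(void) {
    int num = 0;
    struct ibv_device **devs = ibv_get_device_list(&num);
    if (!devs || num == 0) {
        fprintf(stderr, "no RDMA-capable devices found\n");
        return 1;
    }

    struct ibv_context *ctx = ibv_open_device(devs[0]);
    struct ibv_pd *pd = ctx ? ibv_alloc_pd(ctx) : NULL;
    if (!pd) {
        fprintf(stderr, "failed to open device / allocate protection domain\n");
        return 1;
    }

    /* Register an ordinary application buffer with the NIC. After this,
       the NIC can DMA to and from the buffer directly, so payload data
       never has to be copied by the host CPU. */
    size_t len = 1 << 20;                 /* 1 MiB, chosen arbitrarily */
    void *buf = malloc(len);
    struct ibv_mr *mr = ibv_reg_mr(pd, buf, len,
                                   IBV_ACCESS_LOCAL_WRITE |
                                   IBV_ACCESS_REMOTE_READ |
                                   IBV_ACCESS_REMOTE_WRITE);
    if (!mr) {
        fprintf(stderr, "memory registration failed\n");
        return 1;
    }

    /* A peer that learns this rkey and the buffer address can issue RDMA
       reads and writes that the NIC completes without involving this CPU. */
    printf("registered %zu bytes: lkey=0x%x rkey=0x%x\n", len, mr->lkey, mr->rkey);

    ibv_dereg_mr(mr);
    free(buf);
    ibv_dealloc_pd(pd);
    ibv_close_device(ctx);
    ibv_free_device_list(devs);
    return 0;
}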

As a proof of concept, Achronix has adopted exactly this FPGA-centric approach in its own NIC product, the Accelerator 6D. The FPGA at the heart of this NIC has a number of hard cores for memory management and L1/L2 Ethernet functions – six DDR3 controllers, two 10/40/100G Ethernet MACs and two PCIe Gen 3 controllers. As a result, the NIC can handle 100 Gbps of DDR bandwidth and 64 Gbps of PCIe bandwidth - enough to support 40GE NFV applications and high-performance OVS offload. As well as supporting regular networking and tunneling protocols for conventional north-south communications, the NIC can use RoCE/iWARP to bypass CPU overhead for east-west transactions between servers.

In my opinion, the world of data center acceleration is moving to FPGA-based solutions. FPGAs are something of a best-of-both-worlds option, offering versatility alongside performance, while removing any need to invest in full-custom hardware. By adopting FPGA-based NICs, data center architects can continue to make the best use of their existing commodity hardware while achieving their performance and cost goals.

Alok Sanghavi is the senior product marketing manager at Achronix Semiconductor Corporation