The data-centric era of the cloud
From an industry perspective, the cloud encompasses two sets of players. One set provides digital services ranging from eCommerce and search to digital media; these are born-in-the-cloud companies like Facebook and Netflix. The other set provides IaaS/PaaS, and includes Amazon Web Services, Azure, and Google Cloud.
Designing cloud systems for IaaS/PaaS is very different from designing them for SaaS. Traditionally, SaaS companies follow one of two approaches. One is to build solutions in a tiered fashion, partitioning infrastructure into components (storage, database, compute, etc.) based on how applications interact with the underlying architecture; all the components required to deliver the software service are then deployed in optimized configurations at hyperscale. The other approach is hyper-converged, in which the primary service takes precedence over the rest. For example, Search, being difficult to optimize, becomes the primary function the systems are tuned for, and the same systems are then used to support other services that are not performance-critical.
In many SaaS deployments, virtualization overhead can be avoided entirely; the hypervisor has performance implications for both compute and IO. For IaaS, each tenant expects a virtual private network. To enable this capability, the system must support packet encapsulation, encryption, tunneling, and other packet processing that traditionally consumes compute resources.
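To make the per-packet cost concrete, here is a minimal sketch of VXLAN-style encapsulation (the 8-byte header layout follows RFC 7348; the toy frame contents are assumptions for illustration). Every tenant packet needs work like this, which is why it consumes host CPU unless offloaded:

```python
import struct

VXLAN_FLAGS = 0x08  # "I" bit set: the VNI field is valid (RFC 7348)

def vxlan_encapsulate(inner_frame: bytes, vni: int) -> bytes:
    """Prepend an 8-byte VXLAN header to an inner Ethernet frame.

    A real virtual switch would also build outer UDP/IP/Ethernet headers
    and possibly encrypt the result -- all per-packet work that consumes
    host CPU cycles unless offloaded to a NIC or accelerator.
    """
    if not 0 <= vni < 2**24:
        raise ValueError("VNI must fit in 24 bits")
    # Header layout: flags(1) | reserved(3) | VNI(3) | reserved(1)
    header = struct.pack("!BBH", VXLAN_FLAGS, 0, 0) + vni.to_bytes(3, "big") + b"\x00"
    return header + inner_frame

frame = b"\x00" * 14 + b"payload"        # toy inner Ethernet frame (illustrative)
packet = vxlan_encapsulate(frame, vni=42)
```

At millions of packets per second per host, even a few hundred cycles of header construction and copying per packet adds up to whole cores, which is exactly the work IaaS providers want off the CPUs they sell.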
For IaaS/PaaS, revenue is directly tied to the HW compute instances CSPs can offer customers. Because of this direct revenue impact, many IaaS providers are adopting new acceleration approaches that deviate significantly from the standardized approach.
With the advent of deep learning, there is a deluge of data flowing into the cloud. CPU performance CAGR has not kept pace with the compute needed to extract insights and economic value from that data. With the economies of scale in the cloud, deployments now accelerate functions using FPGA or ASIC solutions rather than relying solely on general-purpose CPU cores. From offloading specialized compute workloads to accelerators to inline computational storage, there is a rapid transformation in how the cloud is growing. Until a couple of years ago, in most data centers the CPU was the heart of the system, connecting memory, accelerators, and devices. The CPU has traditionally acted as the glue combining devices and heterogeneous accelerators with general-purpose compute, and as the control plane managing data flow. Given the inherent limitations of the CPU SoC and ever-growing data, many cloud providers are rethinking data flow to improve performance and efficiency.
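A back-of-the-envelope sketch shows why compound growth rates matter here. The specific rates below are illustrative assumptions, not measurements; only the compounding arithmetic is the point:

```python
def cagr_factor(annual_rate: float, years: int) -> float:
    """Total growth factor after `years` of compounding at `annual_rate`."""
    return (1.0 + annual_rate) ** years

# Illustrative assumptions only: ~10%/yr CPU performance growth
# versus ~40%/yr growth in data to be processed.
cpu_growth = cagr_factor(0.10, 5)    # ~1.61x over 5 years
data_growth = cagr_factor(0.40, 5)   # ~5.38x over 5 years
gap = data_growth / cpu_growth       # ~3.3x shortfall to cover some other way
```

Whatever the exact rates, any sustained gap between the two curves compounds year over year, and accelerators are how providers close it.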
There is a rise in accelerator deployment, ranging from AI inference and training acceleration to packet processing. Because of the performance benefits, cloud architecture is bringing compute near data rather than following the traditional approach of bringing data to compute. This is the data-centric transformation now in play. From an IaaS perspective, offloading infrastructure work to accelerators is critical to making more cores available to customers for monetization. Other players in the cloud space are set to leverage similar opportunities.
Bringing data to compute or taking compute to data defines whether an architecture is compute-centric or data-centric. Today, more and more applications are data-centric in nature. Microservice-based SW architecture, led by Amazon and now adopted by many cloud companies, has produced solutions where data-centric architecture has significant performance upside. The CPU has inherent limitations around data movement: the cache hierarchy and memory system add latency. Accelerators like DMA engines exist to facilitate data movement. At the system level, workloads with a high IO-to-arithmetic ratio can be offloaded to free up cores for more arithmetic-intensive sequential activities.
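The offload-and-overlap pattern described above can be sketched in a few lines. This is a toy model: a thread pool stands in for a real DMA engine, and the workload sizes are arbitrary; the point is only that the "CPU" keeps doing arithmetic while bulk data movement happens elsewhere:

```python
from concurrent.futures import ThreadPoolExecutor

# A thread pool stands in for a DMA engine / offload accelerator:
# the "CPU" hands off bulk data movement and keeps doing arithmetic.
dma_engine = ThreadPoolExecutor(max_workers=2)

def bulk_copy(src: bytearray) -> bytes:
    """Simulated data movement (the IO-heavy part of the workload)."""
    return bytes(src)  # a real DMA engine would move this without CPU help

def arithmetic_work(n: int) -> int:
    """Simulated compute the freed-up core can do meanwhile."""
    return sum(i * i for i in range(n))

src = bytearray(b"x" * 1_000_000)
copy_future = dma_engine.submit(bulk_copy, src)  # issue the "DMA"
result = arithmetic_work(10_000)                 # overlap compute with the copy
dst = copy_future.result()                       # wait for completion
dma_engine.shutdown()
```

The higher the workload's IO-to-arithmetic ratio, the more of its runtime can be hidden behind the copy, which is why offloading data movement frees cores for the work customers actually pay for.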
In the personal computer space, new manufacturing processes and architectural improvements in the CPU laid the foundation for new use cases and applications; CPU performance improvement translated directly into bottom-line economic value. In the data center context, CPU performance improvement is still important, but many other aspects of system architecture are growing in importance as drivers of performance and value.
With the cloud becoming the new computer of the masses, it will be interesting to see how it evolves.
Abbreviations used:
HW - Hardware
IaaS - Infrastructure as a Service
PaaS - Platform as a Service
SaaS - Software as a Service
IO - Input/Output (e.g., PCIe)
DMA - Direct Memory Access
AI - Artificial Intelligence
SoC - System on Chip
CPU - Central Processing Unit
CAGR - Compound Annual Growth Rate
CSP - Cloud Service Provider