Optimizing for Performance
Getting the best out of existing resources
I have been an avid cyclist for a few years and often head out for long rides. When I started experiencing upper-body discomfort on those rides, I decided to consult a professional bike fitter. A bike fitter is a specialist who adjusts a bike's parameters by measuring the rider's body, assessing their flexibility, and filming them as they ride. After collecting the data, the fitter recommends adjustments to standard parameters such as seat height, handlebar position, and cleat placement to improve comfort and efficiency. As the specialist fine-tuned my bike, I was reminded of my days in semiconductor manufacturing, where my colleagues and I frequently used Design of Experiments (DoE), a statistical method for planning and analyzing controlled tests to understand how different parameters affect an output. The semiconductor industry is built on continuous optimization and on getting the most out of existing resources. This optimization process is now being taken to new levels with the help of AI.
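To make the DoE idea concrete, here is a minimal sketch of a two-level full factorial experiment in Python. The three factors, the coded levels (-1 for low, +1 for high), and the `comfort_score` response function are all invented for illustration; a real study would measure the response rather than compute it from a formula.

```python
from itertools import product

# Hypothetical factors, each tested at a coded low (-1) and high (+1) level.
FACTORS = ["seat_height", "handlebar_reach", "cleat_offset"]


def comfort_score(seat_height, handlebar_reach, cleat_offset):
    """Made-up response used in place of a real measurement.

    It includes an interaction between seat height and handlebar reach.
    """
    return (50 + 4 * seat_height - 3 * handlebar_reach + 1 * cleat_offset
            + 2 * seat_height * handlebar_reach)


def full_factorial(n_factors):
    """Return every combination of low/high levels: 2**n_factors runs."""
    return list(product((-1, 1), repeat=n_factors))


def main_effects(runs, responses):
    """Main effect = mean response at the high level minus mean at the low level."""
    effects = {}
    for i, name in enumerate(FACTORS):
        high = [y for run, y in zip(runs, responses) if run[i] == 1]
        low = [y for run, y in zip(runs, responses) if run[i] == -1]
        effects[name] = sum(high) / len(high) - sum(low) / len(low)
    return effects


runs = full_factorial(len(FACTORS))
responses = [comfort_score(*run) for run in runs]
effects = main_effects(runs, responses)

for name, effect in effects.items():
    print(f"{name}: {effect:+.1f}")
```

Because the design is balanced, each main effect is estimated from all eight runs at once, and the same runs could also be used to estimate the interaction between two factors.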
Chiplets: Heterogeneous Packaging of Disaggregated Silicon
A recent EE Times conference on chiplets (The Future of Chiplets) highlighted how different players in the semiconductor industry approach the same problem and optimize their solutions according to their needs. The objective of the conference was to understand the current state of the chiplet ecosystem and how to make it accessible to all in the future. It was an insightful forum, with a focus on design and architecture. Multi-die systems with heterogeneous integration, also known as chiplet systems, are already in commercial devices, driven by the insatiable performance demands of AI models and the slowing of Moore’s Law.
The benefits of a chiplet system are well-known:
Chiplets are common in data servers; however, they are currently designed by the same companies that design CPUs and GPUs. The ultimate goal is to create a "Chiplet-Market-Place," a multi-vendor ecosystem where any company, regardless of its resources, can select specific I/O chips, memory, CPUs, and GPUs to build a custom multi-chip processor for its unique applications.
Challenges for a Multi-Vendor Ecosystem
Achieving this democratization of the semiconductor industry is a complex process. Most chiplet designs utilize the ARM architecture and combine chips of various sizes, each made with a different process node, to create a sophisticated system. For a chiplet marketplace to become a reality, several hardware and software issues must be addressed.
Key challenges include:
The Industry's Path to a Multi-Vendor Ecosystem
At the conference, companies including Arm, Synopsys, Cadence, and Siemens discussed their tools and readiness to custom-design chiplet processors for AI. Since no discussion of AI hardware is complete without mentioning memory, several solutions were proposed. Alphawave suggested that in High-Bandwidth Memory 4 (HBM4), the die-to-die (D2D) interface will be integrated directly into the base logic die of the HBM DRAM stack and into the compute die. This base logic die will gain additional functions and use a FinFET process, which will require collaboration between memory and logic manufacturers. These changes aim to reduce latency and increase bandwidth.
Arteris presented its platform for managing Non-Uniform Memory Access (NUMA) and cache coherence in chiplets. Both are critical concepts in modern multiprocessing architectures. NUMA lets a system scale the number of processors without overloading a single memory bus: each processor has its own local memory and can also access memory attached to another CPU, albeit with higher latency. This is only effective if the interconnects are efficient. Cache coherence ensures that all processors see a consistent view of shared data, even when copies are held in several caches.
Two companies, Eliyan and d-Matrix, are exploring alternatives to HBM on silicon interposers by stacking memory dies on organic interposers. This could lower costs and increase the number of memory stacks connected via a fan-out process. d-Matrix also proposed using its own bridges to connect packages for extremely low-latency communication.
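The NUMA trade-off mentioned above (remote memory is reachable but slower) can be illustrated with a simple weighted-average model. This is a sketch only: the latency figures and the fraction of local accesses are assumed values chosen for illustration, not measurements from any vendor.

```python
def average_memory_latency_ns(local_ns, remote_ns, local_fraction):
    """Weighted average latency when a fraction of accesses hit local memory.

    local_fraction is the share of accesses served by memory attached to
    the requesting processor; the rest cross the interconnect.
    """
    if not 0.0 <= local_fraction <= 1.0:
        raise ValueError("local_fraction must be between 0 and 1")
    return local_fraction * local_ns + (1.0 - local_fraction) * remote_ns


# Assumed, illustrative numbers (not from the conference).
LOCAL_NS = 80.0
REMOTE_NS = 140.0

mostly_local = average_memory_latency_ns(LOCAL_NS, REMOTE_NS, 0.75)
mostly_remote = average_memory_latency_ns(LOCAL_NS, REMOTE_NS, 0.25)

print(f"75% local accesses: {mostly_local:.1f} ns")
print(f"25% local accesses: {mostly_remote:.1f} ns")
```

In this model, a faster interconnect lowers `remote_ns`, which narrows the gap between the two cases. That is one way to read the point that NUMA is only effective when the interconnects are efficient.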
Standardizing interfaces and using memory effectively were dominant themes at the conference. With multiple chips needing to work together, I/O chiplets are becoming more critical. Companies like Lightmatter, Xscape Photonics, and Ayar Labs presented their optical I/O solutions. Ayar Labs identifies the laser unit as the main barrier to adopting co-packaged optics and proposes an optical I/O chiplet (TeraPHY) with an external laser source. This design allows a rack to have multiple optical I/Os connected to a single 16-wavelength light source. Lightmatter believes chip shoreline, the die-edge length available for I/O, is a key limitation and advocates a 3D stacked approach (GEN 4 3D interposer M1000). Xscape Photonics argues that current laser arrays cannot scale in power and bandwidth and is developing a Si-photonic chip (CHROMX Platform) that can provide 128 wavelengths from a single chip. This chip contains eight laser units, each emitting sixteen wavelengths for efficient wavelength division multiplexing (WDM).
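The wavelength count quoted above follows from simple multiplication, and the same arithmetic shows how WDM scales aggregate bandwidth. In the sketch below, the unit and wavelength counts come from the text, while the per-wavelength data rate is an assumed placeholder, since no figure was given.

```python
# Figures stated in the text.
LASER_UNITS = 8
WAVELENGTHS_PER_UNIT = 16

# Assumption for illustration only: the text gives no per-wavelength data rate.
ASSUMED_GBPS_PER_WAVELENGTH = 32


def total_wavelengths(units, per_unit):
    """Number of independent WDM channels available from one chip."""
    return units * per_unit


def aggregate_gbps(wavelengths, gbps_per_wavelength):
    """Aggregate bandwidth if every wavelength carries the same data rate."""
    return wavelengths * gbps_per_wavelength


wavelengths = total_wavelengths(LASER_UNITS, WAVELENGTHS_PER_UNIT)
bandwidth = aggregate_gbps(wavelengths, ASSUMED_GBPS_PER_WAVELENGTH)

print(f"{wavelengths} wavelengths, {bandwidth} Gb/s aggregate (assumed rate)")
```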
Other participants included MIPS Tech, which has enabled a chiplet design platform based on the RISC-V architecture, and Athos Silicon, which is applying the chiplet concept to the automotive industry. For automotive applications, where reliability is paramount, Athos Silicon suggests including multiple redundant copies of critical chiplets within a system so that workloads can be rerouted if one fails. Sagence AI and PseudolithIC are creating analog devices using chiplet concepts. PseudolithIC fabricates transistors in III-V compounds and places them into trenches on a silicon wafer before completing the standard Si-CMOS processing. This fabrication process is an alternative to growing GaN on Si.
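The redundancy idea described above for automotive systems can be sketched as a small failover model. This is not Athos Silicon's implementation; the class, the chiplet names, and the least-loaded placement rule are all hypothetical, chosen only to show workloads moving to a healthy replica when a chiplet fails.

```python
class RedundantChipletPool:
    """Hypothetical pool of interchangeable chiplets with workload failover."""

    def __init__(self, chiplet_ids):
        self.healthy = list(chiplet_ids)
        self.assignments = {}  # workload name -> chiplet id

    def _least_loaded(self):
        """Pick the healthy chiplet that currently runs the fewest workloads."""
        if not self.healthy:
            raise RuntimeError("no healthy chiplet left")
        load = {chiplet: 0 for chiplet in self.healthy}
        for chiplet in self.assignments.values():
            load[chiplet] += 1
        return min(self.healthy, key=lambda chiplet: load[chiplet])

    def assign(self, workload):
        chiplet = self._least_loaded()
        self.assignments[workload] = chiplet
        return chiplet

    def fail(self, chiplet):
        """Mark a chiplet as failed and reroute its workloads to the others."""
        self.healthy.remove(chiplet)
        orphaned = [w for w, c in self.assignments.items() if c == chiplet]
        for workload in orphaned:
            del self.assignments[workload]
        for workload in orphaned:
            self.assign(workload)
        return orphaned


pool = RedundantChipletPool(["cpu0", "cpu1", "cpu2"])
for task in ("braking", "steering", "perception"):
    pool.assign(task)

moved = pool.fail("cpu0")
print("rerouted:", moved)
print("assignments:", pool.assignments)
```

A production system would also need failure detection and state transfer, which this sketch deliberately leaves out.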
An Emerging IP Field
The new paths emerging in the chiplet ecosystem suggest a vibrant patent landscape. Creating a patent landscape for chiplets is challenging because the concept is a subset of System-in-Package (SiP), and most 2.5D and 3D packaging patents also apply to the chiplet ecosystem. The idea of disaggregating silicon by separating I/O, CPUs, and memory gained traction around 2020, initially with major players like Intel, TSMC, and Nvidia. Over the past two years, the focus has shifted to design and architecture, attracting new players like EDA tool vendors and photonics companies. It remains a nascent field.
Improving the Status Quo
Many players are collaborating to bring the Chiplet-Market-Place concept to reality. With so many contributors, hardware, software, and design processes will likely be optimized soon, paving the way for new AI applications. In the future, when I buy my next bike, an AI platform may measure my body dimensions, take my preferences into account, and instantly generate the specifications for a bike that fits me perfectly. Such a system would replace the linear, "one factor at a time" adjustments that humans make today, and it would probably run on a Chiplet-Market-Place system. The crux of human evolution is the process of monitoring and optimizing the world we live in, whether it be society, technology, the environment, or personal life. Progress is always a source of encouragement, so let’s continue to ride towards the future…