The Rise of High Bandwidth Memory (HBM): Revolutionizing GPU Performance

Introduction

Graphics Double Data Rate (GDDR) and High Bandwidth Memory (HBM) stand at the forefront of video RAM (VRAM) technology, catering to the ever-growing demands of high-performance GPUs. While both provide crucial memory solutions, they differ significantly in bandwidth, power consumption, and accessibility, with HBM emerging as a game-changer in CPU/GPU memory.

GDDR, renowned for its high bandwidth, has long been a staple in the GPU market. However, its high power consumption poses challenges, particularly in energy-sensitive applications. In contrast, HBM represents a paradigm shift: it offers even greater bandwidth at significantly lower power, making it an optimal choice for cutting-edge graphics cards.

HBM-equipped GPUs, characterized by superior performance and efficiency, occupy a distinct niche in the market. Found primarily in flagship accelerators such as NVIDIA’s H100 and A800 40GB Active and other data center GPUs, these cards command a premium for their advanced capabilities. Notably, because the memory is integrated directly into the GPU package, capacities are uniform for a given GPU model, although vendors can disable portions of the memory as needed.

What is HBM?

HBM, short for High Bandwidth Memory, represents a groundbreaking innovation in CPU/GPU memory chips. Unlike traditional DDR memory, HBM adopts a vertically stacked architecture, akin to a “skyscraper design” compared to DDR’s “single-story house” layout. This configuration allows for greater capacity and bandwidth, revolutionizing performance capabilities.

To elaborate, HBM stacks multiple DRAM dies on top of one another and packages them alongside the GPU, forming a high-capacity, high-bandwidth memory array. Take AMD’s latest MI300X chip layout as an example: the GPU sits in the center, and the four small dies on each side are the stacked HBM chips. Presently, GPUs commonly feature stacks of 2 to 8 DRAM layers, with the tallest stacks reaching 12 layers.

One might question whether HBM’s distinction from DDR, a shift from “single-story” to “skyscraper” architecture, constitutes true innovation. In reality, producing HBM entails intricate complexity. Constructing a skyscraper is inherently more challenging than building a single-story house, requiring comprehensive redesign from the foundation to the wiring. Similarly, HBM’s architecture necessitates redesigns in signal transmission, instruction handling, and power delivery, and it demands far more of packaging technology.

As depicted in the illustration, HBM stacks DRAM dies using Through-Silicon Via (TSV) technology, which provides the vertical connections between dies. Below the DRAM sits the logic die, responsible for managing the DRAM. The GPU and the DRAM stack are interconnected via micro-bumps (uBumps) and an Interposer, a silicon layer that provides the interconnect routing. The Interposer, in turn, links to the Substrate (packaging substrate) through Bumps, and the substrate finally connects to the PCB through Ball Grid Array (BGA) balls.

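Schematically, the arrangement described above can be sketched like this (a simplified cross-section, top to bottom, not to scale):

+-----------+
| DRAM die  |  <- stacked dies, linked by TSVs
| DRAM die  |
| DRAM die  |
| Logic die |               +---------+
+-----------+               |   GPU   |
  (uBumps)                  +---------+
+-------------------------------------------+
|                Interposer                 |
+------------------(Bumps)------------------+
|                 Substrate                 |
+---------------(BGA balls)-----------------+
|                    PCB                    |
+-------------------------------------------+
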
HBM stacks are tightly coupled to the processor through this intermediary layer, giving them characteristics close to on-chip RAM. The setup increases the IO count dramatically. HBM also optimizes memory power efficiency, delivering more than three times the bandwidth per watt of GDDR5. Moreover, HBM’s space-saving design shrinks the footprint by 94% compared to GDDR5, offering an efficient solution without bulky memory modules. Given its technical complexity, HBM is the flagship product showcasing storage manufacturers’ technological prowess.
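
The arithmetic behind the bandwidth claim is simple: peak bandwidth is bus width times per-pin data rate. A minimal sketch in Python, assuming commonly cited figures (a 1,024-bit interface at 1 Gbps per pin for first-generation HBM, a 32-bit interface at 7 Gbps per pin for a GDDR5 chip):

# Peak bandwidth in GB/s = bus width (bits) * per-pin rate (Gbps) / 8.
def peak_bandwidth_gbs(bus_width_bits: int, pin_rate_gbps: float) -> float:
    return bus_width_bits * pin_rate_gbps / 8

gddr5_chip = peak_bandwidth_gbs(32, 7.0)    # ~28 GB/s per GDDR5 chip
hbm1_stack = peak_bandwidth_gbs(1024, 1.0)  # 128 GB/s per HBM1 stack
print(f"GDDR5 chip: {gddr5_chip:.0f} GB/s, HBM1 stack: {hbm1_stack:.0f} GB/s")

Even at a far lower per-pin rate, the very wide interface gives each HBM stack several times the bandwidth of a GDDR5 chip, and running many slow pins rather than few fast ones is where the bandwidth-per-watt advantage comes from.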

Why HBM for GPU?

The primary goal of HBM is to provide GPUs and other processors with far greater memory bandwidth and capacity. As GPUs grow increasingly powerful, faster access to data in memory becomes imperative to reduce processing times, especially in AI and visual applications with demanding computational and bandwidth requirements. Overcoming the “memory wall” is crucial, making memory bandwidth enhancement a pivotal focus in chip storage development.

Advanced semiconductor packaging presents an opportunity to address memory access barriers hindering high-performance computing applications. Challenges related to memory latency and density can be resolved at the packaging level. Through heterogeneous integration routes, memory designers can incorporate more memory closer to the processor, addressing contemporary memory challenges faced by modern processors and embedded systems.

Once the dies are stacked, the interface widens dramatically, and the number of interconnect points beneath the stack far surpasses that of DDR connected to a CPU. Consequently, compared with traditional memory technologies, HBM offers higher bandwidth, a higher IO count, lower power consumption, and a smaller form factor.

HBM Generations

Currently, HBM products are evolving in successive iterations: HBM (1st generation), HBM2 (2nd generation), HBM2E (3rd generation), HBM3 (4th generation), and HBM3E (5th generation). Each iteration brings improvements in speed. For instance, the per-pin data transfer rate has evolved from 1 Gbps in first-generation HBM to 9.8 Gbps in the latest HBM3E, letting a single stack transfer up to 1.225TB of data per second.
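
Per-stack bandwidth follows directly from the 1,024-bit stack interface and the per-pin rate. A rough tour of the generations, assuming commonly cited peak pin rates (actual parts vary by vendor and speed grade):

# Peak per-stack bandwidth across HBM generations (illustrative pin rates).
BUS_WIDTH_BITS = 1024  # each HBM stack to date exposes a 1,024-bit interface

pin_rates_gbps = {
    "HBM":   1.0,  # 1st generation
    "HBM2":  2.4,  # 2nd generation
    "HBM2E": 3.6,  # 3rd generation
    "HBM3":  6.4,  # 4th generation
    "HBM3E": 9.8,  # 5th generation, fastest announced grade
}

for gen, rate in pin_rates_gbps.items():
    print(f"{gen:6s}: {rate:3.1f} Gbps/pin -> {BUS_WIDTH_BITS * rate / 8:6.1f} GB/s per stack")

At 9.8 Gbps this computes to roughly 1.25 TB/s peak; shipping parts quote per-stack figures in the 1.15 to 1.23 TB/s range depending on speed grade.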

Moreover, memory capacities continue to expand, with HBM2E offering a maximum capacity of 16GB. Using its fourth-generation 10nm-class (14nm) DRAM process, which relies on Extreme Ultraviolet (EUV) lithography, Samsung is currently manufacturing 24GB HBM3 chips. Stacking 12 layers in HBM3E pushes capacity to a staggering 36GB, 50% larger than HBM3 and the largest in the industry.
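
The capacity arithmetic is just stack height times per-die density. A quick check, assuming the 24 Gb (3 GB) DRAM dies used in current HBM3E:

# HBM stack capacity = number of stacked DRAM dies * per-die density.
DIE_DENSITY_GBIT = 24            # 24 Gb DRAM die
die_gb = DIE_DENSITY_GBIT / 8    # = 3 GB per die

for layers in (8, 12):
    print(f"{layers}-high stack: {layers * die_gb:.0f} GB")
# 8-high  -> 24 GB, matching the HBM3 parts above
# 12-high -> 36 GB, the 50% uplift cited for HBM3E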

Previously, SK Hynix and Micron announced the launch of HBM3E chips, capable of delivering over 1TB/s bandwidth. Concurrently, Samsung has unveiled plans for HBM4 memory, incorporating advanced chip manufacturing and packaging technologies. While the specifications for HBM4 remain uncertain, industry sources indicate a move towards a 2048-bit memory interface and FinFET transistor architecture for power efficiency. Samsung aims to upgrade wafer-level bonding technology from bump-based to bumpless direct bonding, potentially elevating the cost of HBM4.

HBM Market Status

Amid the technological revolution spurred by generative AI, the High Bandwidth Memory (HBM) market has emerged as a dynamic force, reshaping the landscape of high-bandwidth storage and drawing significant attention from the industry giants navigating shifting market trends. HBM’s prominence in recent earnings calls underscores its pivotal role in the current market.

Yole Group’s latest analysis forecasts a substantial uptick in HBM’s share of total DRAM shipments, fueled by burgeoning demand for artificial intelligence servers: from 2% in 2023 to 6% by 2029. Despite HBM’s premium price compared to DDR5, its revenue is anticipated to climb from approximately $5.5 billion in 2023 to $14 billion in 2024, an annual surge exceeding 150%, and to reach a staggering $38 billion by 2029.

Acknowledging this growing demand, memory suppliers have ramped up HBM wafer production, with estimated output escalating from 44,000 wafers per month (WPM) in 2022 to 74,000 WPM in 2023, poised to reach 151,000 WPM by 2024.

The fiercely competitive HBM market is dominated by a triumvirate of major players, and the disparity among them is stark. SK Hynix leads the charge, commanding both technological prowess and market share and reaping substantial profits. Samsung follows, aggressively expanding its market footprint. Micron trails with a smaller share stemming from earlier strategic missteps; it is endeavoring to bridge the gap but faces challenges in generating significant short-term profits.

Recent results affirm this divergence in performance. SK Hynix’s confirmation of record-breaking HBM sales in recent months, which propelled fourth-quarter profitability, underscores the company’s dominance. Vice President Kim Ki-tae attributes the exponential growth in HBM demand, now the preferred memory for AI, to the rapid proliferation and evolution of generative AI services.

Notably, SK Hynix has already sold out its HBM inventory for the year, signaling robust demand even in the early stages of 2024, and is already planning for the 2025 market. Competitors Samsung and Micron, meanwhile, confront mounting pressure; alongside their pursuit of technological advancement and market share, they are contemplating alternative strategies for penetrating the AI market that bypass conventional HBM technology.

Research firm TrendForce underscores the escalating demand for HBM, with a projected near-60% growth in 2023 and an additional 30% surge anticipated in 2024. While supply shortages may persist in 2023, a potential equilibrium in supply and demand is expected by 2024.

Despite its meteoric rise, HBM still occupies a relatively modest share of the overall storage market, remaining a niche product. The competitive landscape is delineated by SK Hynix, Samsung, and Micron, with SK Hynix extending its market dominance in 2023 and poised to stretch its lead further.

Future Potential of HBM

The technological innovation landscape, marked by the rise of AI large models and intelligent driving, is propelling the demand for high-bandwidth memory (HBM) to unprecedented heights.

Primarily, the demand for AI servers is poised for a remarkable surge over the next two years, with visible momentum already shaping the market landscape. These servers, equipped with GPUs capable of processing vast datasets within tight timeframes, drive substantial increases in data processing and transmission rates. Consequently, the quest for enhanced bandwidth has rendered HBM virtually indispensable in the realm of AI servers.

Beyond AI servers, the automotive industry presents another fertile ground for the application of HBM. With vehicles increasingly brimming with cameras, each generating staggering data rates, the need for swift data transmission within vehicles is paramount. Herein lies HBM’s prowess, offering substantial bandwidth advantages to facilitate seamless data flow.

Furthermore, the realms of Augmented Reality (AR) and Virtual Reality (VR) stand as promising domains for HBM’s future dominance. These immersive technologies demand high-resolution displays, necessitating heightened bandwidth for efficient data transfer between the GPU and memory. Moreover, the real-time processing demands of AR and VR underscore the critical role of HBM’s ultra-high bandwidth.

Moreover, the perpetually growing demand for smartphones, tablets, gaming consoles, and wearable devices underscores the need for advanced memory solutions. As these devices evolve to accommodate ever-increasing computational requirements, HBM emerges as a pivotal enabler of their performance. Additionally, the advent of disruptive technologies like 5G and the Internet of Things (IoT) further amplifies the call for HBM’s capabilities.

As the AI revolution gathers momentum, the prominence of HBM is anticipated to intensify further in the future. According to forecasts by semiconductor-digest, the global high-bandwidth memory market is poised for exponential growth, projected to soar from $293 million in 2022 to $3.434 billion by 2031. With a compound annual growth rate of 31.3% anticipated during the forecast period spanning from 2023 to 2031, HBM is positioned as a cornerstone technology driving innovation across diverse sectors.
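
As a quick sanity check, the implied compound annual growth rate of that forecast can be recomputed directly (a minimal sketch; the small gap versus the quoted 31.3% comes from rounding and from the forecast period starting in 2023):

# Recompute the implied CAGR of the HBM market forecast quoted above.
start_value = 0.293  # $293 million in 2022, in $ billions
end_value = 3.434    # $3.434 billion in 2031
years = 2031 - 2022  # 9-year span

cagr = (end_value / start_value) ** (1 / years) - 1
print(f"Implied CAGR: {cagr:.1%}")  # ~31.5%, in line with the quoted 31.3%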

PS: Unlike conventional computer RAM, GPU memory such as HBM cannot be bought, sold, or upgraded independently; it is packaged together with the GPU.