February 8, 2025 Financial Directions

Surging Demand for AI and Expanding HBM Options


The world of technology is witnessing unprecedented advancements, with High Bandwidth Memory (HBM) becoming a cornerstone of today's computing landscape. Manufacturing reliable, high-yield 3D DRAM stacks is both intricate and financially demanding. As businesses strive to keep pace with the relentless growth of data processing needs driven by cutting-edge artificial intelligence (AI) accelerators, graphics processing units (GPUs), and high-performance computing applications, demand for HBM has surged dramatically, making it a central focus for companies across the globe.

The explosion in HBM's popularity is largely attributable to the expansive development of, and investment in, large language models such as ChatGPT. These models demand vast amounts of data storage and processing power, overwhelming existing HBM inventory. HBM stores a significant portion of the data required to build these AI models, and its appeal is further buoyed by the need to increase density through additional layers and by the scaling limits of Static Random Access Memory (SRAM).

According to Niraj Patel, Senior Vice President and General Manager of Rambus’s silicon IP division, addressing the bottlenecks of memory bandwidth and capacity is crucial to meeting the real-time performance requirements essential for AI training and inference.

The growth in HBM adoption is also empowered by advanced packaging technologies, which often provide shorter, faster, and more robust data pathways than traditional system-on-chip (SoC) designs. Ken Yang, head of investor relations at ASE, highlighted the thriving landscape of advanced packaging technology in a recent earnings call. He expressed optimism about the prospects for advanced interconnect technologies across various sectors, from AI and networking to other emerging products.

This is precisely where HBM comes into play. Jin Yundong, Vice President of Samsung Electronics' DRAM Product Planning, pointed out that a significant shift toward customized HBM is on the horizon to meet the growing demand for AI infrastructure, which requires exceptional efficiency and scalability.

Collaboration with major clients in the industry amplifies the belief that tailored HBM solutions will be critical. Key metrics such as power, performance, and area (PPA) are vital considerations in AI solutions, and customization promises substantial value in these areas.

Historically, the widespread adoption of HBM was hindered by economic factors: the high cost of silicon interposers and the expense of handling numerous Through-Silicon Vias (TSVs) in front-end-of-line (FEOL) production. As demand for High-Performance Computing (HPC), artificial intelligence, and machine learning skyrockets, interposer dimensions have increased significantly. Li Hongcao, Senior Director of Engineering and Technology Marketing at ASE, commented that the costs associated with 2.5D silicon interposer TSV technology remain a significant drawback.

While these economic challenges limit HBM's appeal in mass-market applications, demand remains robust in less cost-sensitive sectors such as data centers.

HBM's unmatched bandwidth relative to any other memory technology solidifies its position, and 2.5D integration on silicon interposers, using microbumps and TSVs, has become the de facto standard.

Driven by clients seeking enhanced performance, HBM manufacturers are considering improvements to bumps, underfill, and molding materials, which has led to ambitious advancements in transitioning from 8-layer to 12-layer and even 16-layer DRAM modules capable of moving data at unprecedented speeds. For instance, HBM3E modules deliver per-stack bandwidth in excess of 1.2 TB/s, and HBM4 is expected to roughly double that by increasing the number of data lines from 1,024 in HBM3 to 2,048.
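As a rough illustration (not from the article), peak per-stack bandwidth follows directly from the number of data lines and the per-pin signaling rate; the 8 Gb/s HBM4 pin rate below is an assumed round figure, not a published specification:

```python
# Peak HBM stack bandwidth: data lines x per-pin rate, divided by 8 bits per byte.
def stack_bandwidth_gbs(data_lines: int, gbps_per_pin: float) -> float:
    """Peak per-stack bandwidth in GB/s."""
    return data_lines * gbps_per_pin / 8

# HBM3: 1,024 data lines at 6.4 Gb/s per pin -> ~819.2 GB/s
print(stack_bandwidth_gbs(1024, 6.4))
# HBM3E: same 1,024 lines at a faster 9.6 Gb/s per pin -> ~1,228.8 GB/s
print(stack_bandwidth_gbs(1024, 9.6))
# HBM4: doubling the lines to 2,048 (8 Gb/s per pin is an assumed figure)
print(stack_bandwidth_gbs(2048, 8.0))
```

At a fixed pin rate, doubling the data lines doubles the bandwidth, which is why the HBM3-to-HBM4 interface widening matters so much.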

Currently, three companies dominate the HBM memory module market: Micron, Samsung, and SK Hynix. Although each takes a unique approach to reliably delivering its DRAM stacks and packaging solutions for integration into advanced packages, the general methodologies differ only slightly.


Both Samsung and Micron apply Non-Conductive Film (NCF) and Thermal Compression Bonding (TCB) at every bump layer, while SK Hynix continues to leverage a mass-reflow molded-underfill (MR-MUF) process, effectively sealing stacks with a high-conductivity molding material.

Vertical connectivity in HBM is provided by copper TSVs and scaled microbumps between stacked DRAM chips. A buffer or logic chip supplies the data path for each DRAM stack, and reliability issues largely stem from thermal-mechanical stress during the reflow, bonding, molding, and backgrind processes. To identify potential issues, rigorous tests such as High-Temperature Operating Life (HTOL), Temperature and Humidity Bias (THB), and temperature cycling are employed. Additional assessments are required to ensure that microbumps do not develop short circuits, metal bridging, or interface delamination between chips and microbumps, especially as HBM moves toward more complex versions such as HBM4, which may adopt hybrid bonding as an alternative to microbumps if yield targets cannot otherwise be met.

Another groundbreaking development in this sector is the evolution of 3D DRAM devices.

Much like 3D NAND technology, these devices position storage cells sideways. Jin Yundong of Samsung indicated that “3D DRAM stacks will significantly diminish power consumption and area while eliminating the performance bottlenecks posed by interposers.” Moving memory controllers from SoCs to the substrate itself will free up much-needed logic area for AI functionality. The consensus is that customized HBM will usher in a new era of performance and efficiency, with tightly integrated memory and processing capabilities accelerating time-to-market for large-scale implementations and ensuring the highest quality standards.

The overarching trend is to place logic closer to memory, allowing more processing to occur within or adjacent to memory instead of transporting data to one or more processing elements. This approach, while appealing, introduces complexities that are far from straightforward from a system design perspective.

As noted by CheePing Lee, Senior Director of Advanced Packaging Technology at Lam Research, “This is a thrilling time; AI is booming, and HBM is at the core of it all.” He described the race among memory manufacturers to be first to produce the next generation of HBM, specifically HBM4, as the JEDEC organization finalizes standards for these next-generation memory modules.

Concurrently, JEDEC has expanded the maximum memory module thickness for HBM3E from 720 to 775 micrometers, while still accommodating thinned active dies on the order of 40 micrometers. The HBM standards define critical parameters including transmission rate per pin, maximum chip count per stack, package capacity (in GB), and bandwidth. The associated streamlining of design and manufacturing processes is poised to expedite the market entry of HBM products, with new generations emerging roughly every two years. The upcoming HBM4 standard is expected to define configurations of 24Gb and 32Gb layers in 4-high, 8-high, 12-high, and 16-high TSV stacks.
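The capacity of each configuration follows mechanically from die density and stack height; a minimal sketch of that arithmetic (the configuration list mirrors the expected HBM4 options above):

```python
# Per-stack capacity: die density (Gbit) x number of stacked dies, / 8 bits per byte.
def stack_capacity_gb(die_gbit: int, stack_height: int) -> float:
    """Per-stack capacity in GB."""
    return die_gbit * stack_height / 8

# Enumerate the expected HBM4 combinations of die density and stack height:
for die in (24, 32):
    for high in (4, 8, 12, 16):
        print(f"{die}Gb x {high}-high = {stack_capacity_gb(die, high):.0f} GB")
# e.g. a 32Gb die in a 16-high stack yields a 64 GB module
```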

The evolution of high-bandwidth memory traces back to research and development efforts begun in 2008, aimed at addressing the growing power consumption and physical footprint of computing memory. At the time, GDDR5 was the pinnacle of DRAM, topping out at 28GB/s (7Gbps/pin x 32 I/O). HBM Gen2, however, raised the I/O count to 1,024 while lowering the per-pin rate to 2.4Gbps, achieving a staggering 307.2GB/s.

With HBM2E, advancements included 17nm high-k metal gate technology, offering 3.6Gbps per pin and a throughput of 460.8GB/s. Today, HBM3 delivers 6.4Gbps per pin and up to 12-chip stacks, nearly doubling the bandwidth of its predecessor.

As HBM processes evolve, they increasingly converge with processing technologies, unlocking various process options along the way. Reflow soldering remains the most established and economically viable soldering method. Curtis Zwinge, VP of Engineering and Technology Marketing at Amkor, noted that given its lower capital expenditure and inherent cost, reflow soldering is preferred whenever feasible. However, as performance expectations rise and heterogeneous integration (HI) modules and advanced substrates are introduced, increased warpage presents an evolving challenge. Innovations such as Thermal Compression Bonding and Reverse Laser-Assisted Bonding (R-LAB) have emerged as process enhancements to handle the greater warpage observed in HI modules and packaging layers.

Microbump metallization techniques have also been optimized to bolster reliability.

When interconnections rely on conventional reflow soldering with solder paste and underfill at fine pitch, underfill voids and residual solder paste debris can become trapped around the bumps. To mitigate these issues, pre-applied NCF can replace the separate solder paste, underfill, and bonding steps, realizing a one-step bonding process free of underfill voids and solder paste residue.

Samsung’s use of thin NCF for thermal compression bonding in its 12-layer HBM3E keeps the package within the height of an 8-layer stack while achieving bandwidth of up to 1,280 GB/s and a capacity of 36 GB. NCF is essentially a resin containing curing agents and various additives. As the industry pushes to reduce warpage from thinned die, Samsung tunes the thickness of the NCF material in each product generation, ensuring the material completely fills the cavities around the bumps to optimize solder flow and eliminate voids.

SK Hynix executed a pivotal transition from NCF-TCB to a reflow molded-underfill process during production of its HBM2E generation.

This new thermally conductive molding material, developed in collaboration with material suppliers, is applied with a proprietary injection molding technique, and the reflow soldering process yields lower transistor junction temperatures.

In HBM, DRAM stacks are mounted on buffer chips. As companies strive to implement additional logic atop the substrate to lower power consumption, the capabilities of these buffer chips continue to evolve. Each chip is picked and placed onto a carrier wafer, solder is reflowed, and the finished stack undergoes molding, backside grinding, cleaning, and die cutting. TSMC and SK Hynix have announced a collaboration under which the foundry will eventually supply base chips to the memory maker.

The industry holds a keen interest in logically integrated memory solutions, as noted by Sutira Kabeer, Director of Research at Synopsys.

He remarked that while integrated logic was explored in earlier research, it remains a viable consideration. Yet each approach presents challenges related to power consumption and heat dissipation, with thermal stresses directly driving mechanical stresses beyond the assembly level alone. Furthermore, hybrid bonding or advanced fine-pitch bonding may have differing repercussions for mechanical stress due to thermal considerations.

Thermal energy emitted from underlying logic chips can create thermal-mechanical stress at the interface between the logic chip and the first DRAM die. Because HBM modules sit physically close to processors, the heat produced by logic chips inevitably transfers to the memory. SK Hynix’s Senior Technical Manager, Yoon Soosoo, cited data indicating that a mere 2°C rise in main chip temperature can elevate HBM temperatures by an additional 5°C to 10°C.

NCF TCB processes bring challenges of their own.

The high-temperature, high-pressure thermal compression bonding process can trigger issues during 2.5D assembly, such as metal bridging between bumps and underlying nickel pads, or interface delamination. Moreover, TCB is a low-throughput process that may hinder production efficiency.

For any multi-chip stack, warpage is closely tied to mismatches in the coefficients of thermal expansion (CTE), which generate stress during temperature fluctuations in both processing and use. Stress often concentrates at critical junctions, namely between the base chip and the first layer of memory chips and within the microbump levels. Product models and simulations help tackle such issues, although the full scope of these challenges may remain undetected until the finished product emerges.
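To give a sense of the magnitudes involved, a first-order estimate of CTE-mismatch stress can be sketched as below. All material values are illustrative assumptions, and the formula ignores the compliance of bumps and underfill that relaxes stress in a real assembly, so it is only an upper-bound sanity check, not a substitute for the simulations mentioned above:

```python
# First-order thermally induced stress from a CTE mismatch:
# sigma ~ E * delta_alpha * delta_T (rigid-bond assumption, so an upper bound).
def thermal_stress_mpa(e_gpa: float, delta_cte_ppm: float, delta_t: float) -> float:
    """Stress in MPa for modulus E (GPa), CTE mismatch (ppm/K), temperature swing (K)."""
    return e_gpa * 1e3 * delta_cte_ppm * 1e-6 * delta_t

# Illustrative values: a silicon die (E ~ 130 GPa) bonded to an organic substrate
# with ~14 ppm/K CTE mismatch, cooling from a ~250 C reflow peak to 25 C:
print(round(thermal_stress_mpa(130, 14, 225), 1))  # hundreds of MPa at the interface
```

Even with generous assumptions, the estimate lands in the hundreds of MPa, which is why stress concentrates so visibly at the base-chip and microbump interfaces.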

Artificial intelligence applications rely on the effective assembly and packaging of multiple DRAM chips, TSVs, and potentially base logic chips containing driver circuitry, alongside upwards of 100 decoupling capacitors.
