Chip Giants Unite to Exploit Nvidia Vulnerabilities
In the ever-evolving world of semiconductors, one name stands out for his significant contributions: Jim Keller. Keller, who now serves as CEO of Tenstorrent, recently offered an intriguing insight in an interview that has caught the attention of the tech community: Nvidia has not fully addressed many market needs, which opens the door for Tenstorrent and other innovative AI processor companies to seize the opportunity. A legend in chip design, Keller has held influential roles at industry giants such as AMD, Intel, Apple, and Tesla, giving him a unique perspective on market dynamics.
Keller's career is nothing short of remarkable. He spearheaded the development of the K7 and K8 architectures at AMD, which were foundational for the Athlon-series processors of the late 1990s and early 2000s. Later, between 2008 and 2012, he played a crucial role at Apple, leading the development of the A4 and A5 chips that powered a range of Apple devices. A second stint at AMD followed, marked by the K12 Arm initiative and the Zen architecture. From 2016 to 2018 he worked on Tesla's FSD autonomous-driving chips, and he then took part in undisclosed projects at Intel until 2020.
At Tenstorrent, Keller is steering the development of affordable AI processors, positioning them as an alternative to Nvidia's pricey GPUs, which can cost upwards of $20,000 to $30,000 per unit. According to Tenstorrent, its Galaxy system delivers three times the efficiency of Nvidia's DGX while being 33% cheaper. While high-performance AI processors remain a significant part of Tenstorrent's mission, the company is primarily focused on market gaps that Nvidia has overlooked, especially in the growing sector of edge computing.
The rise of edge computing is pivotal in shaping the future of AI.
With the relentless growth of data and increasing demands for real-time processing and security, traditional data centers are falling short of market and customer expectations. Businesses are compelled to seek software and hardware solutions that improve operational efficiency while reducing costs. Edge-to-cloud solutions that run AI workloads at the edge of the network, closer to where data is generated, are crucial for applications that require near-real-time performance. This shift allows data and algorithms to be processed locally, minimizing the delays associated with shipping workloads to distant clouds or data centers.
The advancements in 5G and the proliferation of the Internet of Things (IoT) also broaden the applicability of AI chips in edge computing scenarios. In particular, they are vital for applications like autonomous vehicles and smart cities, which require real-time AI inference at the endpoint.
In response, numerous manufacturers have unveiled AI chips specifically tailored for edge inference.
In the manufacturing sector, locally running AI models can respond rapidly to data from sensors and cameras to execute critical tasks. For instance, automotive manufacturers use computer vision to inspect assembly lines, identifying potential defects before vehicles leave the factory. In such applications, the demand for extremely low latency and continuous uptime makes it impractical to shuttle data back and forth across the network. Even minimal delays can hurt product quality, while low-power devices often struggle with large AI workloads, such as training computer vision systems. A holistic solution that combines the strengths of edge and cloud not only improves scalability and processing power but also tightens the integration of data and analytics to reduce latency.
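To make the latency argument concrete, here is a minimal sketch of an edge inspection loop. The camera frames, the `defect_score` model, the rejection threshold, and the 50 ms per-frame budget are all hypothetical stand-ins rather than details from the article; the point is simply that inference happens on the device, with no network round trip in the critical path.

```python
import time

LATENCY_BUDGET_S = 0.050  # hypothetical per-frame budget for an inline inspection station

def defect_score(frame):
    """Placeholder for a locally deployed vision model (e.g. a small quantized CNN)."""
    return sum(frame) / len(frame)  # stand-in computation

def inspect(frames, threshold=0.8):
    rejected = []
    for i, frame in enumerate(frames):
        start = time.perf_counter()
        score = defect_score(frame)           # runs on the edge device: no cloud round trip
        elapsed = time.perf_counter() - start
        if elapsed > LATENCY_BUDGET_S:
            print(f"frame {i}: inference took {elapsed * 1e3:.1f} ms, over budget")
        if score > threshold:
            rejected.append(i)                # flag the part before it leaves the line
    return rejected

if __name__ == "__main__":
    fake_frames = [[0.1, 0.2, 0.3], [0.9, 0.95, 0.85]]  # toy data in place of camera frames
    print("rejected frames:", inspect(fake_frames))
```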
Take, for example, low-power Arduino devices: many are priced below $100 and can run machine learning models across deployments ranging from a handful of devices to thousands. A case in point is an agriculture company that uses Arduino-based solutions to maximize crop yields, with sensors feeding edge devices data on soil moisture and weather conditions to determine how much water the crops need. This technology helps farmers avoid overwatering while reducing the operating costs of electric pumps.
Another illustration is a manufacturer that relies on precision lathes and uses sensors attached to Arduino devices to detect anomalies, such as subtle vibrations that can indicate impending equipment failure. Regular, condition-based maintenance is significantly more cost-effective than dealing with unexpected failures that halt production.
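As a rough illustration of the kind of logic such a device might run, the sketch below flags vibration readings that drift far from a rolling baseline. The sampling window, the 4-sigma threshold, and the synthetic data are illustrative assumptions, not details from the article, and a real Arduino deployment would typically be written in C/C++ rather than Python.

```python
from collections import deque
import statistics

WINDOW = 200           # number of recent samples kept as the baseline (assumed)
THRESHOLD_SIGMA = 4.0  # how many standard deviations counts as an anomaly (assumed)

def monitor(samples):
    """Yield the indices of vibration samples that look anomalous against a rolling baseline."""
    window = deque(maxlen=WINDOW)
    for i, value in enumerate(samples):
        if len(window) >= 20:  # wait for a minimal baseline before judging
            mean = statistics.fmean(window)
            stdev = statistics.pstdev(window) or 1e-9
            if abs(value - mean) > THRESHOLD_SIGMA * stdev:
                yield i        # candidate sign of bearing wear or imbalance
        window.append(value)

if __name__ == "__main__":
    import random
    readings = [random.gauss(0.0, 0.05) for _ in range(1000)]
    readings[700] += 1.5       # inject a synthetic spike to trigger the detector
    print("anomalous sample indices:", list(monitor(readings)))
```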
These applications underscore the immense value of edge computing and highlight a growing demand for intelligent control solutions. The market for such applications is expanding rapidly, yet companies like Nvidia, whose high-performance chips focus on cloud computing and data-center AI servers, have paid little attention to the edge AI market. This gap creates opportunities for AI chip companies like Tenstorrent.
As various players enter the space, competition within the AI chip market intensifies.
In recent years, the number of AI chip startups worldwide has surged, exceeding 80 by 2019 with total funding surpassing $3.5 billion. Predictions suggest that by 2025, ASICs will account for 43% of the AI chip market, GPUs 29%, FPGAs 19%, and CPUs just 9%.
Among the burgeoning startups, Tenstorrent exemplifies a fresh approach; Cerebras Systems has developed the WSE (Wafer Scale Engine), the largest chip ever built, boasting an astounding 1.2 trillion transistors to bring unprecedented scale to AI computations.
Another prominent company, Groq, founded by former Google engineers, is laser-focused on creating energy-efficient processors designed for AI inference.
Regarding Tenstorrent's technologies and products, the emphasis on low power makes them well suited to edge AI applications. According to Nikkei, Tenstorrent is slated to announce its second-generation AI processor by the end of 2024, although the name remains undisclosed. The roadmap the company released in fall 2023 indicates plans to introduce the Black Hole standalone AI processor and the Quasar low-power, cost-effective chip.
Years ago, while serving as Tenstorrent's CTO, Keller identified the potential of the low-power RISC-V architecture, leading his team to develop the Ascalon CPU. The upcoming Black Hole AI chip is reportedly based on SiFive's X280 RISC-V core design, emphasizing efficiency at lower cost. The next-generation processor is expected to achieve higher efficiency largely because it avoids high-bandwidth memory (HBM), opting instead for GDDR6, which better suits entry-level AI processors designed for inference.
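For a rough sense of the trade-off, the sketch below compares the aggregate bandwidth of a hypothetical 256-bit GDDR6 interface with a single HBM2e stack. The per-pin rate, bus width, and per-stack figure are illustrative ballpark values, not specifications of Black Hole or any shipping part.

```python
# Back-of-the-envelope memory bandwidth comparison (illustrative figures only).

GDDR6_GBPS_PER_PIN = 16          # assumed per-pin data rate in Gb/s
GDDR6_BUS_WIDTH_BITS = 256       # assumed total interface width for an entry-level inference chip
HBM2E_STACK_GBYTES_PER_S = 460   # rough per-stack bandwidth often quoted for HBM2e

gddr6_gbytes_per_s = GDDR6_GBPS_PER_PIN * GDDR6_BUS_WIDTH_BITS / 8
print(f"GDDR6 (256-bit @ 16 Gb/s/pin): ~{gddr6_gbytes_per_s:.0f} GB/s")
print(f"One HBM2e stack:               ~{HBM2E_STACK_GBYTES_PER_S} GB/s")
# GDDR6 offers less bandwidth per package but is far cheaper and simpler to integrate,
# which is the trade-off the article attributes to entry-level inference parts.
```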
Even though Tenstorrent has yet to capture a significant share of the AI processor market, its cost-effective, scalable solutions can serve diverse application demands that Nvidia has yet to address. The trend is clear: new entrants in the AI chip industry, including Tenstorrent, are focusing on opportunities that the "green team" (a nod to Nvidia) has neglected.
Innovation in AI chips continues apace, not only enhancing computational capability but also optimizing architecture, power consumption, and integration. Advanced packaging technologies allow multiple AI chips to be tightly integrated, significantly boosting system bandwidth and energy efficiency. AI-specific memory technologies, including HBM and compressed memory, are expected to see broader adoption.
Challenging Nvidia's ecosystem is equally crucial.
Beyond chip technology itself, establishing a robust AI ecosystem matters just as much. Nvidia's CUDA platform has evolved over the years into an extensive developer community rich in software resources, a vital element of its competitive advantage.
In response, competitors are rushing to build ecosystems around their own AI chips to attract developers. Google pairs its TensorFlow deep-learning framework with its TPUs, while AMD's acquisition of Xilinx strengthens its position. Intel has introduced the oneAPI development toolkit, an effort to unify programming interfaces across CPUs, GPUs, and AI accelerators.
Moreover, companies including Arm, Intel, Qualcomm, and Samsung have collaborated to establish the Unified Acceleration Foundation (UXL), one of whose objectives is to replace Nvidia's proprietary solutions. Within AI systems, chip-to-chip interconnect technology is crucial, since data-transmission bandwidth plays a key role in how much of a system's performance can actually be exploited. Nvidia has been proactive in building its own ecosystem here, using its proprietary NVLink protocol for multiprocessor and network interconnects. In data-center networks, Nvidia relies on InfiniBand.
Keller, a staunch advocate of openness in technology, finds Nvidia's closed ecosystem unappealing. He argues that Nvidia should move from proprietary protocols like NVLink to open Ethernet standards. He further contends that in data-center networks Nvidia should switch from InfiniBand to Ethernet: although InfiniBand offers low latency and high bandwidth (up to 200Gb/s), Ethernet can reach 400Gb/s and even 800Gb/s.
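To put those link speeds in perspective, here is a small sketch that computes how long it would take to move a fixed amount of data at each rate. The 10 GB payload is an arbitrary illustrative figure, and the calculation ignores protocol overhead and latency entirely.

```python
# Naive best-case transfer times at different link rates (overhead and latency ignored).

PAYLOAD_GBYTES = 10                 # illustrative payload, e.g. a slice of model weights
LINK_RATES_GBPS = [200, 400, 800]   # InfiniBand and Ethernet speeds quoted in the article

payload_gbits = PAYLOAD_GBYTES * 8
for rate in LINK_RATES_GBPS:
    seconds = payload_gbits / rate
    print(f"{rate:>4} Gb/s link: ~{seconds * 1000:.0f} ms to move {PAYLOAD_GBYTES} GB")
```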
In fact, giants such as AMD, Broadcom, Intel, Meta, Microsoft, and Oracle are collaborating on the next generation of Ultra Ethernet, which is better suited for AI and high-performance computing applications.
The big question remains: will emerging Ethernet technologies develop sufficiently to rival Nvidia's interconnect technology?
In July 2023, several industry leaders formed the Ultra Ethernet Consortium, and a related group has since launched the Ultra Accelerator Link (UALink) initiative to compete with Nvidia's InfiniBand and NVLink.
AMD is contributing its broader Infinity Fabric shared-memory protocol and the GPU-specific xGMI to the UALink effort, and all other participants have agreed to use Infinity Fabric as the standard for accelerator interconnect. Sachin Katti, senior vice president and general manager of Intel's Network and Edge Group, notes that the UALink promoter group is considering using Ethernet as the first-layer transport alongside Infinity Fabric to link GPU memory into a shared space as large as those of CPU-based NUMA architectures.
Members of the UALink alliance believe that system manufacturers will build devices using UALink, allowing customers configuring their pods to mix accelerators from different vendors. This approach keeps server design flexible and aligns with the Open Accelerator Module (OAM) specification released by Meta Platforms and Microsoft, which standardizes accelerator sockets on system boards.
According to IDC, the deployment of 200Gb/s and 400Gb/s networks among hyperscale enterprises, cloud builders, HPC centers, and large corporations is significant, suggesting that both InfiniBand and Ethernet markets can coexist and expand.
Notably, Ethernet is pervasive across both edge devices and data centers, in stark contrast to InfiniBand, which is primarily utilized within data centers.
IDC indicates that sales of data center Ethernet switches grew by 7.2% year-over-year in Q3 2023.
Between the third quarter of 2022 and Q3 2023, the data center Ethernet switch market reached approximately $20 billion. Even if switching accounts for half of InfiniBand revenue, the Ethernet switch market still vastly eclipses InfiniBand's by a factor of around seven, and a growing number of AI clusters are transitioning to Ethernet, encroaching on InfiniBand's market share.
IDC reports that the non-data-center segment of the Ethernet switch market is growing even faster, up 22.2% in Q3 2023 and 36.5% cumulatively over the first three quarters of the year, as numerous companies upgrade their campus networks.
In Q3 2023, the combined market for Ethernet switches across data center, campus, and edge deployments reached $11.7 billion, up a notable 15.8% year-over-year. In contrast, the edge Ethernet router market declined by 9.4%, which is understandable as routers increasingly use merchant chips that integrate switching and routing functions.
At the data center level, revenue for 200Gb/s and 400Gb/s Ethernet switches rose by 44% year-over-year, with port shipments soaring by 63.9%. Additionally, sales of 100Gb/s Ethernet switches across the data center, edge, and campus sectors experienced growth of 6%.
In conclusion, while Nvidia has a clear advantage in cloud computing and data-center AI systems, the road is unforgiving for both established giants and emerging startups wishing to challenge its dominance. Notably, smaller companies that previously tried to enter this market by replicating Nvidia's GPU-led approach have struggled, with some teetering on the brink of bankruptcy.
Given how formidable the cloud computing and data-center battleground is, focusing on low-power, cost-effective edge applications emerges as a strategic avenue with substantial market demand and growth potential, particularly while few suitable chips are available.