Nvidia’s Risk
Nvidia's overly concentrated revenue is at risk. Could a custom ASIC business help?
Generative AI is driving up demand for AI hardware, and Nvidia is capturing the lion’s share of sales. Consequently, Nvidia’s market capitalization has risen over 500% in just 16 months.
Yet Nvidia is not without risk. Their data center division, powered by sales of AI GPUs for training and inference, represented 80% of Q3 FY24 revenue. A significant majority of these sales come from a handful of large companies.
Nvidia carries notable revenue concentration risk.
Most investors overlook this risk, adhering to the belief that Nvidia is invincible. After all, Nvidia is a three-headed hydra; challengers must compete on GPU performance, software (CUDA), and networking/systems (InfiniBand, DGX).
These arguments hold weight, especially when comparing AMD with Nvidia. How can AMD’s ROCm catch up to the comprehensive CUDA ecosystem that Nvidia has been cultivating since 2007? And even though AMD’s MI300X performance is fantastic, is it enough to overcome the switching costs of leaving Nvidia and CUDA?
Clearly, GPU competitors like AMD will struggle to catch up to Nvidia’s lead.
But it’s not AMD that Nvidia should be worried about.
Recall the fundamental lesson we learned studying Bitcoin hardware: for a given task, ASICs offer better performance with lower energy use than general-purpose processors. This realization transformed the market from GPUs to Bitcoin ASICs seemingly overnight.
Nvidia GPUs, despite the addition of tensor cores and high bandwidth memory, are general-purpose and thus burdened with unnecessary bloat. As AI ASICs become broadly available, we’ll see Nvidia customers migrate their inference and training workloads away from GPUs. Nvidia’s largest clients are poised to benefit the most from this transition, as their operational scale magnifies the cost and performance savings of custom silicon.
This is what Nvidia should worry about. Nvidia derives a considerable portion of their revenue from a small number of clients purchasing AI GPUs, which are inferior to AI ASICs. Worse, their best customers have the strongest motivation to transition to these ASICs. Additionally, the emergence of custom AI chips promises to erode Nvidia’s AI GPU margins.
Innovator’s dilemma?
Signs of this shift away from GPUs are already on the horizon.
Several of Nvidia’s largest data center clients are moving beyond GPUs by developing their own AI chips. Notable projects include Microsoft's Maia, Meta's MTIA, Google's TPU, and Amazon's Trainium and Inferentia.
Additionally, we see startups including Groq, Tenstorrent, Cerebras, and Etched bringing custom AI silicon to life.
The CUDA moat won’t prevent this disruption.
A gut reaction to AI ASICs is to reach for the CUDA trump card: how will AI silicon companies convince developers to switch away from CUDA? But AI silicon doesn’t need to compete with the vast CUDA ecosystem.
Customers using H100s for inference aren’t utilizing the full breadth and depth of the CUDA ecosystem, but only the slice of CUDA needed for inference. AI silicon competitors, therefore, don't need to replicate or integrate with the broad ecosystem of CUDA libraries but can concentrate solely on enabling inference. This is a much smaller surface area to compete on.
One such approach is to enable libraries like TensorRT-LLM to compile down to the AI ASIC, permitting customers to lift and shift their existing workloads.
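To make the “smaller surface area” point concrete, here is a minimal sketch of what an inference serving path often looks like in practice. It assumes a PyTorch-style workload; the model, the serve helper, and the device name are hypothetical stand-ins, not Nvidia’s or any vendor’s actual API.

```python
# Hypothetical sketch: typical inference code targets a framework-level API,
# not CUDA directly, so the hardware-specific surface area is one device choice.
import torch
import torch.nn as nn

# Stand-in for a deployed model; a real workload would load trained weights.
model = nn.Sequential(nn.Linear(4096, 4096), nn.GELU(), nn.Linear(4096, 4096))
model.eval()

# The only hardware-specific line in this serving path. An ASIC vendor that
# ships a PyTorch backend plugin could expose its own device name here.
device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)

@torch.inference_mode()
def serve(batch: torch.Tensor) -> torch.Tensor:
    """Run one inference step; nothing in this function calls CUDA APIs."""
    return model(batch.to(device))

print(serve(torch.randn(8, 4096)).shape)  # torch.Size([8, 4096])
```

If the serving path looks like this, retargeting it is largely a matter of the ASIC vendor supplying a backend and compiler for that one device choice, rather than replicating the full CUDA library ecosystem.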
Nvidia sees disruption on the horizon and is taking action.
Nvidia understands their predicament and is countering with their own custom silicon business.
As Reuters reported last week:
Nvidia is building a new business unit focused on designing bespoke chips for cloud computing firms and others, including advanced artificial intelligence (AI) processors, nine sources familiar with its plans told Reuters.
The dominant global designer and supplier of AI chips aims to capture a portion of an exploding market for custom AI chips and shield itself from the growing number of companies pursuing alternatives to its products.
The Santa Clara, California-based company controls about 80% of the high-end AI chip market, a position that has sent its stock market value up 40% so far this year to $1.73 trillion after it more than tripled in 2023.
Nvidia's customers, which include ChatGPT creator OpenAI, Microsoft, Alphabet, and Meta, have raced to snap up the dwindling supply of its chips to compete in the fast-emerging generative AI sector.
Its H100 and A100 chips serve as a generalized, all-purpose AI processor for many of those major customers. But the tech companies have started to develop their own internal chips for specific needs. Doing so helps reduce energy consumption, and potentially can shrink the cost and time to design.
Recognizing the threat of disruption, Nvidia’s entry into the custom silicon market is a strategic effort to retain its most significant customers. However, ASIC design is a service business, which is undoubtedly different from a merchant silicon business.
Adding a service business raises many questions.
Does Nvidia have the customer relationship management skills necessary for a service-based business?
Is their finance leadership capable of guiding a new division with a distinct business model and different investment and revenue timing?
Is Wall Street capable of appreciating the service business and its ramifications despite the possibility of lower margins?
How will the market view Nvidia’s custom ASICs eroding their AI GPU pricing power?
Do Nvidia’s customers actually want custom chips from Nvidia? Many of these companies are probably looking to decrease their reliance on a lone AI hardware supplier, especially one that has considerable pricing power and a notable backlog of orders. Why should these companies stay with Nvidia rather than developing their own chips?
It will be fun to watch Nvidia’s custom silicon business take shape. I’m guessing they won’t actually design new chips, but instead simply modify their existing AI GPUs or strip out the general-purpose bloat, for example by removing FP64 cores or adding more high-bandwidth memory. Though this approach is easiest to execute and aligns with current incentives, it doesn’t shield Nvidia from the competitive risks posed by custom AI silicon.
Nvidia has time to figure things out, thanks in part to their access to leading-edge foundry capacity. Even if their biggest clients or startups design better AI chips, it’s unclear whether competitors can procure the leading-edge foundry capacity needed to build significant quantities of these chips quickly.
We’re left wondering: can Nvidia’s custom silicon business unit fend off the risk of disruption posed by AI silicon? I’m not so sure.
If you enjoyed this post, do me a favor and share it with a friend!