Meta unveils in-house AI chips to cut reliance on Nvidia hardware
Meta spent an estimated $10 billion on Nvidia hardware in 2024 alone, a figure reflected in the company's own capital expenditure disclosures. That number explains exactly why Meta announced it is now building its own AI chips. Depending on a single supplier for the hardware that runs your entire AI operation is a cost problem, a supply chain problem, and a strategic vulnerability all at once.
The chips Meta unveiled are designed specifically for running large language models and the recommendation systems that power Facebook, Instagram, and WhatsApp at scale. They are not general-purpose processors. Meta's engineers built them around the exact workloads the company runs billions of times per day, which means the chips can skip functionality that Nvidia's H100 and H200 GPUs carry for customers with different needs.
What Meta's custom chips are actually designed to do
Meta has two distinct chip programs running in parallel. The first is an inference chip called MTIA, short for Meta Training and Inference Accelerator. The second generation of MTIA was detailed publicly in 2024 and targets the recommendation model workloads that consume the largest share of Meta's compute budget. These are not glamorous AI tasks, but they are expensive ones. Every time a user scrolls a feed, Meta's systems run hundreds of ranking decisions in milliseconds.
The second program covers training chips, which is where Nvidia has historically had the strongest grip. Training large language models like Llama requires sustained, high-bandwidth matrix computation over days or weeks. Meta has been less specific about the timeline and specs for its training silicon, but the direction is clear: reduce the number of Nvidia GPUs needed per training run without sacrificing model quality.
Why this strategy makes financial sense for Meta
Nvidia's H100 GPU costs between $25,000 and $40,000 per unit depending on the configuration and whether it is purchased directly or through a cloud provider. Meta buys these in clusters of tens of thousands. A custom chip that delivers equivalent performance on Meta's specific workloads at a lower unit cost pays for the chip design investment relatively quickly at that volume.
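The break-even arithmetic can be sketched in a few lines. All figures below are illustrative assumptions for the sake of the calculation, not numbers Meta or Nvidia have disclosed; only the H100 price range comes from the reporting above.

```python
# Hypothetical break-even sketch. Every figure here is an assumption
# chosen for illustration, except the GPU price, which sits in the
# $25,000-$40,000 range cited for the H100.
GPU_UNIT_COST = 30_000            # assumed mid-range H100 price, USD
CUSTOM_UNIT_COST = 18_000         # assumed all-in cost per custom chip, USD
CHIPS_PER_YEAR = 100_000          # assumed annual deployment volume
DESIGN_INVESTMENT = 500_000_000   # assumed one-time design/tooling cost, USD

savings_per_chip = GPU_UNIT_COST - CUSTOM_UNIT_COST
annual_savings = savings_per_chip * CHIPS_PER_YEAR
breakeven_years = DESIGN_INVESTMENT / annual_savings

print(f"Annual savings: ${annual_savings:,}")            # $1,200,000,000
print(f"Break-even on design: {breakeven_years:.2f} yr") # 0.42 yr
```

Under these assumed numbers, a one-time design investment of half a billion dollars is recovered in under half a year, which is the "pays for itself relatively quickly at that volume" point in plain arithmetic.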
Google went through this same calculation with its Tensor Processing Units starting around 2016. By 2023, Google's TPUs were handling the majority of its internal AI workloads, and the company estimated the custom silicon had saved it billions in hardware costs over that period. Apple made a similar move with its M-series chips, taking control of silicon design to optimize performance per watt for its specific software stack. Meta is following the same logic, just later than its peers.
There is also a supply chain argument. During the GPU shortage of 2023, companies without allocation commitments from Nvidia faced multi-quarter delays on hardware orders. Meta had enough purchasing power to secure supply, but smaller teams within Meta's infrastructure organization still faced bottlenecks. Owning the chip design means Meta can contract directly with fabricators like TSMC and manage its own production schedule.
How this affects Nvidia's business
Meta joining the custom silicon club does not immediately threaten Nvidia's revenue. The transition from buying GPUs to running on proprietary chips takes years, and Meta will continue purchasing Nvidia hardware for workloads its custom chips cannot handle yet, particularly large training runs. Nvidia's data center revenue hit $47.5 billion in fiscal year 2024, and Meta accounts for a portion of that. Even a partial shift would be meaningful to Nvidia's numbers at that scale.
The longer-term concern for Nvidia is pattern repetition. Google, Amazon, Microsoft, and now Meta are all building custom AI silicon. Each one reduces the addressable market for Nvidia at the high-volume end of the customer base. Nvidia still dominates AI chip sales broadly, but the hyperscalers that once drove its growth are systematically reducing their dependence on external GPU suppliers.
Nvidia's response has been to move up the stack, offering software frameworks like CUDA and NIM that make it difficult to replace Nvidia hardware even when alternatives exist. Meta's internal chip program will need to build equivalent software tooling for its engineers, which is a non-trivial investment on top of the chip design itself.
The production and deployment timeline
Meta has not published a firm date for when its custom AI chips will carry a majority of its inference workload. The second-generation MTIA chip was in production deployment across Meta's data centers by late 2024 for recommendation tasks. The company has indicated it expects the third generation to be significantly more capable and to take on a broader range of AI jobs, with deployment targeted for 2026.
Meta's chip design teams are based primarily in Sunnyvale and Tel Aviv, drawing heavily from acquisitions and hires out of Qualcomm, Intel, and Apple's silicon groups. The Tel Aviv team in particular has expertise in neural network accelerator design that dates back to the acquisition of several Israeli AI hardware startups between 2019 and 2022.
Meta's capital expenditure guidance for 2025 sits between $60 billion and $65 billion, a substantial portion of which goes to AI infrastructure. Even if custom chips reduce the per-unit cost of compute over time, the near-term spending on chip development, fabrication, and data center integration means total hardware costs will remain high through at least 2026 before any savings materialize at scale.