Google's TurboQuant AI compression tech triggers sell-off in memory chip stocks
Memory chip stocks dropped sharply after Bloomberg reported on Google's TurboQuant, a compression technology that reduces the amount of memory AI models need to run. Micron, SK Hynix, SanDisk, and Western Digital all saw their share prices fall as investors recalibrated how much hardware demand the AI build-out would actually generate if software could do more with less.
The reaction was fast. Micron fell several percentage points in a single session. SK Hynix, which had been riding high on demand for its HBM3E memory used in Nvidia's AI chips, also took a hit. The sell-off was concentrated specifically in companies whose revenue growth stories depend on AI continuing to consume ever-larger quantities of memory and storage.
What TurboQuant actually does
TurboQuant is a quantization-based compression method. Quantization in AI means reducing the numerical precision of a model's weights, typically from 32-bit or 16-bit floating-point numbers down to 8-bit integers or lower. The usual tradeoff is accuracy: the coarser the representation, the more the model's outputs drift. Google's claim for TurboQuant is that it can compress models more aggressively while holding that quality loss to a minimum.
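To make the idea concrete, here is the textbook quantization scheme that methods in this family build on. Google has not published TurboQuant's actual algorithm, so this is an illustrative sketch, not its implementation: each float32 weight is mapped to an 8-bit integer plus a shared scale factor, shrinking storage by 4x.

```python
import numpy as np

def quantize_int8(weights: np.ndarray):
    """Symmetric per-tensor quantization: float32 weights -> int8 + one scale.

    Illustrative only; production methods like GPTQ or AWQ use more
    sophisticated, accuracy-preserving variants of this idea.
    """
    scale = np.abs(weights).max() / 127.0            # one scale for the whole tensor
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Approximate reconstruction of the original weights."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.normal(size=1000).astype(np.float32)          # stand-in for a weight tensor
q, s = quantize_int8(w)
w_hat = dequantize(q, s)

print(w.nbytes, "->", q.nbytes)                       # 4000 -> 1000 bytes: 4x smaller
print(float(np.abs(w - w_hat).max()))                 # worst-case rounding error
```

The storage saving is exact (one byte per weight instead of four), while the reconstruction error is bounded by half the scale factor; the engineering difficulty, and presumably TurboQuant's contribution, is keeping that error from compounding across billions of weights.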
If a model that previously needed 80GB of GPU memory can be made to run in 40GB or less, that changes the hardware math considerably. It means fewer high-bandwidth memory chips per server, potentially smaller GPU clusters, and less spending on the memory components that companies like Micron and SK Hynix have been banking on.
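The 80GB-to-40GB figure falls straight out of the bytes-per-weight arithmetic. As a rough sketch (ignoring the KV cache and activation memory, which also matter in practice), a hypothetical 40-billion-parameter model occupies:

```python
def model_memory_gb(params_billion: float, bits_per_weight: int) -> float:
    """Approximate weight-storage footprint of a model.

    Counts weights only; real deployments also need memory for the
    KV cache, activations, and framework overhead.
    """
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9

for bits in (16, 8, 4):
    print(f"{bits:>2}-bit: {model_memory_gb(40, bits):.0f} GB")
# 16-bit: 80 GB
#  8-bit: 40 GB
#  4-bit: 20 GB
```

Halving the bits per weight halves the footprint, which is why a jump from 16-bit to 8-bit precision maps directly onto fewer HBM stacks per server.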
Why the market reacted so strongly
Memory chip stocks have been priced for a specific future: one where AI model sizes keep growing and each generation of hardware needs more memory than the last. SK Hynix reported that HBM shipments accounted for a growing share of its revenue through 2024, and Micron had forecast strong data center demand through 2025. Both companies tied much of their near-term growth directly to AI infrastructure spending.
TurboQuant introduces doubt about that trajectory. If Google is deploying this internally and it works as described, other hyperscalers will study it closely. Microsoft, Amazon, and Meta all run massive model inference workloads. Even a 20% reduction in memory requirements across those workloads would translate into billions of dollars less in chip purchases annually.
This is not the first time a software efficiency gain has rattled hardware suppliers. When DeepSeek released its R1 model in January 2025, claiming competitive performance at a fraction of the training cost, Nvidia's stock dropped roughly 17% in a single day, wiping out nearly $600 billion in market capitalization. The pattern is becoming familiar: a technical disclosure changes the perceived hardware intensity of AI, and the market reprices accordingly.
The quantization space is getting crowded
Google is not working in isolation here. Hugging Face, Meta, and several academic groups have been publishing quantization research for years. GPTQ, AWQ, and GGUF are already widely used formats that let developers run compressed versions of large language models on consumer-grade hardware. What TurboQuant appears to offer is a more systematic, production-grade version of what the open-source community has been doing experimentally.
The difference between an experimental technique and one deployed by Google at scale is significant. When Google uses something across its search infrastructure, Gmail, and Cloud AI services, the volume is large enough to move supply chain numbers. That is what investors are pricing in.
How Micron and SK Hynix are positioned
Micron's fiscal Q2 2025 results, reported in March 2025, showed data center revenue growing to roughly $4.8 billion, up significantly year-over-year. The company had guided for continued strength in HBM and data center DRAM through the rest of the fiscal year. That guidance now faces a harder question from analysts about whether software compression tools will soften demand in the second half of 2025.
SK Hynix is in a similar spot. It holds a leading position in HBM3E supply for Nvidia's H100 and H200 GPUs. Its customers are not going to immediately redesign their server configurations because of one compression announcement, but if TurboQuant is adopted broadly over the next 12 to 18 months, procurement decisions for 2026 data center expansions could start shifting.
Western Digital and SanDisk are more exposed on the storage side. AI training generates enormous volumes of data that must be stored, and any technique that shrinks model checkpoints or the data footprint around training could soften flash storage demand as well. The sell-off in those names reflected that secondary concern.
What comes next for chip investors
The immediate market reaction may have been an overreaction. Compression techniques reduce memory per model, but the number of models being trained and deployed keeps increasing. If TurboQuant lets a data center run twice as many model instances on the same hardware, the net effect on memory demand could be neutral or even positive. That counterargument will take months of actual deployment data to prove out.
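The counterargument is simple arithmetic. Under the assumption (hypothetical, for illustration) that compression halves memory per model while operators respond by doubling the number of deployed instances, total memory demand is unchanged:

```python
def fleet_memory_gb(gb_per_model: float, compression: float, instances: int) -> float:
    """Total memory a fleet of model instances needs.

    compression is the fraction of the original footprint that remains
    after quantization (1.0 = uncompressed, 0.5 = halved).
    """
    return gb_per_model * compression * instances

baseline = fleet_memory_gb(80, 1.0, 1000)      # 1,000 uncompressed instances
rebound = fleet_memory_gb(80, 0.5, 2000)       # compression halves each, fleet doubles
print(baseline, rebound)                       # identical total demand
```

Whether deployments actually grow fast enough to offset the per-model savings is the empirical question the next few quarters of hyperscaler orders will answer.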
Micron is scheduled to report fiscal Q3 2025 earnings in June 2025. That report will be the first real data point for whether order patterns from hyperscalers have shifted in response to efficiency tools like TurboQuant. Until then, the stock will trade on uncertainty, and that uncertainty is exactly what the sell-off was pricing in.