Why Chinese AI Models Should Worry Nvidia, Micron Stock Investors

June 29th, 2026 by Trefis Team

Chinese models are quietly challenging the $600B+ AI infrastructure supercycle. Markets have glossed over it, but they probably shouldn’t.

The AI infrastructure buildout has been one of the most aggressive capital deployment cycles in tech history. Microsoft (MSFT), Google (GOOG), Amazon (AMZN), and Meta (META) are collectively spending over $600 billion on AI infrastructure in 2026 alone, with Nvidia projecting at least $1 trillion in cumulative demand for its Blackwell and next-generation Rubin GPU systems through 2027. The beneficiaries have been obvious: Nvidia (NVDA) has dominated with its H100, H200, and now Blackwell chips; Broadcom (AVGO) has become the go-to for custom AI silicon; and Micron (MU) has seen HBM memory demand surge as AI models require increasingly large memory bandwidth. The narrative has been clean and compelling: more compute equals better models equals a competitive moat, and whoever assembles the most AI firepower wins.

The Quiet Disruption

Over the past week or so, both the New York Times and the Wall Street Journal have run major pieces on a wave of Chinese AI models that are challenging this assumption at its root. The WSJ reported that Zhipu AI’s new model, GLM-5.2, released under its Z.ai brand, has matched Anthropic’s powerful Mythos model in certain cybersecurity benchmarks, specifically in finding software security bugs.^[1]

Cybersecurity is arguably one of the most critical national security domains in the AI race, given that the ability to autonomously find and exploit software vulnerabilities at scale is effectively a cyberwarfare capability. Security researchers at Semgrep found GLM-5.2 bested Anthropic’s Claude Opus 4.8 in some tests. GLM-5.2 has already ranked among the 10 most-used AI models globally, according to OpenRouter. Six of the models currently on that leaderboard were developed in China. Markets largely ignored these stories. That may be a mistake.

Efficiency As A Structural Advantage

The cost gap is not marginal. GLM-5.2 costs approximately one-eighth as much as Anthropic’s Claude Opus 4.8 for certain workloads, according to OpenRouter. That difference is not simply a pricing decision. It also reflects advances in model architecture. Chinese AI labs, led by DeepSeek’s breakthrough in early 2025, have demonstrated how architectural innovations such as Mixture of Experts (MoE), Multi-Head Latent Attention (MLA), and FP8 mixed-precision computing can dramatically improve AI efficiency. Rather than activating every parameter for every query, MoE models route each task to only the subset of “experts” needed to answer it. MLA reduces the memory footprint of long-context inference, while FP8 lowers HBM memory and compute requirements with minimal loss in accuracy. Together, these techniques reduce the amount of computation and memory bandwidth required per token, lowering inference costs.
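A rough back-of-envelope sketch shows why these techniques compound. Every figure below (parameter counts, expert counts, bytes per parameter) is a hypothetical illustration chosen for round numbers, not a specification of GLM-5.2, DeepSeek, or any other model, and the function names are ours. It also ignores the attention cache that MLA targets, and it assumes a single query at a time; in batched serving, different tokens hit different experts, so the savings in memory traffic are smaller than this simple picture suggests.

```python
# Back-of-envelope sketch: weight bytes read from memory per generated token.
# Every number here is a hypothetical illustration, not a measured model spec.

BYTES_PER_PARAM = {"bf16": 2, "fp8": 1}  # storage width of each number format


def active_params(shared_params, expert_params, num_experts, experts_per_token):
    """Parameters actually used for one token in a Mixture-of-Experts model.

    shared_params: parameters every token passes through (e.g. attention)
    expert_params: total parameters summed across all experts
    """
    per_expert = expert_params / num_experts
    return shared_params + per_expert * experts_per_token


def weight_bytes_per_token(params, precision):
    """Bytes of weights that must be read to process one token."""
    return params * BYTES_PER_PARAM[precision]


# Hypothetical dense model: all 400B parameters active, stored in BF16.
dense = weight_bytes_per_token(400e9, "bf16")

# Hypothetical MoE model of the same total size: 40B shared parameters plus
# 360B spread over 64 experts, 4 experts routed per token, stored in FP8.
moe_active = active_params(40e9, 360e9, num_experts=64, experts_per_token=4)
moe = weight_bytes_per_token(moe_active, "fp8")

print(f"active parameters per token: {moe_active / 1e9:.1f}B of 400B")
print(f"weight traffic vs dense BF16: {moe / dense:.1%}")
```

Under these assumed inputs, only 62.5B of the 400B parameters are touched per token, and halving the bytes per parameter brings weight traffic to roughly 8% of the dense BF16 case. The exact ratio depends entirely on the assumptions; the point is that sparsity and lower precision multiply rather than add.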

DeepSeek’s breakthrough last year showed that highly capable frontier models could be trained at a fraction of previously assumed costs, although the exact figures remain debated. The broader point is clear: algorithmic improvements are beginning to substitute for brute-force compute.

The Risk To Earnings, Not Just Multiples

The AI trade has largely been built on one assumption: the only way to build better AI is to throw exponentially more compute at the problem. Chinese labs are beginning to demonstrate a different path, particularly in inference, where better algorithms can partially substitute for more hardware.

Bulls could argue that Nvidia at roughly 15x next year’s earnings and Micron at around 9x already price in meaningful risk. That framing misses the point. The issue is not the multiple applied to current earnings; it is whether those earnings are themselves sustainable. Nvidia’s data center revenue has scaled to levels that assume compute remains the binding constraint in AI development. Micron’s HBM pricing power rests on insatiable memory demand from dense model architectures. If MoE and MLA adoption continues to compress per-query memory requirements, and if open-weight Chinese models become the default inference layer for cost-conscious enterprises globally, the volume assumptions underpinning these elevated earnings become fragile. Distribution is not a bottleneck: these models are open-weight and already available on Microsoft Azure and Amazon Web Services.

U.S. companies will adapt, and U.S. labs are already absorbing Chinese efficiency techniques. Nvidia will likely remain dominant in training, where raw compute still matters enormously. But the inference market, which Jensen Huang himself called the dominant AI workload of 2026, is where the pressure will be felt first and most acutely.
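To make the multiple-versus-earnings point concrete, here is a minimal sketch of how a forward multiple changes when the earnings estimate behind it is cut. The earnings-per-share figure and the size of the cuts are hypothetical inputs for illustration, not forecasts for Nvidia or Micron; only the 15x starting multiple comes from the discussion above.

```python
# Sketch: a low forward multiple offers less protection if the earnings
# estimate itself is cut. All inputs are hypothetical, not forecasts.


def implied_price(forward_eps, multiple):
    """Share price implied by a forward price-to-earnings multiple."""
    return forward_eps * multiple


def multiple_on_revised_eps(price, forward_eps, eps_cut):
    """P/E that the current price represents if forward EPS falls by eps_cut.

    eps_cut is a fraction between 0 and 1 (0.20 means a 20% cut).
    """
    return price / (forward_eps * (1 - eps_cut))


eps = 10.0                      # hypothetical forward EPS, in dollars
price = implied_price(eps, 15)  # stock trading at 15x forward earnings

for cut in (0.10, 0.20, 0.30):
    revised = multiple_on_revised_eps(price, eps, cut)
    print(f"EPS cut {cut:.0%}: today's price is {revised:.2f}x revised earnings")
```

The arithmetic is simple: a stock at 15x earnings that turn out 20% too high is really trading at 15 / 0.8 = 18.75x. A multiple that looks cheap on today's estimates can look ordinary once the estimates move.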

There are, of course, caveats. Some Chinese labs accumulated substantial Nvidia GPU inventories before export controls tightened, while others have reportedly accessed compute through overseas data centers. There is also evidence that some Chinese developers have relied on model distillation, a technique in which a new model learns from the outputs of an existing one. Anthropic, for example, has accused Alibaba of using thousands of fraudulent accounts to access its models. None of this, however, takes away from the fact that these models are becoming both more capable and more compute-efficient. Distillation alone cannot create a frontier model, and access to GPUs is no substitute for architectural innovation. The algorithmic advances are real, and investors would be mistaken to dismiss them.

The Investment Implications

Historically, the warning signs for a capex supercycle come not from demand collapsing but from the underlying efficiency math quietly shifting. Chinese AI labs, forced to operate under chip export controls, have optimized their way into a cost structure that challenges one of the core assumptions underpinning the AI infrastructure buildout. The $600B+ spending wave is not going to zero. Agentic AI demand is real, and inference at scale still requires serious hardware. The question for investors is no longer whether AI demand will continue growing. It almost certainly will. The question is whether that growth will require as much compute and memory as today’s infrastructure winners are priced to deliver.

If the economics of AI infrastructure are changing, investors may need to rethink not just individual stocks but portfolio construction itself. Knowing which names to hold, and how much, is exactly the work the Trefis methodology does for you. The Trefis High Quality (HQ) Portfolio weighs the full picture of quality across thousands of names rather than any single opportunity, owns the 30 strongest, and sizes and re-balances them with rules. It has a track record of outpacing a benchmark that combines the three major indices – the S&P 500, S&P Mid-cap, and Russell 2000.

Notes:

[1] China Has Matched Anthropic in Cybersecurity, Resetting AI Race, WSJ
