China’s Z.AI Releases GLM-5.2: A Model That Rivals Claude Opus, Using Zero Nvidia Chips

In brief

GLM-5.2 trails Claude Opus 4.8 by just 1% on FrontierSWE, a benchmark measuring multi-hour autonomous engineering projects, while beating GPT-5.5 on the same test. It ships under an MIT license with zero regional restrictions.
The model was built entirely on Huawei Ascend chips with no Nvidia hardware involved.
Unsloth AI has already released 2-bit GGUF quantizations that shrink the model from 1.51TB to 238GB. You'll still need 256GB of RAM or VRAM, but at that point, you can run it.

Z.ai dropped GLM-5.2 on June 16, promising top-tier performance that beats its already strong GLM-5.1.

The Beijing-based lab, which has been on the U.S. Entity List since January 2025, appears to be benefiting from growing concerns over America’s approach to AI. Over the past week, the ban on Anthropic Fable and the release of this new model have helped drive Z.ai’s stock up 90%, sending it to a new all-time high.

GLM-5.2 has the numbers to back up the hype.

On FrontierSWE, a benchmark that evaluates whether an AI agent can complete open-ended technical projects measured in hours (covering systems optimization, large-scale code construction, and applied ML research, scored by dominance rate), GLM-5.2 hit 74.4 against Claude Opus 4.8’s 75.1. It edged out GPT-5.5 at 72.6. On SWE-bench Pro, which tests autonomous resolution of real-world GitHub issues scored as a pass rate, GLM-5.2 scored 62.1 to GPT-5.5’s 58.6 and cleared its predecessor GLM-5.1’s 58.4 by a wide margin.
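To make those margins concrete, here is a minimal sketch that turns the scores quoted above into point gaps and relative gaps. The helper name `gap` is ours, not part of any benchmark tooling, and the relative figure is simply the point gap divided by the baseline score.

```python
# Scores as quoted in the article.
FRONTIER_SWE = {"GLM-5.2": 74.4, "Claude Opus 4.8": 75.1, "GPT-5.5": 72.6}
SWE_BENCH_PRO = {"GLM-5.2": 62.1, "GPT-5.5": 58.6, "GLM-5.1": 58.4}


def gap(score, baseline):
    """Return (point gap, relative gap in percent) of score versus baseline."""
    points = score - baseline
    return points, 100 * points / baseline


# GLM-5.2 versus each rival on FrontierSWE.
vs_opus = gap(FRONTIER_SWE["GLM-5.2"], FRONTIER_SWE["Claude Opus 4.8"])
vs_gpt = gap(FRONTIER_SWE["GLM-5.2"], FRONTIER_SWE["GPT-5.5"])

# GLM-5.2 versus its predecessor on SWE-bench Pro.
vs_prev = gap(SWE_BENCH_PRO["GLM-5.2"], SWE_BENCH_PRO["GLM-5.1"])

for label, (points, rel) in [
    ("FrontierSWE vs Opus 4.8", vs_opus),
    ("FrontierSWE vs GPT-5.5", vs_gpt),
    ("SWE-bench Pro vs GLM-5.1", vs_prev),
]:
    print(f"{label}: {points:+.1f} points ({rel:+.2f}%)")
```

On these numbers, the deficit to Opus 4.8 is 0.7 points, or a little under 1% in relative terms, which is where the "1%" headline figure comes from.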

The quality jump makes it the best open-source model to date in the Artificial Analysis Intelligence Index, which aggregates the results of nine different evaluations to assess the general quality of an AI model. OpenRouter’s benchmarks put it in the same class as the now-banned Claude Fable 5.

The hardware used to achieve this feat is another interesting part of the story. GLM-5.2 was trained on Huawei Ascend chips, with no Nvidia anywhere in the pipeline. Emad Mostaque, founder of Stability AI, estimated total training costs at around $25 million, 80% of that in post-training, which would make it extremely cheap compared with its peers.

As Decrypt reported earlier this year, Z.ai was already training image models on Huawei’s Ascend Atlas servers without a single American chip. GLM-5.2 takes that infrastructure further: a 744-billion-parameter mixture-of-experts model with a true 1 million-token context window, five times the 200K limit on GLM-5.1, and an MIT license that means no government directive can flip the access switch.

Tokens are the chunks of text a model can read and generate, while parameters are the internal settings and values that determine how a model processes information and generates responses.

Who it’s for and what it costs

For developers, the context window is the operational shift. Whole-repo navigation, multi-file refactors, and long agentic pipelines that previously required chunking become single-call workflows. API pricing runs $1.40 per million input tokens and $4.40 per million output tokens, against Claude Opus 4.8’s $5 input and $25 output. The Coding Plan starts at around $18 a month and works directly inside Claude Code, Cline, Kilo Code, and most popular agentic environments.
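As a rough illustration of what that price gap means per call, the sketch below prices a single long-context request at the rates quoted above. The workload (800,000 input tokens and 20,000 output tokens, roughly a whole-repo prompt with a modest patch in response) is a hypothetical of ours, and the calculation ignores caching discounts or tiered pricing that either provider may offer.

```python
# USD per million tokens, as quoted in the article.
PRICES = {
    "GLM-5.2": {"input": 1.40, "output": 4.40},
    "Claude Opus 4.8": {"input": 5.00, "output": 25.00},
}


def call_cost(model, input_tokens, output_tokens):
    """Cost in USD of one API call at flat per-token list prices."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


# Hypothetical workload: one whole-repo prompt, one modest response.
INPUT_TOKENS = 800_000
OUTPUT_TOKENS = 20_000

glm_cost = call_cost("GLM-5.2", INPUT_TOKENS, OUTPUT_TOKENS)
opus_cost = call_cost("Claude Opus 4.8", INPUT_TOKENS, OUTPUT_TOKENS)

print(f"GLM-5.2:         ${glm_cost:.2f}")
print(f"Claude Opus 4.8: ${opus_cost:.2f}")
print(f"Ratio:           {opus_cost / glm_cost:.1f}x")
```

For this input-heavy workload, the call comes to about $1.21 on GLM-5.2 versus $4.50 on Opus 4.8. Output-heavy jobs would widen the ratio, since the output price gap is larger than the input price gap.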

Local deployment is also technically possible. Unsloth AI pushed 2-bit GGUF quantizations that compress the model from 1.51TB down to 238GB while retaining ~82% accuracy.

Don’t get too excited, though. That still means it demands 256GB of unified memory or an equivalent RAM/VRAM combo: a maxed M4 Ultra Mac Studio, or a workstation with a mid-range GPU and 256GB of system RAM with mixture-of-experts offloading. It’s still a lot of money, but at least it’s something you can buy and run in your own home if you really want to.
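The sizes quoted above can be sanity-checked with some back-of-the-envelope arithmetic. This sketch assumes decimal units (1TB = 10^12 bytes) and treats the whole file as weights, ignoring metadata; both are simplifications of ours.

```python
PARAMS = 744e9        # parameter count quoted in the article
FULL_BYTES = 1.51e12  # 1.51TB original weights (decimal units assumed)
QUANT_BYTES = 238e9   # 238GB for the 2-bit GGUF
HOST_BYTES = 256e9    # 256GB of RAM/VRAM on the target machine


def bits_per_param(size_bytes, params=PARAMS):
    """Average storage per parameter, in bits."""
    return size_bytes * 8 / params


full_bits = bits_per_param(FULL_BYTES)
quant_bits = bits_per_param(QUANT_BYTES)
compression = FULL_BYTES / QUANT_BYTES
headroom_gb = (HOST_BYTES - QUANT_BYTES) / 1e9

print(f"Original:  {full_bits:.1f} bits per parameter")
print(f"Quantized: {quant_bits:.2f} bits per parameter")
print(f"Compression: {compression:.1f}x")
print(f"Headroom on a 256GB machine: {headroom_gb:.0f}GB")
```

The original works out to roughly 16 bits per parameter and the quantized file to about 2.6, so the "2-bit" label is an approximation. An average above 2 would be consistent with some tensors being kept at higher precision, though the article does not say so. The remaining headroom of about 18GB has to cover the context cache and everything else running on the machine, which is why 256GB is a floor rather than a comfortable target.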

We ran a quick test, asking GLM-5.2 to build our standard game mixing typing mechanics with a shooter. The UI wasn’t the prettiest (other models generated more polished-looking interfaces), but the experience was the most varied: different scenarios across waves, enemy types that shifted, bosses appearing later in the run.

It generated more diverse game states than anything else we tested for the same task in a zero-shot setup.

If you want to play it, it’s live on our Itch.io profile.

That variance points toward where GLM-5.2 makes the most economic sense. For multi-shot generation workflows and agentic pipelines where output diversity matters more than polish, the math at open-source pricing levels is hard to argue with. For the hardest sustained tasks, such as SWE-Marathon, where it scores 13.0 against Opus 4.8’s 26.0, the gap to the closed frontier is still real, and 13 points wide.

Open-source weights are live on HuggingFace under the MIT license. The quantized weights are also available on HuggingFace. GLM Coding Plan subscribers can switch now with the model string GLM-5.2, and it’s also available for free testing on Z.ai with some usage limits.