AI Export Controls Fail Their First Real Test: GLM-5.2 Cybersecurity Benchmarks Expose the Gap
Publish Date: 2026-06-28 16:12:00
Source Domain: www.techtimes.com
Two independent security evaluations published this week delivered a verdict that Washington’s export control architects did not want to hear. Zhipu AI’s GLM-5.2, a Chinese open-weight model that launched June 13, one day after the US government banned Claude Fable 5 and Mythos 5 from global access, has matched or approached leading US AI models on the exact class of cybersecurity capability that justified the ban. The model is freely downloadable by anyone on Earth, and no export order can reach it.
The finding matters because the enforcement architecture behind the Fable 5 ban was designed for a different era of controlled technology. The Export Administration Regulations were built to track physical things: chips, weapons components, hardware with serial numbers, facilities subject to on-site inspection. The weight file for a 750-billion-parameter model hosted on Hugging Face has no serial number, no facility, and no provenance chain. The enforcement mechanism that made export controls effective for semiconductors cannot be applied to AI weights once they have been distributed. That gap is now empirically visible in the benchmark scores.
The Benchmark Findings That Changed the Argument
Semgrep, a security firm that evaluates AI models for vulnerability detection, ran GLM-5.2 against a set of open-source models on its IDOR detection benchmark — the same dataset and prompt it has used to evaluate frontier coding agents. An Insecure Direct Object Reference is an access control flaw in which a web application exposes an internal identifier — a user ID, a database key, a file name — without verifying whether the requesting user is authorized to access it. It has ranked among the most frequently exploited web security vulnerabilities for years and is harder to detect than typical code flaws because it requires recognizing a missing check rather than a dangerous function call.
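To make the "missing check" concrete, here is a minimal Python sketch of an IDOR and its fix. It is an illustration only, not drawn from Semgrep's benchmark: the Invoice record, the in-memory INVOICES store, and both function names are hypothetical. The two handlers differ by a single ownership comparison, and the vulnerable one calls no dangerous function that a pattern-based scanner could flag.

```python
from dataclasses import dataclass


@dataclass
class Invoice:
    invoice_id: int
    owner_id: int
    amount: float


# Hypothetical in-memory store standing in for a database table.
INVOICES = {
    1: Invoice(invoice_id=1, owner_id=100, amount=42.0),
    2: Invoice(invoice_id=2, owner_id=200, amount=99.0),
}


def get_invoice_vulnerable(current_user_id: int, invoice_id: int) -> Invoice:
    """IDOR: returns whatever record the client-supplied ID points at.

    current_user_id is never consulted, so any authenticated user can
    read any invoice by guessing or enumerating IDs.
    """
    return INVOICES[invoice_id]


def get_invoice_checked(current_user_id: int, invoice_id: int) -> Invoice:
    """Same lookup, plus the ownership check the vulnerable version lacks."""
    invoice = INVOICES[invoice_id]
    if invoice.owner_id != current_user_id:
        raise PermissionError("user is not authorized for this invoice")
    return invoice
```

In this sketch, user 100 requesting invoice 2 gets another user's record back from the first handler and a PermissionError from the second. A detector has to infer that the ownership comparison should exist, which is why the flaw is harder to catch than a call to an obviously unsafe function.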
GLM-5.2 scored a 39% F1 on IDOR detection, beating Claude Code models, which ranged from 28% to…