1. Alibaba’s Qwen3.7-Max Ran Autonomously for 35 Hours to Optimise Its Own Chip’s Code
May 21-23, 2026 | VentureBeat / The Decoder / Alibaba Cloud
Alibaba’s Qwen team released Qwen3.7-Max, a proprietary model that in a real-world test ran a fully autonomous kernel optimisation for 35 hours straight. The model compiled, measured, and revised code in loops, catching compilation errors and tracking down performance bottlenecks on its own.
- The result: 10x average speedup over the reference implementation on T-Head-ZW-M890 accelerators (Alibaba’s own AI chip platform)
- The autonomy: 432 kernel tests with 1,158 total tool calls — zero human intervention
- Comparison: GLM 5.1 hit 7.3x. Kimi K2.6 got 5x. DeepSeek V4 Pro managed 3.3x. Qwen3.6-Plus barely moved at 1.1x.
- KernelBench L3: Alibaba reports that Qwen3.7-Max produced faster kernels in 96% of cases, just behind Claude Opus 4.6
- Interop: Supports OpenAI- and Anthropic-compatible interfaces. Plugs into Claude Code and Qwen Code.
- NZ angle: Open-weight Qwen models remain viable for NZ organisations that self-host, but Qwen3.7-Max itself is proprietary and API-only, so it cannot be self-hosted.
Why it matters: An AI model that can autonomously optimise code for a chip architecture it’s never seen during training is a qualitative leap. The “AI that improves its own infrastructure” loop just got a real-world demonstration.
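The compile, measure, revise loop described above can be sketched in a few lines. This is a minimal illustration, not Alibaba's harness: the function names (`optimise`, `toy_revise`, `toy_compile`, `toy_measure`) and the toy kernel strings are hypothetical stand-ins, and in a real setup `revise` would be a call to the model while `compile_fn` and `measure` would invoke the actual toolchain and benchmark.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Attempt:
    source: str
    compiled: bool
    runtime_ms: Optional[float]
    feedback: str  # what the next revision gets to see


def optimise(
    reference_src: str,
    reference_ms: float,
    revise: Callable[[str, str], str],            # hypothetical: the model proposing an edit
    compile_fn: Callable[[str], Optional[str]],   # returns an error message, or None on success
    measure: Callable[[str], float],              # returns runtime in ms
    max_rounds: int = 10,
) -> Tuple[str, float, List[Attempt]]:
    """Keep the fastest candidate that compiles; feed errors and timings back."""
    best_src, best_ms = reference_src, reference_ms
    feedback = ""
    history: List[Attempt] = []
    for _ in range(max_rounds):
        candidate = revise(best_src, feedback)
        error = compile_fn(candidate)
        if error is not None:
            # Compilation failed: record it and let the next revision react.
            feedback = f"compile error: {error}"
            history.append(Attempt(candidate, False, None, feedback))
            continue
        ms = measure(candidate)
        if ms < best_ms:
            best_src, best_ms = candidate, ms
            feedback = f"improved to {ms:.2f} ms"
        else:
            feedback = f"no gain ({ms:.2f} ms)"
        history.append(Attempt(candidate, True, ms, feedback))
    # Speedup is reference runtime divided by best runtime.
    return best_src, reference_ms / best_ms, history


# Toy stand-ins so the sketch runs without a compiler or accelerator.
def toy_revise(src: str, feedback: str) -> str:
    if feedback.startswith("compile error"):
        return src + "+tile"  # back off to a safe edit after a failure
    return src + ("+bad" if src.count("+") == 1 else "+tile")


def toy_compile(src: str) -> Optional[str]:
    return "unknown intrinsic" if "bad" in src else None


def toy_measure(src: str) -> float:
    return 100.0 / (1 + src.count("+tile"))


best, speedup, history = optimise(
    "kernel", 100.0, toy_revise, toy_compile, toy_measure, max_rounds=4
)
print(best, speedup, sum(not a.compiled for a in history))
```

The speedup figures quoted in the bullets are ratios of this kind: reference runtime over optimised runtime, averaged across kernels.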
2. DeepSeek Makes 75% Discount Permanent
DeepSeek’s promotional 75% discount on V4 Pro is now permanent. The move signals that Chinese AI pricing isn’t a land grab — it’s sustainable at Chinese infrastructure cost structures. Western labs running on Nvidia hardware can’t match it without destroying margins.
- The pricing: V4 Pro at ~25% of GPT-5.5/Claude Opus 4 pricing, permanently
- The moat: Every month a developer stays on DeepSeek is a month they’re not building OpenAI into their stack
- NZ angle: Frontier AI at 75% off makes adoption viable for NZ SMEs — but data sovereignty requires self-hosting
3. AutoTTS: Claude Code Discovers Scaling Algorithms Humans Wouldn’t Design
Researchers from UMD, UVA, WUSTL, UNC, Google, and Meta used Claude Code to search for test-time scaling algorithms. The algorithm it discovered tracks confidence shifts across rounds and cuts token usage by ~70% versus self-consistency while maintaining accuracy. Total cost of the search: $40 and 160 minutes.
- The method: Claude Code autonomously searched the space of possible test-time compute scaling strategies
- The result: An algorithm that tracks confidence shifts and reduces token usage by ~70%
- The quote: The discovered logic is something “humans probably wouldn’t design by hand”
- The signal: AI isn’t just running algorithms — it’s discovering new ones
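To make the comparison with self-consistency concrete, here is a rough sketch of why watching confidence across rounds can save tokens. It is not the algorithm AutoTTS discovered, whose details are not given here; it is a simple hand-written early-stopping rule, and every name and threshold in it (`adaptive_vote`, `round_size`, `stable_delta`, `min_conf`, the toy sampler) is an assumption for illustration.

```python
from collections import Counter
from itertools import cycle
from typing import Callable, Tuple


def self_consistency(sample: Callable[[], str], n: int) -> Tuple[str, int]:
    """Baseline: always draw n answers, then take the majority vote."""
    votes = Counter(sample() for _ in range(n))
    return votes.most_common(1)[0][0], n


def adaptive_vote(
    sample: Callable[[], str],
    max_samples: int,
    round_size: int = 4,        # assumed: answers drawn per round
    stable_delta: float = 0.05,  # assumed: max confidence shift counted as "stable"
    min_conf: float = 0.6,       # assumed: minimum vote share before stopping
) -> Tuple[str, int]:
    """Sample in rounds and stop once the leading answer's vote share settles."""
    votes: Counter = Counter()
    used = 0
    prev_conf = None
    while used < max_samples:
        for _ in range(min(round_size, max_samples - used)):
            votes[sample()] += 1
            used += 1
        _, top_count = votes.most_common(1)[0]
        conf = top_count / used
        # Stop when confidence is high enough and barely moved since last round.
        if (
            prev_conf is not None
            and conf >= min_conf
            and abs(conf - prev_conf) <= stable_delta
        ):
            break
        prev_conf = conf
    return votes.most_common(1)[0][0], used


def make_sampler() -> Callable[[], str]:
    # Toy stand-in for a model call: three "42" answers, then one "41", repeating.
    answers = cycle(["42", "42", "42", "41"])
    return lambda: next(answers)


fixed_answer, fixed_used = self_consistency(make_sampler(), 24)
adaptive_answer, adaptive_used = adaptive_vote(make_sampler(), 24)
print(fixed_answer, fixed_used, adaptive_answer, adaptive_used)
```

On this toy sampler both strategies return the same answer, but the adaptive rule stops after two rounds instead of spending the full budget. Real savings depend entirely on how noisy the model's answers are.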
4. AI Washing Hits Absurd Levels
The Guardian reports that UK PR executives estimate they would rather not send ~50% of the AI-related pitches that land on their desks. Allbirds pivoted from shoes to AI GPUs. Standard Chartered’s CEO apologised for calling workers “lower-value human capital.” Other examples cited include AI-powered basketball hoops and predator-protection lasers.
- The signal-to-noise ratio: Half of AI pitches are unpitchable
- The real damage: When everything is “AI-powered,” genuine AI innovation gets harder to identify
- The Allbirds pivot: If a shoe company pivoting to GPUs doesn’t define peak AI washing, nothing does