While the Western AI conversation remains locked in debates over whether larger language models will eventually produce AGI, China is placing a very different bet. Beijing-based ShengShu Technology has closed a 2 billion yuan ($290 million) Series B round led by Alibaba Cloud to build what it calls a “general world model”: an AI system that processes vision, audio, and touch to simulate human-like perception of the physical world.
The round, announced April 10, also included participation from TAL Education and Baidu Ventures. It comes just two months after ShengShu raised 600 million yuan from Qiming Venture Partners. The startup declined to disclose its valuation, but the pace of fundraising — over $370 million in eight weeks — signals urgency and conviction.
Beyond Chatbots: Why World Models Matter
ShengShu’s thesis is straightforward: text-trained large language models have hit their limits. They can reason about language, but they can’t understand physical reality. A chatbot can describe what a kitchen looks like, but it can’t predict whether a dropped glass will shatter, or navigate a robot around a cluttered room.
“ShengShu believes that a general world model, built on multimodal data such as vision, audio, and touch, more naturally captures how the physical world works than large language models,” the three-year-old startup said in a statement.
Founder Zhu Jun, a Tsinghua University alumnus, described the goal as connecting “perception and action” — allowing AI systems to model and predict real-world behaviour consistently. This is the missing piece for robotics, autonomous driving, and any AI system that needs to interact with physical environments rather than just process text.
Vidu: From Video Generation to World Understanding
ShengShu isn’t starting from scratch. The company is best known for Vidu, an AI video generation tool that launched globally before OpenAI made its Sora tool widely available. Vidu Q3 Pro, released in January 2026, ranks among the top 10 AI models for generating video from text and images, according to Artificial Analysis.
Video generation might seem like a consumer product, but it’s a stepping stone. Generating realistic video requires understanding physics, motion, and spatial relationships — exactly the capabilities a world model needs. ShengShu is positioning Vidu’s underlying technology as the foundation for something far more ambitious: an AI that can simulate and predict real-world scenarios, not just create visual content.
Alibaba’s World Model Portfolio
Alibaba Cloud’s investment in ShengShu is not a one-off. The Chinese tech giant has been building a portfolio of world model investments:
- Tripo AI: $50 million round led by Alibaba and Baidu Ventures in March 2026; the company builds 3D model generation tools and is moving toward its own world model
- PixVerse: $60 million round led by Alibaba in September 2025; the company released an AI world model in early 2026 that lets users direct video in real time
- ShengShu: $290 million Series B, the largest single investment
Alibaba has also released open-source AI models for video generation and, in February 2026, launched a model specifically for powering robots. The company is building infrastructure across the entire world model stack — from video generation to 3D modelling to embodied AI.
The Geopolitical Angle
ShengShu’s raise highlights an emerging pattern in the global AGI race: the United States and China are pursuing fundamentally different technical approaches.
The dominant US labs — OpenAI, Anthropic, Google DeepMind — have invested heavily in scaling language models, with world models emerging as a secondary research direction. DeepMind’s Demis Hassabis has said 2026 is the “breakthrough year” for world models and continual learning, and Yann LeCun launched AMI Labs with a $1.03 billion seed specifically to build world models. But these are research initiatives layered on top of massive LLM infrastructure.
China, by contrast, is committing to world models earlier. With tighter constraints on compute for LLM training due to US chip export controls, Chinese AI companies may see world models as a way to leapfrog, building AGI capabilities that don’t depend on having the largest language model. Alibaba’s systematic investment across multiple startups suggests a coordinated strategy, not scattered bets.
As Wired co-founder Kevin Kelly wrote recently, replicating human intelligence requires three things from AI: reasoning, an understanding of the physical world, and continuous learning. LLMs have made progress on reasoning. World models address the second. The third remains unsolved. But if China can combine world models with its growing robotics industry, it may have a faster path to embodied AGI than the West’s text-first approach.
What It Means
ShengShu’s $290 million raise is not just another funding round. It’s a signal that the path to AGI is diverging:
- US approach: Scale language models → add world understanding → achieve AGI
- China approach: Build world models → add reasoning → achieve AGI
Neither path has reached its destination. But the investment patterns are clear: China is treating world models as a strategic priority, not a research side project. The roughly $400 million in world model rounds that Alibaba has led or co-led over about eight months (Tripo AI, PixVerse, and ShengShu) suggests a belief that embodied AI, meaning robots, autonomous systems, and physical-world interaction, will define the next phase of the AGI race.
ShengShu has strategic partnerships with companies developing embodied AI for industrial, commercial, and home settings. The funding will accelerate those partnerships and the development of the general world model itself.
The question isn’t whether world models matter. Hassabis and LeCun both say they do. The question is whether China’s early, coordinated bet on this approach gives it an advantage in the race to AGI, or whether the US labs’ head start in language reasoning proves harder to overcome than expected.
SOURCES
- CNBC — Alibaba leads $290m investment for Shengshu Vidu AI world model
- InForCapital — ShengShu Raises $293M for AGI Development
- Reuters — Chinese Startup Raises $293 Million to Tackle AGI with World Model
- TechBuzz — Alibaba Bets $290M on World Models as LLM Era Hits Limits
- Kevin Kelly, Wired co-founder — Substack analysis on world models and AGI