A Chinese chip startup just unveiled a TPU with 3x the performance of its predecessor, and it's built entirely without foreign IP

There’s a quiet but meaningful upgrade happening in China’s homegrown AI chip scene. Zhonghao Xinying, a domestic AI chip company, announced a new fully self-developed TPU called Xuyu on Tuesday, alongside its Taize 2.0 computing platform.

The Xuyu chip delivers 896 TFLOPS of mixed-precision floating-point performance, three times what its predecessor, Chana, could manage. For 8-bit inference workloads, it hits 1,792 TOPS, which puts it in contention for high-concurrency, token-heavy deployments. Memory bandwidth and inter-chip interconnect speeds have both been significantly upgraded, and the chip supports ultra-long context windows.

Power draw sits at 600W per card. That’s about half what conventional compute chips consume, according to the company, and positions the chip for lower-carbon data center builds.
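For a rough sense of what those per-card numbers imply, here is a minimal back-of-the-envelope sketch. It uses only the figures quoted above. The derived values (the implied Chana throughput and the TFLOPS-per-watt figure) are calculations from those figures, not specs published by the company, and the variable names are purely illustrative.

```python
# Per-card figures as quoted in the article.
MIXED_PRECISION_TFLOPS = 896   # Xuyu, mixed precision
INT8_TOPS = 1792               # Xuyu, 8-bit inference
CARD_POWER_W = 600             # per-card power draw
SPEEDUP_OVER_CHANA = 3         # "three times" the predecessor

# Implied predecessor throughput (derived here, not a published spec).
chana_tflops = MIXED_PRECISION_TFLOPS / SPEEDUP_OVER_CHANA

# Ratio of 8-bit peak throughput to mixed-precision peak throughput.
int8_ratio = INT8_TOPS / MIXED_PRECISION_TFLOPS

# Peak throughput per watt at the quoted power draw (derived here).
tflops_per_watt = MIXED_PRECISION_TFLOPS / CARD_POWER_W

print(f"Implied Chana throughput: {chana_tflops:.1f} TFLOPS")
print(f"INT8 / mixed-precision ratio: {int8_ratio:.1f}x")
print(f"Peak efficiency: {tflops_per_watt:.2f} TFLOPS/W")
```

These are peak, datasheet-style numbers; sustained throughput on real workloads would depend on memory bandwidth and interconnect figures the announcement does not quantify.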

What makes Xuyu stand out is its stack. The IP cores, instruction set, operator acceleration libraries, and system software are all developed in-house with zero dependency on foreign core technology. That matters for Chinese government, financial, and grid customers who face strict security and compliance requirements.

The Taize 2.0 platform is the smallest standard compute unit in Zhonghao Xinying’s lineup. It pairs dual high-performance CPUs with eight TPU processing units — physically laid out as one general-purpose CPU server connected to one TPU accelerator device. Total compute hits 7.168 petaflops (mixed precision), and power consumption for equivalent workloads is 80% of what a traditional GPU server would draw.
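The platform total follows directly from the per-chip figure: eight TPUs at 896 TFLOPS each. A small sketch of that arithmetic, plus what the 80% power claim would mean in practice, is below. The 10 kW GPU server baseline is a hypothetical number chosen only for illustration; the article does not state a baseline.

```python
# Figures quoted in the article.
TPUS_PER_NODE = 8
TFLOPS_PER_TPU = 896
RELATIVE_POWER = 0.80  # Taize 2.0 draw as a fraction of a GPU server's

# Aggregate mixed-precision compute for one Taize 2.0 unit.
total_tflops = TPUS_PER_NODE * TFLOPS_PER_TPU
total_pflops = total_tflops / 1000


def taize_power_kw(gpu_server_kw: float) -> float:
    """Estimate Taize 2.0 draw for work a GPU server does at gpu_server_kw."""
    return gpu_server_kw * RELATIVE_POWER


# HYPOTHETICAL baseline: a GPU server drawing 10 kW (not from the article).
baseline_kw = 10.0
estimated_kw = taize_power_kw(baseline_kw)

print(f"Total compute: {total_tflops} TFLOPS = {total_pflops} PFLOPS")
print(f"Estimated draw vs. a {baseline_kw} kW baseline: {estimated_kw:.1f} kW")
```

The first line of output reproduces the 7.168 petaflops quoted above, which confirms the headline figure is simply the sum of the eight chips' peak ratings.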

On the software side, Taize 2.0 supports all major AI frameworks natively. PyTorch, vLLM, and SGLang work out of the box. Training pipelines can tap DeepSpeed and Megatron-LM. The platform has already been deeply adapted for dozens of large language and multimodal models, including the Qwen series, DeepSeek, GLM, and MiniMax, so developers can migrate models without rebuilding from scratch.

The announcement signals that China’s domestic AI chip ecosystem has moved beyond mere survival and is building production-grade alternatives with competitive specs and full software stacks. Xuyu ships into a market that urgently needs compute capacity that doesn’t come with geopolitical strings attached.