Tencent is expanding its presence in the open-source AI arena with a new lineup of Hunyuan AI models, engineered for both versatility and performance. The models, now available on Hugging Face, come in sizes ranging from 0.5B to 7B parameters, making them suitable for everything from edge devices to high-concurrency production environments.

This release builds on the foundation laid by Tencent’s larger Hunyuan-A13B, with the new models inheriting much of its architecture and training methodology to ensure consistency in performance across varying scales.

Designed for Flexibility, Powered by Performance
The Hunyuan models are instruction-tuned and pre-trained, supporting use cases that span chatbots, content generation, and document analysis. One standout feature is the models’ native 256K context window, enabling smooth handling of long documents and extended dialogues—ideal for legal, academic, or enterprise applications.
Tencent also emphasizes hybrid reasoning: users can switch between fast-response or deep-analysis modes depending on their specific requirements, adding a layer of adaptability rarely seen in open-source LLMs.
Leading the Pack in Agentic Capabilities
The models have been optimized for agent-based AI tasks, performing strongly in benchmarks such as:
- C3-Bench: Hunyuan-7B-Instruct scores 68.5
- BFCL-v3 and τ-Bench: consistent leadership performance
These results signal strong multi-step reasoning, making Hunyuan suitable for tasks involving planning, decision-making, or logic-heavy workflows.
High Efficiency Through Advanced Quantisation
To reduce the resource load for deployment, Tencent has introduced AngleSlim, a proprietary compression tool supporting two advanced quantisation methods:
- FP8 Static Quantisation: Uses minimal calibration data to scale model weights into an 8-bit floating-point format—ideal for improving inference speed without retraining.
- INT4 Quantisation: Uses GPTQ and AWQ algorithms for compact 4-bit weight formats, optimizing speed and memory while preserving accuracy. Pre-quantised models are available for instant use.
Benchmark Highlights
The raw performance of the Hunyuan models stands out across academic and practical domains:
| Model | Benchmark | Score |
|---|---|---|
| Hunyuan-7B | MMLU | 79.82 |
| GSM8K | 88.25 | |
| MATH | 74.85 | |
| Hunyuan-7B-Instruct | AIME 2024 | 81.1 |
| OlympiadBench | 76.5 | |
| Livecodebench | 42.0 | |
| DROP (B16 / FP8 / INT4) | 85.9 / 86.0 / 85.7 |
These figures confirm that the models maintain high reasoning and mathematical proficiency even when compressed for efficiency
🚀We're expanding the Tencent Hunyuan open-source LLM ecosystem with four compact models (0.5B, 1.8B, 4B, 7B)! Designed for low-power scenarios like consumer-grade GPUs, smart vehicles, smart home devices, mobile phones, and PCs, these models support cost-effective fine-tuning… pic.twitter.com/CknskVqPem
— Hunyuan (@TencentHunyuan) August 4, 2025
Seamless Integration into Modern AI Workflows
Tencent recommends deploying the models using TensorRT-LLM, vLLM, or SGLang, all of which support OpenAI-compatible APIs. This makes it easier for developers to plug Hunyuan into existing systems or develop new services without heavy infrastructure changes.