HY-MT1.5-1.8B: Tencent Translation Model Beats Google Translate by 15-65%

Tencent has released a 1.8B-parameter distillation of its WMT2025-winning translation model. After quantization it runs on smartphones in just 1GB of RAM and supports 33 languages, including underserved ones like Tibetan, Uyghur, Kazakh, and Mongolian.

Key Takeaways

  • WMT2025 Winner Lineage: Distilled from the 7B model that won 30 of 31 language pairs at WMT2025
  • Edge Deployable: 1GB RAM after quantization, 0.18s inference for 50 tokens
  • Outperforms Google: 15-65% improvement across WMT25 evaluation categories
  • Broad Language Support: 33 languages plus 5 Chinese dialects, including underserved languages

Technical Specifications

| Specification       | Value                        |
|---------------------|------------------------------|
| Parameters          | 1.8B                         |
| Tensor Type         | BF16                         |
| Context Length      | 2048 tokens                  |
| Pretraining Tokens  | 1.3T                         |
| RAM (Quantized)     | 1GB                          |
| Inference Speed     | 0.18s / 50 tokens            |
| Languages Supported | 33 + 5 Chinese dialects      |
| Teacher Model       | Hunyuan-MT-7B (WMT25 winner) |
| Distillation Method | On-policy distillation       |
| Release Date        | December 30, 2025            |
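
For orientation, here is a minimal inference sketch using Hugging Face transformers. The repository id and prompt template are assumptions made for illustration; consult the official model card for the exact identifiers and recommended prompt format.

```python
# Minimal translation sketch with Hugging Face transformers.
# NOTE: the repo id and prompt template below are assumptions for
# illustration -- consult the official model card for the real ones.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "tencent/HY-MT1.5-1.8B"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.bfloat16,  # matches the BF16 tensor type above
    device_map="auto",
)

def translate(text: str, target_lang: str) -> str:
    # Hypothetical prompt shape; the model card defines the real template.
    prompt = f"Translate the following text into {target_lang}:\n\n{text}\n"
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=256, do_sample=False)
    # Decode only the generated continuation, not the echoed prompt.
    new_tokens = output[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)

print(translate("机器翻译正在变得越来越好。", "English"))
```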

Benchmark Results

Hunyuan-MT demonstrates state-of-the-art performance across major translation benchmarks, with the 7B teacher model winning 30 of 31 language pairs at WMT2025.

| Benchmark     | Hunyuan-MT-7B | Google Translate | Gemini-2.5-Pro |
|---------------|---------------|------------------|----------------|
| WMT24pp EN-XX | 0.8585        | -                | 0.8250         |
| FLORES-200    | 0.8758        | -                | -              |
| WMT25 Ranking | 1st (30/31)   | Lower            | Lower          |

Note: The 1.8B distilled model achieves approximately 90% of Gemini-3.0-Pro performance on FLORES-200, making it the most capable edge-deployable translation model available.
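
Scores in the 0.85 range are characteristic of neural metrics such as COMET, though the article does not name the metric behind the table. As a sketch, scoring candidate translations with the open-source COMET toolkit (assuming the `unbabel-comet` package and the public `Unbabel/wmt22-comet-da` checkpoint, which may differ from whatever produced the numbers above) looks like this:

```python
# Sketch: scoring translations with a COMET-style neural metric.
# Assumes `pip install unbabel-comet`; the checkpoint is a common
# public model, not necessarily the one behind the table above.
from comet import download_model, load_from_checkpoint

model_path = download_model("Unbabel/wmt22-comet-da")
comet = load_from_checkpoint(model_path)

data = [
    {
        "src": "机器翻译正在变得越来越好。",
        "mt": "Machine translation is getting better and better.",
        "ref": "Machine translation keeps getting better.",
    }
]

# Returns per-segment scores plus a corpus-level system score.
result = comet.predict(data, batch_size=8, gpus=0)
print(result.system_score)
```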

Language Coverage

HY-MT1.5 supports 33 major world languages plus 5 Chinese dialects, with notable strength in underserved languages often neglected by commercial translation services.

Underserved Languages

Strong performance on languages with limited training data:

  • Tibetan
  • Uyghur
  • Kazakh
  • Mongolian

Chinese Dialects

Dedicated support for major Chinese regional varieties:

  • Cantonese
  • Shanghainese (Wu)
  • Hokkien (Min Nan)
  • Hakka
  • Teochew

Edge Deployment Capabilities

The defining feature of HY-MT1.5-1.8B is its ability to run entirely on-device, enabling private, offline translation without cloud dependencies.

  • 1GB RAM after quantization
  • 0.18s inference per 50 tokens
  • 2x faster than commercial APIs

Deployment Targets

  • Smartphones: iOS and Android devices with 2GB+ RAM (most devices from 2020 onward)
  • Embedded Systems: Raspberry Pi 4/5, Jetson Nano, and similar edge hardware
  • Offline Applications: Travel apps, fieldwork tools, privacy-focused translation
  • Browser Extensions: WebAssembly deployment for in-browser translation
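
The article does not specify the quantization scheme behind the 1GB figure. As a sketch, here is 4-bit loading with bitsandbytes through transformers, under the assumption that a 4-bit scheme is what gets the weights into that budget:

```python
# Sketch: loading the model with 4-bit weight quantization to approach
# the ~1GB footprint quoted above. The scheme Tencent actually used is
# not stated here; bitsandbytes NF4 is one common choice, used purely
# as an illustrative assumption.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

MODEL_ID = "tencent/HY-MT1.5-1.8B"  # assumed repo id, as before

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    quantization_config=bnb_config,
    device_map="auto",
)

# Rough check of the quantized weight footprint, in GB.
print(model.get_memory_footprint() / 1e9)
```

For phones themselves, a GGUF export run through llama.cpp or a similar on-device runtime is the more typical route; the bitsandbytes path above is just the quickest way to test quantized quality on a workstation.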

Competitive Analysis: HY-MT vs. Alternatives

vs. Google Translate

HY-MT Advantages

  • 15-65% higher quality across WMT25 categories
  • Fully offline operation (no API dependency)
  • Open weights for customization
  • No usage limits or API costs
  • Privacy: data never leaves device

Google Advantages

  • 100+ languages vs. 33
  • Established ecosystem integration
  • Real-time camera translation
  • Document translation features

vs. DeepL

HY-MT Advantages

  • Edge deployment (DeepL is cloud-only)
  • Open source (DeepL is proprietary)
  • Asian language strength (Chinese dialects)
  • Zero recurring costs

DeepL Advantages

  • Superior European language pairs
  • Document formatting preservation
  • Glossary and terminology tools
  • Enterprise API with SLAs

vs. Meta NLLB-200

HY-MT Advantages

  • Higher benchmark scores on major pairs
  • Smaller model size (1.8B vs. NLLB's 3.3B dense variant)
  • Faster inference speed
  • Better Chinese dialect coverage

NLLB Advantages

  • 200 languages vs. 33
  • Better low-resource African languages
  • Larger research community
  • More deployment tutorials available

Recommendations

Mobile/Offline Translation

HY-MT1.5-1.8B is the recommended choice for mobile and offline translation applications. The 1GB quantized footprint and 0.18s inference make it viable for real-time translation on modern smartphones without network connectivity.

Best for: Travel apps, field research tools, privacy-focused messaging, areas with limited connectivity.

High-Volume Asian Language Pairs

For applications requiring Chinese, Japanese, Korean, or Southeast Asian language translation at scale, HY-MT offers superior quality at zero marginal cost compared to commercial APIs.

Best for: E-commerce localization, content platforms, customer support automation.

When to Use Alternatives

Consider Google Translate or DeepL for: European language pairs (especially German/French), document translation with formatting, or when you need 100+ languages. Use Meta NLLB for low-resource African languages.

Key limitation: HY-MT supports 33 languages. If your use case requires broader coverage, NLLB-200 may be more appropriate despite lower benchmark scores.

How On-Policy Distillation Works

The 1.8B model is distilled from the 7B WMT25 winner using on-policy distillation, a technique that preserves more of the teacher model's capabilities than traditional knowledge distillation.

  1. Teacher Generation: The 7B model generates translation samples for the training corpus, creating high-quality target distributions.
  2. On-Policy Sampling: The student model generates its own translations, which are then compared against teacher outputs rather than fixed targets.
  3. Distribution Matching: The student learns to match the full output distribution of the teacher, not just argmax predictions, preserving translation variety.
  4. Iterative Refinement: Multiple rounds of distillation with curriculum learning, starting from easier language pairs.
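
A minimal sketch of the core loop (steps 2-3) in PyTorch is shown below. It assumes the teacher and student share a tokenizer and vocabulary; the model ids, hyperparameters, and the reverse-KL formulation are illustrative assumptions, not Tencent's published recipe.

```python
# Sketch of one on-policy distillation step (steps 2-3 above).
# Repo ids, prompt handling, and the reverse-KL loss are assumptions
# for illustration; assumes teacher and student share a tokenizer.
import torch
import torch.nn.functional as F
from transformers import AutoModelForCausalLM, AutoTokenizer

TEACHER_ID = "tencent/Hunyuan-MT-7B"  # WMT25-winning teacher
STUDENT_ID = "tencent/HY-MT1.5-1.8B"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(STUDENT_ID)
teacher = AutoModelForCausalLM.from_pretrained(TEACHER_ID).eval()
student = AutoModelForCausalLM.from_pretrained(STUDENT_ID)
optimizer = torch.optim.AdamW(student.parameters(), lr=1e-5)

def distill_step(prompt: str) -> float:
    inputs = tokenizer(prompt, return_tensors="pt")
    prompt_len = inputs["input_ids"].shape[1]

    # Step 2: the *student* samples its own translation (on-policy).
    with torch.no_grad():
        sampled = student.generate(**inputs, max_new_tokens=128, do_sample=True)

    # Score the sampled sequence under both models. Logits at position
    # i predict token i+1, so these slices cover the generated tokens.
    student_logits = student(sampled).logits[:, prompt_len - 1 : -1]
    with torch.no_grad():
        teacher_logits = teacher(sampled).logits[:, prompt_len - 1 : -1]

    # Step 3: match distributions on the student's own tokens.
    # Reverse KL(student || teacher), averaged over generated positions.
    s_logp = F.log_softmax(student_logits, dim=-1)
    t_logp = F.log_softmax(teacher_logits, dim=-1)
    loss = (s_logp.exp() * (s_logp - t_logp)).sum(-1).mean()

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

The key difference from classical distillation is that the divergence is computed on sequences the student itself sampled, so the student is corrected on the mistakes it would actually make at inference time rather than on teacher-chosen references.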

Conclusion

Tencent's HY-MT1.5-1.8B represents a significant milestone in democratizing high-quality machine translation. By distilling their WMT2025-winning 7B model into an edge-deployable package, they have created a translation system that outperforms Google Translate on major benchmarks while running entirely on-device.

The model's particular strength in underserved languages (Tibetan, Uyghur, Kazakh, Mongolian) and Chinese dialects fills an important gap in the translation landscape. For developers building offline-capable translation applications, this is now the state-of-the-art choice.

The main limitation remains language coverage: 33 languages versus Google's 100+ or NLLB's 200. However, for the languages it does support, HY-MT1.5 delivers exceptional quality at unprecedented efficiency.