by Junru Lu, Jiarui Qin, Lingfeng Qiao et al.
Youtu-LLM is a 1.96B-parameter beast that punches way above its weight class.
Think of it as a budget smartphone with flagship performance:
- Ultra-lightweight (1.96B params)
- STEM-specialized training
- Native agentic superpowers (planning, reasoning, reflection)
- Long-context handling like a champ
Dense Multi-Latent Attention → smarter attention mechanism
STEM vocabulary → talks math, physics, coding fluently
Progressive curriculum → learns step-by-step like a grad student
Agentic mid-training → learns to plan, reflect, execute autonomously
Result? The authors report that it beats other sub-2B models across the board, especially on agentic tasks.
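
For anyone wondering why a latent attention design matters on small hardware, here is a rough sketch of the KV-cache arithmetic. It assumes Youtu-LLM's Dense Multi-Latent Attention follows the low-rank KV compression idea of Multi-head Latent Attention from DeepSeek-V2, which this post does not confirm. Every size below is a hypothetical placeholder, not the model's real configuration.

```python
def kv_cache_per_token_mha(n_layers: int, n_heads: int, head_dim: int) -> int:
    """Cached numbers per token for standard multi-head attention:
    one key vector and one value vector per head, per layer."""
    return n_layers * 2 * n_heads * head_dim


def kv_cache_per_token_latent(n_layers: int, latent_dim: int, rope_dim: int) -> int:
    """Cached numbers per token when each layer stores a single compressed
    latent vector plus a small positional (RoPE) key shared across heads."""
    return n_layers * (latent_dim + rope_dim)


# Hypothetical sizes for illustration only, NOT Youtu-LLM's actual config.
N_LAYERS, N_HEADS, HEAD_DIM = 32, 16, 128
LATENT_DIM, ROPE_DIM = 512, 64

mha = kv_cache_per_token_mha(N_LAYERS, N_HEADS, HEAD_DIM)
latent = kv_cache_per_token_latent(N_LAYERS, LATENT_DIM, ROPE_DIM)

print(f"standard MHA cache per token: {mha} values")
print(f"latent cache per token:       {latent} values")
print(f"reduction factor:             {mha / latent:.1f}x")
```

With these made-up sizes the per-token cache shrinks by roughly 7x, which is the kind of saving that makes long contexts more practical on small devices.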
- Perfect for edge deployment (RPi, mobile, low-cost servers)
- STEM focus = GATE/CS prep + competitive programming gold
- Agentic capabilities = real-world automation projects
- Open-source code = fork, tweak, deploy immediately
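
To put the edge-deployment point in numbers, here is a back-of-the-envelope estimate of weight memory at common precisions. It uses only the 1.96B parameter count from this post. The precision table is a generic assumption and says nothing about which formats the released weights actually ship in. Activations and KV cache are not counted, so real usage will be higher.

```python
PARAMS = 1.96e9  # parameter count stated in the post

# Generic storage sizes per parameter (assumption: plain dense storage,
# ignoring quantization metadata such as scales and zero points).
BYTES_PER_PARAM = {"fp32": 4.0, "fp16/bf16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gib(num_params: float, bytes_per_param: float) -> float:
    """Memory needed for the weights alone, in GiB (2**30 bytes)."""
    return num_params * bytes_per_param / 2**30


for name, nbytes in BYTES_PER_PARAM.items():
    print(f"{name:>9}: {weight_memory_gib(PARAMS, nbytes):.2f} GiB")
```

At half precision the weights come to about 3.65 GiB, and a 4-bit quantization would bring that under 1 GiB, which is why a model this size is a plausible fit for an 8 GB Raspberry Pi or a phone.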
Clone & Run (Docker-ready):
git clone https://github.com/TencentCloudADP/youtu-tip
cd youtu-tip
pip install -r requirements.txt
python run_agentic_demo.py
Key Features in Repo:
Pre-trained 1.96B model weights
Agentic planning examples
Long-context reasoning demos
STEM benchmark scripts
Docker deployment configs
New SOTA for sub-2B LLMs
Beats larger models in agentic tasks
Competitive on general benchmarks
Runs fast on consumer GPUs
GitHub Repo → https://github.com/TencentCloudADP/youtu-tip
Perfect for:
- GATE prep automation bots
- Competitive coding assistants
- Edge AI projects
- Lightweight RAG/agents
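
If you want to prototype one of these agents before touching the real model, a plan/execute/reflect loop can be sketched in a few lines. This is a minimal illustration of the general pattern, not code from the repo. `run_agent`, `stub_model`, and the `PLAN:` / `EXECUTE:` / `REFLECT:` prompt prefixes are all hypothetical names made up for this example, and the stub stands in for an actual LLM call so the sketch runs offline.

```python
from typing import Callable, List


def run_agent(task: str, model: Callable[[str], str], max_rounds: int = 3) -> List[str]:
    """Plan, execute each planned step, then reflect.
    Stops early once the reflection reply starts with DONE."""
    transcript: List[str] = []
    for _ in range(max_rounds):
        plan = model(f"PLAN: {task}")
        transcript.append(f"plan: {plan}")
        # Steps are assumed to be separated by semicolons in the plan text.
        for step in plan.split(";"):
            result = model(f"EXECUTE: {step.strip()}")
            transcript.append(f"result: {result}")
        verdict = model(f"REFLECT: {task} | " + " | ".join(transcript))
        transcript.append(f"reflect: {verdict}")
        if verdict.strip().upper().startswith("DONE"):
            break
    return transcript


def stub_model(prompt: str) -> str:
    """Hypothetical stand-in for a real LLM call."""
    if prompt.startswith("PLAN:"):
        return "read input; compute answer"
    if prompt.startswith("EXECUTE:"):
        return "ok (" + prompt[len("EXECUTE: "):] + ")"
    return "DONE"


log = run_agent("sum two numbers", stub_model)
for line in log:
    print(line)
```

Swapping `stub_model` for a function that calls your deployed model keeps the loop unchanged, and `max_rounds` caps how many times the agent may re-plan.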
"Big brains don't need big GPUs anymore"
- Nirmaan ML Forum, Jan 4, 2026