Benchmark Results
All models were tested on a Mac Studio M2 Ultra with 512GB of unified memory. Last updated: 2026-04-04
| Model | Vendor | Size | Quant | VRAM | tok/s | TTFT | Code | Reason | Creative | Agents | Context |
|---|---|---|---|---|---|---|---|---|---|---|---|
| MiniMax M2.5 | MiniMax | 456B MoE | Q5_K_M | 171GB | 28.4 | 1.2s | 8.5 | 8.8 | 8.2 | 9.0 | 8.7 |
| Qwen3 Coder Next | Alibaba | 32B | Q5_K_XL | 58GB | 42.1 | 0.8s | 9.2 | 8.4 | 7.5 | 8.6 | 8.3 |
| Gemma 4 31B | Google | 31B | Q4_K_M | 20GB | 52.3 | 0.6s | 8.7 | 8.9 | 8.4 | 8.8 | 9.2 |
| Qwen2.5 Coder 14B | Alibaba | 14B | Q8_0 | 16GB | 68.5 | 0.4s | 8.9 | 7.8 | 7.2 | 8.0 | 7.9 |
| Llama 3.3 70B | Meta | 70B | Q4_K_M | 42GB | 24.2 | 1.8s | 8.3 | 8.6 | 8.8 | 8.2 | 8.5 |
tok/s = tokens per second (higher is faster)
TTFT = time to first token in seconds (lower is better)
Scores are out of 10. In the rendered table, scores of 8.5 or higher are highlighted in green.
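As a rough illustration of the two speed metrics defined above, the sketch below derives TTFT and tok/s from a request start time and the arrival time of each generated token. The names `stream_metrics` and `StreamMetrics` are hypothetical, and the choice to measure throughput over the decode phase only (excluding the wait for the first token) is an assumption; the page does not say how its numbers were collected.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class StreamMetrics:
    ttft_s: float     # time to first token, in seconds (lower is better)
    tok_per_s: float  # tokens per second (higher is faster)


def stream_metrics(request_start: float, token_times: Sequence[float]) -> StreamMetrics:
    """Compute TTFT and tok/s from timestamps given in seconds.

    Assumption (not stated in the source): throughput counts only the
    tokens produced after the first one, divided by the time elapsed
    between the first and last token.
    """
    if len(token_times) < 2:
        raise ValueError("need at least two token timestamps")

    ttft = token_times[0] - request_start
    decode_time = token_times[-1] - token_times[0]
    if decode_time <= 0:
        raise ValueError("token timestamps must be increasing")

    tok_per_s = (len(token_times) - 1) / decode_time
    return StreamMetrics(ttft_s=ttft, tok_per_s=tok_per_s)
```

In practice the timestamps would come from a monotonic clock such as `time.perf_counter()`, recorded as each token is streamed back. Including the first-token wait in the throughput denominator is an equally common convention and would give lower tok/s figures.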