Benchmark Results

All models were tested on a Mac Studio M2 Ultra with 512GB of unified memory. Last updated: 2026-04-04

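| Model (vendor) | Size | Quant | VRAM | tok/s | TTFT | Code | Reason | Creative | Agents | Context |
|---|---|---|---|---|---|---|---|---|---|---|
| MiniMax M2.5 (MiniMax) | 456B MoE | Q5_K_M | 171GB | 28.4 | 1.2s | 8.5 | 8.8 | 8.2 | 9 | 8.7 |
| Qwen3 Coder Next (Alibaba) | 32B | Q5_K_XL | 58GB | 42.1 | 0.8s | 9.2 | 8.4 | 7.5 | 8.6 | 8.3 |
| Gemma 4 31B (Google) | 31B | Q4_K_M | 20GB | 52.3 | 0.6s | 8.7 | 8.9 | 8.4 | 8.8 | 9.2 |
| Qwen2.5 Coder 14B (Alibaba) | 14B | Q8_0 | 16GB | 68.5 | 0.4s | 8.9 | 7.8 | 7.2 | 8 | 7.9 |
| Llama 3.3 70B (Meta) | 70B | Q4_K_M | 42GB | 24.2 | 1.8s | 8.3 | 8.6 | 8.8 | 8.2 | 8.5 |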

tok/s = tokens per second (higher is faster)

TTFT = time to first token in seconds (lower is better)
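
The page does not say how these two numbers were measured, so the following is only a minimal sketch of one common way to derive them from timestamps. It assumes that tok/s covers the decode phase only (everything after the first token), so prompt processing affects TTFT but not tok/s. The names `Throughput` and `summarize` are hypothetical and do not come from any particular benchmark tool.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Throughput:
    ttft_s: float      # time to first token, in seconds
    tok_per_s: float   # decode throughput, in tokens per second


def summarize(request_start: float, token_times: Sequence[float]) -> Throughput:
    """Derive TTFT and tok/s from a request start time and token arrival times.

    Assumption: tok/s is measured over the decode phase only, i.e. from the
    arrival of the first token to the arrival of the last one.
    """
    if not token_times:
        raise ValueError("no tokens were generated")
    ttft = token_times[0] - request_start
    decode_time = token_times[-1] - token_times[0]
    # With a single token there is no decode interval to measure.
    if decode_time > 0:
        rate = (len(token_times) - 1) / decode_time
    else:
        rate = float("nan")
    return Throughput(ttft_s=ttft, tok_per_s=rate)


# Request sent at t=0.0 s; five tokens arrive 0.25 s apart, starting at t=0.5 s.
example = summarize(0.0, [0.5, 0.75, 1.0, 1.25, 1.5])
print(example)  # Throughput(ttft_s=0.5, tok_per_s=4.0)
```

If a tool instead divides the total token count by the total wall-clock time, its tok/s figure will be lower for the same run, so the two conventions should not be compared directly.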

Scores out of 10. Green = 8.5+