Hardware Guide
What can you run on your Mac? Here's what fits at each memory tier.
Our Test Machine
- Model
- Mac Studio M2 Ultra
- Memory
- 512GB Unified
- Chip
- M2 Ultra (24-core CPU, 76-core GPU)
- OS
- macOS 15.4
What Fits Your Mac?
32GB Unified Memory
Entry LevelMacBook Air M3, MacBook Pro base configs. Good for smaller models.
Recommended Models
- Qwen2.5 Coder 14B Q8 — 16GB, best coding option at this tier
- Gemma 4 31B Q4 — 20GB, squeezes in with some headroom
- Llama 3.1 8B — Lightweight, fast, general purpose
64GB Unified Memory
Sweet SpotMacBook Pro upgraded, Mac Mini M4 Pro. The productivity sweet spot.
Recommended Models
- Llama 3.3 70B Q4 — 42GB, excellent all-rounder
- Qwen3 Coder Next Q5 — 58GB, top coding model
- DeepSeek Coder V2 — Strong reasoning + code
128GB Unified Memory
Pro TierMac Studio M2 Max, MacBook Pro maxed. Run most models comfortably.
Recommended Models
- Llama 3.3 70B Q6 — Higher quality quant
- Mixtral 8x22B — MoE architecture, fast inference
- Multiple 30B models — Run 2-3 models concurrently
192GB–512GB Unified Memory
Ultra TierMac Studio M2 Ultra / M4 Ultra. Run anything, including massive MoE models.
What Opens Up
- MiniMax M2.5 — 171GB, 456B MoE, API-quality local
- DeepSeek V3 — Massive reasoning model
- 3+ models hot — Route between specialized models
- Full precision 70B — No quantization needed
Performance Tips
🎯 Leave Headroom
Don't use 100% of your RAM for the model. Leave 10-20% for system and context. If you have 64GB, target models under 50GB loaded.
⚡ GPU Layers
Always use -ngl 99 or equivalent to offload
all layers to GPU. Apple Silicon's unified memory makes this seamless.
📊 Quantization Trade-offs
Q4 saves memory but loses quality. Q6/Q8 is closer to full precision. For coding, higher quants often matter more than for chat.
🔄 Hot Swapping
With enough RAM, keep multiple models loaded and route between them. No cold start penalty — instant switching.
Buying Advice
If you're buying a Mac for local LLMs, memory is the most important spec. You can't upgrade it later.
- • Casual use: 32GB works, but you'll hit limits quickly
- • Serious work: 64GB minimum, 128GB recommended
- • Production/multi-model: 192GB+ (Mac Studio territory)
The difference between a frustrating experience and a great one is often just RAM.