MiniMax M2.5
MiniMax • 456B MoE • Q5_K_M
28.4
tokens/sec
Best all-rounder for agentic tasks. Fast despite size. Our default for Polly and Scout.
agentsgeneralcoding
Specifications
- Parameters
- 456B MoE
- Active Parameters
- 45B
- Quantization
- Q5_K_M
- VRAM Used
- 171GB
- Context Length
- 32,768 tokens
Performance
- Tokens/Second
- 28.4
- Time to First Token
- 1.2s
Benchmark Scores
coding 8.5/10
reasoning 8.8/10
creative 8.2/10
agents 9/10
context 8.7/10
Quick Setup
Ollama
ollama run minimax-m2.5 LM Studio
Search 'MiniMax M2.5' in Discover llama.cpp
llama-server -m MiniMax-M2.5-Q5_K_M.gguf -ngl 99