Benchmarks Qwen2.5 Coder 14B

Qwen2.5 Coder 14B

Alibaba • 14B • Q8_0
68.5
tokens/sec

Fast and efficient for coding tasks. Fits on 32GB Macs easily.

codingautocompletesmall-context

Specifications

Parameters
14B
Quantization
Q8_0
VRAM Used
16GB
Context Length
32,768 tokens

Performance

Tokens/Second
68.5
Time to First Token
0.4s

Benchmark Scores

coding 8.9/10
reasoning 7.8/10
creative 7.2/10
agents 8/10
context 7.9/10

Quick Setup

Ollama
ollama run qwen2.5-coder:14b
MLX
mlx_lm.generate --model mlx-community/Qwen2.5-Coder-14B-8bit
llama.cpp
llama-server -m qwen2.5-coder-14b-Q8_0.gguf -ngl 99