# Model Comparison Chart

Compare models by parameter count, minimum VRAM, or context window.
| Model | Parameters | Min VRAM | Context Window (tokens) | Use Cases |
|---|---|---|---|---|
| Llama 3.2 1B | 1B | 2GB | 128,000 | edge, rewrite, assistant-lite |
| Phi-3 Mini | 3.8B | 3GB | 4,096 | edge, chat, reasoning, low-latency |
| Llama 3.2 3B | 3B | 3GB | 128,000 | chat, summary, edge, tool-use |
| Mistral 7B | 7B | 5GB | 32,768 | chat, instruction-following, function-calling |
| DeepSeek Coder 6.7B | 6.7B | 5GB | 16,384 | coding, code-review, code-completion |
| Qwen 2.5 7B | 7B | 5GB | 32,768 | chat, reasoning, multilingual, json |
| Neural Chat 7B | 7B | 5GB | 32,768 | chat, assistant, customer-support |
| Yi 6B | 6B | 5GB | 4,096 | bilingual, chat, general |
| Llama 3.1 8B | 8B | 6GB | 128,000 | chat, general, instruction-following, rag |
| Qwen 2.5 Coder 7B | 7.6B | 6GB | 32,768 | coding, refactor, test-generation, agents |
| Gemma 2 9B | 9B | 7GB | 8,192 | chat, analysis, summarization |
| Mistral NeMo 12B | 12B | 8GB | 128,000 | chat, coding, reasoning, long-context |
| Solar 10.7B | 10.7B | 8GB | 4,096 | chat, single-turn, general |
| Phi-3 Medium | 14B | 10GB | 128,000 | reasoning, chat, analysis |
| Qwen 2.5 14B | 14B | 10GB | 32,768 | reasoning, agent, multilingual, json |
| Codestral 22B | 22B | 16GB | 32,768 | coding, fim, repository-analysis |
| Gemma 2 27B | 27B | 20GB | 8,192 | analysis, reasoning, enterprise-chat |
| DeepSeek Coder 33B | 33B | 22GB | 16,384 | coding, repository-analysis, test-generation |
| Yi 34B | 34B | 24GB | 4,096 | analysis, bilingual, knowledge |
| Llama 3.1 70B | 70B | 40GB | 128,000 | advanced-reasoning, agent, analysis, multilingual |
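The chart above can also be filtered and sorted programmatically, e.g. to answer "what is the largest model that fits my GPU?". A minimal Python sketch, using a hand-copied subset of the rows; the tuple layout and the `models_that_fit` helper are illustrative conventions for this example, not any library's API:

```python
# Hypothetical in-code copy of a few rows from the chart:
# (name, parameters in billions, min VRAM in GB, context window in tokens)
MODELS = [
    ("Llama 3.2 1B", 1.0, 2, 128_000),
    ("Gemma 2 9B", 9.0, 7, 8_192),
    ("Mistral NeMo 12B", 12.0, 8, 128_000),
    ("Yi 34B", 34.0, 24, 4_096),
    ("Llama 3.1 70B", 70.0, 40, 128_000),
]

def models_that_fit(vram_gb, models=MODELS):
    """Return the models whose minimum VRAM fits the given budget,
    largest parameter count first."""
    fits = [m for m in models if m[2] <= vram_gb]
    return sorted(fits, key=lambda m: m[1], reverse=True)

# Example: candidates for an 8GB GPU, biggest first.
for name, params, vram, ctx in models_that_fit(8):
    print(f"{name}: {params}B params, {vram}GB VRAM, {ctx:,}-token context")
```

The same pattern extends to sorting by context window (`key=lambda m: m[3]`) or any other column.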