Mistral AI Models
Mistral AI produces highly efficient models that punch above their weight class.
Model Lineup
Open Models
| Model |
Parameters |
Context |
Architecture |
| Mistral 7B |
7B |
32K |
Dense transformer |
| Mixtral 8x7B |
47B (12B active) |
32K |
MoE |
| Mixtral 8x22B |
141B (39B active) |
64K |
MoE |
API-Only Models
| Model |
Best For |
| Mistral Large |
Complex reasoning |
| Mistral Medium |
Balanced performance |
| Mistral Small |
Fast, cost-effective |
Key Innovations
Mixture of Experts (MoE)
- Uses 8 expert networks
- Only 2 experts active per token
- Near 8x7B quality at 7B inference cost
Sliding Window Attention
- Efficient long-context handling
- Reduced memory usage
- Better than naive attention
Running Locally
With Ollama
# Standard Mistral
ollama pull mistral
# Mixtral
ollama pull mixtral:8x7b
# Run
ollama run mistral
Python Integration
import ollama
response = ollama.chat(
model='mistral',
messages=[
{'role': 'user', 'content': 'Write a haiku about coding'}
]
)
print(response['message']['content'])
Mistral API
Setup
from mistralai.client import MistralClient
from mistralai.models.chat_completion import ChatMessage
client = MistralClient(api_key="your-api-key")
response = client.chat(
model="mistral-medium",
messages=[ChatMessage(role="user", content="Hello!")]
)
print(response.choices[0].message.content)
Function Calling
tools = [{
"type": "function",
"function": {
"name": "get_weather",
"parameters": {
"type": "object",
"properties": {
"location": {"type": "string"}
},
"required": ["location"]
}
}
}]
response = client.chat(
model="mistral-large-latest",
messages=[ChatMessage(role="user", content="Weather in Paris?")],
tools=tools
)
Comparison with Competitors
| Aspect |
Mistral 7B |
Llama 2 7B |
GPT-3.5 |
| Code |
Strong |
Good |
Strong |
| Reasoning |
Good |
Good |
Strong |
| Speed |
Fast |
Fast |
Medium |
| Cost |
Free/Cheap |
Free |
Paid |
Best Practices
- Use MoE for quality - Mixtral gives excellent results
- Leverage context - 32K window is generous
- Prompt engineering - Works well with chain-of-thought
- Temperature - Default 0.7 works well
intermediate | LLM Comparison | Updated 2024-12-18
- mistral
- mistral ai
- mixtral
- open source llm
- moe