Mistral AI Models

Mistral AI produces highly efficient models that punch above their weight class.

Model Lineup

Open Models

Model	Parameters	Context	Architecture
Mistral 7B	7B	32K	Dense transformer
Mixtral 8x7B	47B (12B active)	32K	MoE
Mixtral 8x22B	141B (39B active)	64K	MoE

API-Only Models

Model	Best For
Mistral Large	Complex reasoning
Mistral Medium	Balanced performance
Mistral Small	Fast, cost-effective

Key Innovations

Mixture of Experts (MoE)

Uses 8 expert networks
Only 2 experts active per token
Near 8x7B quality at 7B inference cost

Sliding Window Attention

Efficient long-context handling
Reduced memory usage
Better than naive attention

Running Locally

With Ollama

# Standard Mistral
ollama pull mistral

# Mixtral
ollama pull mixtral:8x7b

# Run
ollama run mistral

Python Integration

import ollama

response = ollama.chat(
    model='mistral',
    messages=[
        {'role': 'user', 'content': 'Write a haiku about coding'}
    ]
)
print(response['message']['content'])

Mistral API

Setup

from mistralai.client import MistralClient
from mistralai.models.chat_completion import ChatMessage

client = MistralClient(api_key="your-api-key")

response = client.chat(
    model="mistral-medium",
    messages=[ChatMessage(role="user", content="Hello!")]
)
print(response.choices[0].message.content)

Function Calling

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {"type": "string"}
            },
            "required": ["location"]
        }
    }
}]

response = client.chat(
    model="mistral-large-latest",
    messages=[ChatMessage(role="user", content="Weather in Paris?")],
    tools=tools
)

Comparison with Competitors

Aspect	Mistral 7B	Llama 2 7B	GPT-3.5
Code	Strong	Good	Strong
Reasoning	Good	Good	Strong
Speed	Fast	Fast	Medium
Cost	Free/Cheap	Free	Paid

Best Practices

Use MoE for quality - Mixtral gives excellent results
Leverage context - 32K window is generous
Prompt engineering - Works well with chain-of-thought
Temperature - Default 0.7 works well

intermediate LLM Comparison Updated 2024-12-18

mistral
mistral ai
mixtral
open source llm
moe

Mistral AI Models

Model Lineup

Open Models

API-Only Models

Key Innovations

Mixture of Experts (MoE)

Sliding Window Attention

Running Locally

With Ollama

Python Integration

Mistral API

Setup

Function Calling

Comparison with Competitors

Best Practices

Related Guides

Llama Models Guide

Running LLMs Locally

Claude: Capabilities & Best Practices