Llama Models Guide

Complete guide to Meta's Llama family of open-source language models.

Last updated: 2024-12-18

Llama Models

Meta's Llama family is among the most widely used openly available large language models, with weights released under Meta's community license.

Llama 3.x Series

Available Sizes

Model	Parameters	Context	Best For
Llama 3.1 8B	8B	128K	Fast inference, edge deployment
Llama 3.1 70B	70B	128K	General purpose, high quality
Llama 3.1 405B	405B	128K	Research, maximum capability
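
The parameter counts above translate fairly directly into memory requirements. The sketch below is a rough back-of-the-envelope estimate of weight size only; the bits-per-weight values for the quantized formats are approximations assumed for illustration, and KV cache and runtime overhead are not included.

```python
# Rough estimate of weight memory for each Llama 3.1 size.
# Assumption: the bits-per-weight figures are approximations, not official numbers.
PARAMS_BILLIONS = {"Llama 3.1 8B": 8, "Llama 3.1 70B": 70, "Llama 3.1 405B": 405}

BITS_PER_WEIGHT = {
    "FP16": 16.0,
    "Q8_0": 8.5,    # approximate
    "Q4_K_M": 4.8,  # approximate
}

def weight_memory_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate weight size in GB (10^9 bytes), excluding KV cache and overhead."""
    total_bits = params_billions * 1e9 * bits_per_weight
    return total_bits / 8 / 1e9

for name, params in PARAMS_BILLIONS.items():
    row = ", ".join(
        f"{fmt}: {weight_memory_gb(params, bits):.1f} GB"
        for fmt, bits in BITS_PER_WEIGHT.items()
    )
    print(f"{name} -> {row}")
```

Actual file sizes vary with the quantization mix and tokenizer metadata, so treat these numbers as a sizing guide rather than exact requirements.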

Key Features

128K context window - Handle long documents
Multilingual - 8 languages supported
Code generation - Strong programming ability
Tool use - Native function calling
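
To illustrate the tool-use feature listed above, here is a minimal sketch of function calling through the Ollama Python library. The `get_weather` tool, its stubbed result, and the helper names are hypothetical; the sketch assumes a recent `ollama` package and a running local server with `llama3.1:8b` pulled.

```python
import json

# Hypothetical local tool; the name and stubbed result are for illustration only.
def get_weather(city: str) -> str:
    return f"Sunny in {city}"

# JSON-schema description of the tool that is sent to the model.
TOOLS = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string", "description": "City name"}},
            "required": ["city"],
        },
    },
}]

AVAILABLE = {"get_weather": get_weather}

def run_tool_call(name: str, arguments) -> str:
    """Dispatch one model-requested tool call to a local Python function."""
    if isinstance(arguments, str):  # some APIs return arguments as a JSON string
        arguments = json.loads(arguments)
    if name not in AVAILABLE:
        raise ValueError(f"Unknown tool: {name}")
    return AVAILABLE[name](**arguments)

def ask_with_tools(prompt: str, model: str = "llama3.1:8b") -> list:
    """Send a prompt plus the tool schema to a local Ollama server and run any tool calls."""
    from ollama import chat  # imported here so the helpers above work without Ollama installed
    response = chat(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        tools=TOOLS,
    )
    results = []
    for call in response["message"].get("tool_calls") or []:
        fn = call["function"]
        results.append(run_tool_call(fn["name"], fn["arguments"]))
    return results

# Example (requires a running Ollama server):
# print(ask_with_tools("What is the weather in Paris?"))
```

In a full loop, each tool result would be appended to the conversation as a `tool` message and sent back to the model for a final answer; that step is omitted here for brevity.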

Running Llama Locally

With Ollama

# Pull model
ollama pull llama3.1:8b

# Run interactively
ollama run llama3.1:8b

# Use specific variant
ollama run llama3.1:70b
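
Once a model is pulled, Ollama also serves it over a local HTTP API, which is convenient for scripting. The following is a minimal sketch using only the standard library; it assumes the server is running on its default address and that `llama3.1:8b` has been pulled. The helper names are illustrative.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/chat"  # Ollama's default local address

def build_chat_payload(prompt: str, model: str = "llama3.1:8b") -> dict:
    """Build the JSON body for a single, non-streaming chat request."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,  # return one complete JSON reply instead of a stream of chunks
    }

def chat_once(prompt: str, model: str = "llama3.1:8b") -> str:
    """POST the prompt to the local Ollama server and return the reply text."""
    data = json.dumps(build_chat_payload(prompt, model)).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL,
        data=data,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as reply:
        body = json.loads(reply.read())
    return body["message"]["content"]

# Example (requires a running Ollama server):
# print(chat_once("Hello, world"))
```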

With llama.cpp

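# Clone and build (recent versions build with CMake)
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
cmake -B build && cmake --build build --config Release

# Convert Hugging Face weights to GGUF (the model directory is a placeholder path)
pip install -r requirements.txt
python convert_hf_to_gguf.py /path/to/Llama-3.1-8B-Instruct --outfile llama-3.1-8b.gguf

# Run
./build/bin/llama-cli -m llama-3.1-8b.gguf -p "Hello, world"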

API Usage

Python with Together AI

from openai import OpenAI

client = OpenAI(
    api_key="your-together-api-key",
    base_url="https://api.together.xyz/v1"
)

response = client.chat.completions.create(
    model="meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo",
    messages=[
        {"role": "user", "content": "Explain quantum computing"}
    ]
)
print(response.choices[0].message.content)

With Groq (Fast Inference)

from groq import Groq

client = Groq(api_key="your-groq-api-key")

response = client.chat.completions.create(
    model="llama-3.1-70b-versatile",
    messages=[{"role": "user", "content": "Hello!"}]
)
print(response.choices[0].message.content)

Llama 3.2 Multimodal

# Vision model usage
from ollama import chat

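# Requires a local Ollama server with the model pulled: ollama pull llama3.2-vision
response = chat(
    model='llama3.2-vision',
    messages=[{
        'role': 'user',
        'content': 'Describe this image',
        'images': ['./image.jpg']  # path to a local image file
    }]
)
print(response['message']['content'])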

Best Practices

Practice	Recommendation
Quantization	Use Q4_K_M for a balance of file size and output quality
Context	Start with 4K, expand as needed
System prompt	Be explicit about format
Temperature	0.1-0.3 for factual tasks, 0.7+ for creative tasks
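
The recommendations in the table can be captured in a small helper that prepares request settings. This is a minimal sketch, assuming the Ollama Python library's `options` fields `temperature` and `num_ctx`; the preset values, default system prompt, and function name are hypothetical choices picked from the ranges above.

```python
# Hypothetical presets based on the table above; tune them for your workload.
TEMPERATURE = {"factual": 0.2, "creative": 0.8}

def build_chat_kwargs(
    prompt: str,
    task: str = "factual",
    num_ctx: int = 4096,  # start with a 4K context and raise it only when needed
    system: str = "Answer in plain text. Use at most three sentences.",
    model: str = "llama3.1:8b",
) -> dict:
    """Return keyword arguments suitable for ollama.chat(**kwargs)."""
    if task not in TEMPERATURE:
        raise ValueError(f"task must be one of {sorted(TEMPERATURE)}")
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": system},  # explicit output format
            {"role": "user", "content": prompt},
        ],
        "options": {"temperature": TEMPERATURE[task], "num_ctx": num_ctx},
    }

kwargs = build_chat_kwargs("List three uses of Llama models.")
print(kwargs["options"])

# With a local Ollama server running:
# from ollama import chat
# print(chat(**kwargs)["message"]["content"])
```

Keeping the settings in one place makes it easier to compare factual and creative presets without editing each call site.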
