
DeepInfra · Efficient

Llama 3.1 70B (DI)

Name: Llama 3.1 70B (DI)
Brand: DeepInfra
Price: $0.350 per million input tokens

Balanced

The 2024 70B Llama that defined the open-weights chat baseline before Llama 3.3. Still common where hosts haven't upgraded yet.

Llama 3.1 70B (DI) is an efficient AI model served by DeepInfra. It costs $0.350 per million input tokens and $0.400 per million output tokens (blended $0.365/M), with a 128K-token context window.

Profile inherited from upstream Llama 3.1 70B ↗ — this is a hosted variant of the same open-weights model.

Released: Jul 2024
Modalities: text

INPUT

$0.350/M

per million input tokens

OUTPUT

$0.400/M

per million output tokens

BLENDED 70/30

$0.365/M

default reference rate · how it's calculated →
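The blended figure can be reproduced as a weighted average of the two listed prices. The sketch below assumes that "70/30" means 70% input tokens and 30% output tokens; the function name `blended_rate` and its parameters are illustrative, not part of any provider API.

```python
def blended_rate(input_price: float, output_price: float,
                 input_share: float = 0.70) -> float:
    """Weighted average price per million tokens.

    input_share is the assumed fraction of tokens that are input tokens;
    the remainder is treated as output tokens.
    """
    if not 0.0 <= input_share <= 1.0:
        raise ValueError("input_share must be between 0 and 1")
    return input_share * input_price + (1.0 - input_share) * output_price


# Prices listed on this page, in USD per million tokens.
INPUT_PRICE = 0.350
OUTPUT_PRICE = 0.400

rate = blended_rate(INPUT_PRICE, OUTPUT_PRICE)
print(f"${rate:.3f}/M")  # prints $0.365/M
```

With these inputs, 0.70 × 0.350 + 0.30 × 0.400 = 0.245 + 0.120 = 0.365, which matches the listed blended rate. A workload with a different input/output mix would have a different effective rate.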

CONTEXT

128K

128,000 tokens

What it's good at

Open weights
128K context
Wide hosted availability

Typical use cases

Self-hosted chat
Fine-tune base
Cost benchmarking

Benchmarks

vs. best public score

Scores inherited from Llama 3.1 70B — this is a hosted variant of the same open-weights model, so the underlying benchmark scores are identical.

MMLU: 83%

Multitask academic knowledge across 57 subjects.

GPQA Diamond: 47%

Graduate-level science questions, "Google-proof".

MATH: 68%

High-school competition math problems.

HumanEval: 80%

Python function synthesis from docstrings.

LMArena Elo: 1248

Crowd-sourced head-to-head preference Elo rating.

Hand-curated from each provider's published reports and public leaderboards. Methodology varies across sources — treat as directional rather than authoritative.

How much does Llama 3.1 70B (DI) cost?

Llama 3.1 70B (DI) costs $0.350 per million input tokens and $0.400 per million output tokens, for a blended reference rate of $0.365 per million tokens.
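As a rough illustration of how these per-million rates translate into a bill, the sketch below estimates cost from token counts. The function name `estimate_cost_usd` and the example workload (70 million input tokens and 30 million output tokens) are hypothetical; actual charges depend on the provider's billing rules.

```python
def estimate_cost_usd(input_tokens: int, output_tokens: int,
                      input_price_per_m: float = 0.350,
                      output_price_per_m: float = 0.400) -> float:
    """Estimate cost in USD from token counts and per-million-token prices.

    Defaults are the prices listed on this page.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * input_price_per_m + output_tokens * output_price_per_m
    return total / 1_000_000


# Hypothetical workload: 70M input tokens and 30M output tokens.
cost = estimate_cost_usd(70_000_000, 30_000_000)
print(f"${cost:.2f}")  # prints $36.50
```

Because this example workload happens to follow the 70/30 split, the result equals 100 million tokens at the blended $0.365/M rate.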

What is Llama 3.1 70B (DI)'s context window?

Llama 3.1 70B (DI) supports up to 128K tokens of context (128,000 tokens).

What is Llama 3.1 70B (DI) best for?

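Llama 3.1 70B (DI)'s main strengths are open weights, a 128K context window, and wide hosted availability. Typical use cases are self-hosted chat, use as a fine-tune base, and cost benchmarking.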

Who makes Llama 3.1 70B (DI)?

Llama 3.1 70B (DI) is served by DeepInfra. The underlying open-weights model, Llama 3.1 70B, was developed by Meta and released in Jul 2024.