Replicate Models

Replicate provides 16 AI models accessible via API.

Models Available: 16
Cheapest Input (per 1M tokens): $0.050
Largest Context: 164K

What is Replicate?

Replicate is an AI model provider offering 16 large language models for developers. Its cheapest model starts at $0.050 per 1M input tokens, and its largest context window reaches 164K tokens.

All Replicate Models

Model                                 Input $/1M  Output $/1M  Context  Max Output
Meta/Llama 2 7b                       $0.050      $0.25        4K       4,096
Meta/Llama 2 7b Chat                  $0.050      $0.25        4K       4,096
Meta/Llama 3 8b                       $0.050      $0.25        8K       8,086
Meta/Llama 3 8b Instruct              $0.050      $0.25        8K       8,086
Mistralai/Mistral 7b Instruct V0.2    $0.050      $0.25        4K       4,096
Mistralai/Mistral 7b V0.1             $0.050      $0.25        4K       4,096
Meta/Llama 2 13b                      $0.10       $0.50        4K       4,096
Meta/Llama 2 13b Chat                 $0.10       $0.50        4K       4,096
Mistralai/Mixtral 8x7b Instruct V0.1  $0.30       $1.00        4K       4,096
Meta/Llama 2 70b                      $0.65       $2.75        4K       4,096
Meta/Llama 2 70b Chat                 $0.65       $2.75        4K       4,096
Meta/Llama 3 70b                      $0.65       $2.75        8K       8,192
Meta/Llama 3 70b Instruct             $0.65       $2.75        8K       8,192
Deepseek Ai/Deepseek V3.1             $0.67       $2.02        164K     163,840
Deepseek Ai/Deepseek                  $1.45       $1.45        66K      8,192
Deepseek Ai/Deepseek R1               $3.75       $10.00       66K      8,192
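
The per-1M-token prices above convert to a per-request cost as (input tokens × input price + output tokens × output price) / 1,000,000. The following is a minimal sketch of that arithmetic, using three rows copied from the table. The `PRICES` dictionary and the `estimate_cost` function are illustrative names chosen here, not part of any Replicate SDK, and the token counts in the usage loop are made-up workload figures.

```python
# Prices in USD per 1M tokens, copied from three rows of the table above.
# PRICES and estimate_cost are hypothetical helpers for illustration only.
PRICES = {
    "Meta/Llama 3 8b Instruct": (0.05, 0.25),
    "Meta/Llama 3 70b Instruct": (0.65, 2.75),
    "Deepseek Ai/Deepseek R1": (3.75, 10.00),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of a single request."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


if __name__ == "__main__":
    # Example workload (assumed): 2,000 prompt tokens and 500 completion tokens.
    for name in PRICES:
        cost = estimate_cost(name, input_tokens=2_000, output_tokens=500)
        print(f"{name}: ${cost:.6f}")
```

For this assumed workload the request costs about $0.000225 on Meta/Llama 3 8b Instruct and about $0.0125 on Deepseek Ai/Deepseek R1, so the gap between the cheapest and most expensive rows is roughly 55x.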

Model Details

Meta/Llama 2 7b

Meta/Llama 2 7b is available via Replicate with a 4K context window and up to 4,096 output tokens. Pricing: $0.0500/1M input tokens, $0.2500/1M output tokens.

Input: $0.050/1M Output: $0.25/1M Context: 4K
text

Meta/Llama 2 7b Chat

Meta/Llama 2 7b Chat is available via Replicate with a 4K context window and up to 4,096 output tokens. Pricing: $0.0500/1M input tokens, $0.2500/1M output tokens.

Input: $0.050/1M Output: $0.25/1M Context: 4K
text

Meta/Llama 3 8b

Meta/Llama 3 8b is available via Replicate with an 8K context window and up to 8,086 output tokens. Pricing: $0.0500/1M input tokens, $0.2500/1M output tokens.

Input: $0.050/1M Output: $0.25/1M Context: 8K
text

Meta/Llama 3 8b Instruct

Meta/Llama 3 8b Instruct is available via Replicate with an 8K context window and up to 8,086 output tokens. Pricing: $0.0500/1M input tokens, $0.2500/1M output tokens.

Input: $0.050/1M Output: $0.25/1M Context: 8K
text

Mistralai/Mistral 7b Instruct V0.2

Mistralai/Mistral 7b Instruct V0.2 is available via Replicate with a 4K context window and up to 4,096 output tokens. Pricing: $0.0500/1M input tokens, $0.2500/1M output tokens.

Input: $0.050/1M Output: $0.25/1M Context: 4K
text

Mistralai/Mistral 7b V0.1

Mistralai/Mistral 7b V0.1 is available via Replicate with a 4K context window and up to 4,096 output tokens. Pricing: $0.0500/1M input tokens, $0.2500/1M output tokens.

Input: $0.050/1M Output: $0.25/1M Context: 4K
text

Meta/Llama 2 13b

Meta/Llama 2 13b is available via Replicate with a 4K context window and up to 4,096 output tokens. Pricing: $0.1000/1M input tokens, $0.5000/1M output tokens.

Input: $0.10/1M Output: $0.50/1M Context: 4K
text

Meta/Llama 2 13b Chat

Meta/Llama 2 13b Chat is available via Replicate with a 4K context window and up to 4,096 output tokens. Pricing: $0.1000/1M input tokens, $0.5000/1M output tokens.

Input: $0.10/1M Output: $0.50/1M Context: 4K
text

Mistralai/Mixtral 8x7b Instruct V0.1

Mistralai/Mixtral 8x7b Instruct V0.1 is available via Replicate with a 4K context window and up to 4,096 output tokens. Pricing: $0.3000/1M input tokens, $1.00/1M output tokens.

Input: $0.30/1M Output: $1.00/1M Context: 4K
text

Meta/Llama 2 70b

Meta/Llama 2 70b is available via Replicate with a 4K context window and up to 4,096 output tokens. Pricing: $0.6500/1M input tokens, $2.75/1M output tokens.

Input: $0.65/1M Output: $2.75/1M Context: 4K
text

Meta/Llama 2 70b Chat

Meta/Llama 2 70b Chat is available via Replicate with a 4K context window and up to 4,096 output tokens. Pricing: $0.6500/1M input tokens, $2.75/1M output tokens.

Input: $0.65/1M Output: $2.75/1M Context: 4K
text

Meta/Llama 3 70b

Meta/Llama 3 70b is available via Replicate with an 8K context window and up to 8,192 output tokens. Pricing: $0.6500/1M input tokens, $2.75/1M output tokens.

Input: $0.65/1M Output: $2.75/1M Context: 8K
text

Meta/Llama 3 70b Instruct

Meta/Llama 3 70b Instruct is available via Replicate with an 8K context window and up to 8,192 output tokens. Pricing: $0.6500/1M input tokens, $2.75/1M output tokens.

Input: $0.65/1M Output: $2.75/1M Context: 8K
text

Deepseek Ai/Deepseek V3.1

Deepseek Ai/Deepseek V3.1 is available via Replicate with a 164K context window and up to 163,840 output tokens. Pricing: $0.6720/1M input tokens, $2.02/1M output tokens.

Input: $0.67/1M Output: $2.02/1M Context: 164K
text, function calling, reasoning

Deepseek Ai/Deepseek

Deepseek Ai/Deepseek is available via Replicate with a 66K context window and up to 8,192 output tokens. Pricing: $1.45/1M input tokens, $1.45/1M output tokens.

Input: $1.45/1M Output: $1.45/1M Context: 66K
text, function calling

Deepseek Ai/Deepseek R1

Deepseek Ai/Deepseek R1 is available via Replicate with a 66K context window and up to 8,192 output tokens. Pricing: $3.75/1M input tokens, $10.00/1M output tokens.

Input: $3.75/1M Output: $10.00/1M Context: 66K
text, reasoning
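
Choosing among these models usually means filtering by context window, maximum output, and capability tags, then taking the lowest cost for the expected token counts. The sketch below illustrates one way to do that with five of the entries above. Several things in it are assumptions rather than facts from this page: the `Model` class and `cheapest_model` function are hypothetical names, the rounded context labels are taken to mean 8K = 8,192, 66K = 65,536, and 164K = 163,840 tokens, and the prompt and the completion are assumed to share the context window.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    input_price: float    # USD per 1M input tokens
    output_price: float   # USD per 1M output tokens
    context: int          # tokens; ASSUMED exact value behind the rounded label
    max_output: int       # tokens, as listed in the model details
    capabilities: frozenset


# A subset of the models listed above (hypothetical in-memory catalog).
MODELS = [
    Model("Meta/Llama 3 8b Instruct", 0.05, 0.25, 8_192, 8_086, frozenset({"text"})),
    Model("Meta/Llama 3 70b Instruct", 0.65, 2.75, 8_192, 8_192, frozenset({"text"})),
    Model("Deepseek Ai/Deepseek V3.1", 0.672, 2.02, 163_840, 163_840,
          frozenset({"text", "function calling", "reasoning"})),
    Model("Deepseek Ai/Deepseek", 1.45, 1.45, 65_536, 8_192,
          frozenset({"text", "function calling"})),
    Model("Deepseek Ai/Deepseek R1", 3.75, 10.00, 65_536, 8_192,
          frozenset({"text", "reasoning"})),
]


def cheapest_model(input_tokens, output_tokens, required=frozenset()):
    """Return the cheapest Model that fits the request, or None if none fits."""
    candidates = [
        m for m in MODELS
        # Assumption: prompt and completion together must fit in the context window.
        if input_tokens + output_tokens <= m.context
        and output_tokens <= m.max_output
        and set(required) <= m.capabilities
    ]
    if not candidates:
        return None
    # Relative cost is enough for ranking, so the 1M divisor is omitted.
    return min(
        candidates,
        key=lambda m: input_tokens * m.input_price + output_tokens * m.output_price,
    )


if __name__ == "__main__":
    pick = cheapest_model(30_000, 1_000)
    print(pick.name if pick else "no model fits")
```

With these assumptions, a short request with no special requirements resolves to Meta/Llama 3 8b Instruct, while a 30,000-token prompt or a function-calling requirement resolves to Deepseek Ai/Deepseek V3.1.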
