Skip to content

DeepInfra Models

DeepInfra provides 67 AI models accessible via API.

Visit DeepInfra →

67

Models Available

$0.020

Cheapest Input / 1M

1.0M

Largest Context

What is DeepInfra?

DeepInfra is an AI model provider offering 67 large language models for developers. Their cheapest model starts at $0.020 per 1M input tokens, and their largest context window reaches 1.0M. DeepInfra provides 67 AI models accessible via API.

DeepInfra Strengths

All DeepInfra Models

Model Input $/1M Output $/1M Context Max Output Released
Meta Llama/Llama 3.2 3B Instruct $0.020 $0.020 131K 131,072
Meta Llama/Meta Llama 3.1 8B Instruct Turbo $0.020 $0.030 131K 131,072
Mistralai/Mistral Nemo Instruct 2407 $0.020 $0.040 131K 131,072
Meta Llama/Meta Llama 3 8B Instruct $0.030 $0.060 8K 8,192
Meta Llama/Meta Llama 3.1 8B Instruct $0.030 $0.050 131K 131,072
Qwen/Qwen2.5 7B Instruct $0.040 $0.10 33K 32,768
Sao10K/L3 8B Lunaris V1 Turbo $0.040 $0.050 8K 8,192
Google/Gemma 3 4b It $0.040 $0.080 131K 131,072
Nvidia/NVIDIA Nemotron Nano 9B $0.040 $0.16 131K 131,072
Openai/Gpt Oss 20b $0.040 $0.15 131K 131,072
Meta Llama/Llama 3.2 11B Vision Instruct $0.049 $0.049 131K 131,072
Google/Gemma 3 12b It $0.050 $0.10 131K 131,072
Mistralai/Mistral Small 24B Instruct 2501 $0.050 $0.080 33K 32,768
Openai/Gpt Oss 120b $0.050 $0.45 131K 131,072
Meta Llama/Llama Guard 3 8B $0.055 $0.055 131K 131,072
Qwen/Qwen3 14B $0.060 $0.24 41K 40,960
Microsoft/Phi 4 $0.070 $0.14 16K 16,384
Mistralai/Mistral Small 3.2 24B Instruct 2506 $0.075 $0.20 128K 128,000
Gryphe/MythoMax L2 13b $0.080 $0.090 4K 4,096
Qwen/Qwen3 30B A3B $0.080 $0.29 41K 40,960
Meta Llama/Llama 4 Scout 17B 16E Instruct $0.080 $0.30 328K 327,680
Qwen/Qwen3 235B A22B Instruct 2507 $0.090 $0.60 262K 262,144
Google/Gemma 3 27b It $0.090 $0.16 131K 131,072
Qwen/Qwen3 32B $0.10 $0.28 41K 40,960
Google/Gemini 2.0 Flash 001 $0.10 $0.40 1M 1,000,000
Meta Llama/Meta Llama 3.1 70B Instruct Turbo $0.10 $0.28 131K 131,072
Nvidia/Llama 3.3 Nemotron Super 49B V1.5 $0.10 $0.40 131K 131,072
Qwen/Qwen2.5 72B Instruct $0.12 $0.39 33K 32,768
Meta Llama/Llama 3.3 70B Instruct Turbo $0.13 $0.39 131K 131,072
Qwen/Qwen3 Next 80B A3B Instruct $0.14 $1.40 262K 262,144
Qwen/Qwen3 Next 80B A3B Thinking $0.14 $1.40 262K 262,144
Qwen/QwQ 32B $0.15 $0.40 131K 131,072
Meta Llama/Llama 4 Maverick 17B 128E Instruct FP8 $0.15 $0.60 1.0M 1,048,576
Qwen/Qwen3 235B A22B $0.18 $0.54 41K 40,960
Meta Llama/Llama Guard 4 12B $0.18 $0.18 164K 163,840
Qwen/Qwen2.5 VL 32B Instruct $0.20 $0.60 128K 128,000
Deepseek Ai/DeepSeek R1 Distill Llama 70B $0.20 $0.60 131K 131,072
Meta Llama/Llama 3.3 70B Instruct $0.23 $0.40 131K 131,072
Deepseek Ai/DeepSeek V3 0324 $0.25 $0.88 164K 163,840
Allenai/OlmOCR 7B 0725 FP8 $0.27 $1.50 16K 16,384
Deepseek Ai/DeepSeek R1 Distill Qwen 32B $0.27 $0.27 131K 131,072
Deepseek Ai/DeepSeek V3.1 $0.27 $1.00 164K 163,840
Deepseek Ai/DeepSeek V3.1 Terminus $0.27 $1.00 164K 163,840
Qwen/Qwen3 Coder 480B A35B Instruct Turbo $0.29 $1.20 262K 262,144
NousResearch/Hermes 3 Llama 3.1 70B $0.30 $0.30 131K 131,072
Qwen/Qwen3 235B A22B Thinking 2507 $0.30 $2.90 262K 262,144
Google/Gemini 2.5 Flash $0.30 $2.50 1M 1,000,000
Deepseek Ai/DeepSeek V3 $0.38 $0.89 164K 163,840
Qwen/Qwen3 Coder 480B A35B Instruct $0.40 $1.60 262K 262,144
Meta Llama/Meta Llama 3.1 70B Instruct $0.40 $0.40 131K 131,072
Mistralai/Mixtral 8x7B Instruct V0.1 $0.40 $0.40 33K 32,768
Zai Org/GLM 4.5 $0.40 $1.60 131K 131,072
Microsoft/WizardLM 2 8x22B $0.48 $0.48 66K 65,536
Deepseek Ai/DeepSeek R1 0528 $0.50 $2.15 164K 163,840
Moonshotai/Kimi K2 Instruct $0.50 $2.00 131K 131,072
Moonshotai/Kimi K2 Instruct 0905 $0.50 $2.00 262K 262,144
Nvidia/Llama 3.1 Nemotron 70B Instruct $0.60 $0.60 131K 131,072
Sao10K/L3.1 70B Euryale V2.2 $0.65 $0.75 131K 131,072
Sao10K/L3.3 70B Euryale V2.3 $0.65 $0.75 131K 131,072
Deepseek Ai/DeepSeek R1 $0.70 $2.40 164K 163,840
NousResearch/Hermes 3 Llama 3.1 405B $1.00 $1.00 131K 131,072
Deepseek Ai/DeepSeek R1 0528 Turbo $1.00 $3.00 33K 32,768
Deepseek Ai/DeepSeek R1 Turbo $1.00 $3.00 41K 40,960
Google/Gemini 2.5 Pro $1.25 $10.00 1M 1,000,000
Anthropic/Claude 3 7 Sonnet Latest $3.30 $16.50 200K 200,000
Anthropic/Claude 4 Sonnet $3.30 $16.50 200K 200,000
Anthropic/Claude 4 Opus $16.50 $82.50 200K 200,000

Model Details

Meta Llama/Llama 3.2 3B Instruct

Meta Llama/Llama 3.2 3B Instruct is available via DeepInfra with a 131K context window and up to 131,072 output tokens. Pricing: $0.0200/1M input tokens, $0.0200/1M output tokens.

Input: $0.020/1M Output: $0.020/1M Context: 131K
text function calling

Meta Llama/Meta Llama 3.1 8B Instruct Turbo

Meta Llama/Meta Llama 3.1 8B Instruct Turbo is available via DeepInfra with a 131K context window and up to 131,072 output tokens. Pricing: $0.0200/1M input tokens, $0.0300/1M output tokens.

Input: $0.020/1M Output: $0.030/1M Context: 131K
text function calling

Mistralai/Mistral Nemo Instruct 2407

Mistralai/Mistral Nemo Instruct 2407 is available via DeepInfra with a 131K context window and up to 131,072 output tokens. Pricing: $0.0200/1M input tokens, $0.0400/1M output tokens.

Input: $0.020/1M Output: $0.040/1M Context: 131K
text function calling

Meta Llama/Meta Llama 3 8B Instruct

Meta Llama/Meta Llama 3 8B Instruct is available via DeepInfra with a 8K context window and up to 8,192 output tokens. Pricing: $0.0300/1M input tokens, $0.0600/1M output tokens.

Input: $0.030/1M Output: $0.060/1M Context: 8K
text function calling

Meta Llama/Meta Llama 3.1 8B Instruct

Meta Llama/Meta Llama 3.1 8B Instruct is available via DeepInfra with a 131K context window and up to 131,072 output tokens. Pricing: $0.0300/1M input tokens, $0.0500/1M output tokens.

Input: $0.030/1M Output: $0.050/1M Context: 131K
text function calling

Qwen/Qwen2.5 7B Instruct

Qwen/Qwen2.5 7B Instruct is available via DeepInfra with a 33K context window and up to 32,768 output tokens. Pricing: $0.0400/1M input tokens, $0.1000/1M output tokens.

Input: $0.040/1M Output: $0.10/1M Context: 33K
text

Sao10K/L3 8B Lunaris V1 Turbo

Sao10K/L3 8B Lunaris V1 Turbo is available via DeepInfra with a 8K context window and up to 8,192 output tokens. Pricing: $0.0400/1M input tokens, $0.0500/1M output tokens.

Input: $0.040/1M Output: $0.050/1M Context: 8K
text

Google/Gemma 3 4b It

Google/Gemma 3 4b It is available via DeepInfra with a 131K context window and up to 131,072 output tokens. Pricing: $0.0400/1M input tokens, $0.0800/1M output tokens.

Input: $0.040/1M Output: $0.080/1M Context: 131K
text function calling

Nvidia/NVIDIA Nemotron Nano 9B

Nvidia/NVIDIA Nemotron Nano 9B is available via DeepInfra with a 131K context window and up to 131,072 output tokens. Pricing: $0.0400/1M input tokens, $0.1600/1M output tokens.

Input: $0.040/1M Output: $0.16/1M Context: 131K
text function calling

Openai/Gpt Oss 20b

Openai/Gpt Oss 20b is available via DeepInfra with a 131K context window and up to 131,072 output tokens. Pricing: $0.0400/1M input tokens, $0.1500/1M output tokens.

Input: $0.040/1M Output: $0.15/1M Context: 131K
text function calling

Meta Llama/Llama 3.2 11B Vision Instruct

Meta Llama/Llama 3.2 11B Vision Instruct is available via DeepInfra with a 131K context window and up to 131,072 output tokens. Pricing: $0.0490/1M input tokens, $0.0490/1M output tokens.

Input: $0.049/1M Output: $0.049/1M Context: 131K
text

Google/Gemma 3 12b It

Google/Gemma 3 12b It is available via DeepInfra with a 131K context window and up to 131,072 output tokens. Pricing: $0.0500/1M input tokens, $0.1000/1M output tokens.

Input: $0.050/1M Output: $0.10/1M Context: 131K
text function calling

Mistralai/Mistral Small 24B Instruct 2501

Mistralai/Mistral Small 24B Instruct 2501 is available via DeepInfra with a 33K context window and up to 32,768 output tokens. Pricing: $0.0500/1M input tokens, $0.0800/1M output tokens.

Input: $0.050/1M Output: $0.080/1M Context: 33K
text function calling

Openai/Gpt Oss 120b

Openai/Gpt Oss 120b is available via DeepInfra with a 131K context window and up to 131,072 output tokens. Pricing: $0.0500/1M input tokens, $0.4500/1M output tokens.

Input: $0.050/1M Output: $0.45/1M Context: 131K
text function calling

Meta Llama/Llama Guard 3 8B

Meta Llama/Llama Guard 3 8B is available via DeepInfra with a 131K context window and up to 131,072 output tokens. Pricing: $0.0550/1M input tokens, $0.0550/1M output tokens.

Input: $0.055/1M Output: $0.055/1M Context: 131K
text

Qwen/Qwen3 14B

Qwen/Qwen3 14B is available via DeepInfra with a 41K context window and up to 40,960 output tokens. Pricing: $0.0600/1M input tokens, $0.2400/1M output tokens.

Input: $0.060/1M Output: $0.24/1M Context: 41K
text function calling

Microsoft/Phi 4

Microsoft/Phi 4 is available via DeepInfra with a 16K context window and up to 16,384 output tokens. Pricing: $0.0700/1M input tokens, $0.1400/1M output tokens.

Input: $0.070/1M Output: $0.14/1M Context: 16K
text function calling

Mistralai/Mistral Small 3.2 24B Instruct 2506

Mistralai/Mistral Small 3.2 24B Instruct 2506 is available via DeepInfra with a 128K context window and up to 128,000 output tokens. Pricing: $0.0750/1M input tokens, $0.2000/1M output tokens.

Input: $0.075/1M Output: $0.20/1M Context: 128K
text function calling

Gryphe/MythoMax L2 13b

Gryphe/MythoMax L2 13b is available via DeepInfra with a 4K context window and up to 4,096 output tokens. Pricing: $0.0800/1M input tokens, $0.0900/1M output tokens.

Input: $0.080/1M Output: $0.090/1M Context: 4K
text function calling

Qwen/Qwen3 30B A3B

Qwen/Qwen3 30B A3B is available via DeepInfra with a 41K context window and up to 40,960 output tokens. Pricing: $0.0800/1M input tokens, $0.2900/1M output tokens.

Input: $0.080/1M Output: $0.29/1M Context: 41K
text function calling

Meta Llama/Llama 4 Scout 17B 16E Instruct

Meta Llama/Llama 4 Scout 17B 16E Instruct is available via DeepInfra with a 328K context window and up to 327,680 output tokens. Pricing: $0.0800/1M input tokens, $0.3000/1M output tokens.

Input: $0.080/1M Output: $0.30/1M Context: 328K
text function calling

Qwen/Qwen3 235B A22B Instruct 2507

Qwen/Qwen3 235B A22B Instruct 2507 is available via DeepInfra with a 262K context window and up to 262,144 output tokens. Pricing: $0.0900/1M input tokens, $0.6000/1M output tokens.

Input: $0.090/1M Output: $0.60/1M Context: 262K
text function calling

Google/Gemma 3 27b It

Google/Gemma 3 27b It is available via DeepInfra with a 131K context window and up to 131,072 output tokens. Pricing: $0.0900/1M input tokens, $0.1600/1M output tokens.

Input: $0.090/1M Output: $0.16/1M Context: 131K
text function calling

Qwen/Qwen3 32B

Qwen/Qwen3 32B is available via DeepInfra with a 41K context window and up to 40,960 output tokens. Pricing: $0.1000/1M input tokens, $0.2800/1M output tokens.

Input: $0.10/1M Output: $0.28/1M Context: 41K
text function calling

Google/Gemini 2.0 Flash 001

Google/Gemini 2.0 Flash 001 is available via DeepInfra with a 1M context window and up to 1,000,000 output tokens. Pricing: $0.1000/1M input tokens, $0.4000/1M output tokens.

Input: $0.10/1M Output: $0.40/1M Context: 1M
text function calling

Meta Llama/Meta Llama 3.1 70B Instruct Turbo

Meta Llama/Meta Llama 3.1 70B Instruct Turbo is available via DeepInfra with a 131K context window and up to 131,072 output tokens. Pricing: $0.1000/1M input tokens, $0.2800/1M output tokens.

Input: $0.10/1M Output: $0.28/1M Context: 131K
text function calling

Nvidia/Llama 3.3 Nemotron Super 49B V1.5

Nvidia/Llama 3.3 Nemotron Super 49B V1.5 is available via DeepInfra with a 131K context window and up to 131,072 output tokens. Pricing: $0.1000/1M input tokens, $0.4000/1M output tokens.

Input: $0.10/1M Output: $0.40/1M Context: 131K
text function calling

Qwen/Qwen2.5 72B Instruct

Qwen/Qwen2.5 72B Instruct is available via DeepInfra with a 33K context window and up to 32,768 output tokens. Pricing: $0.1200/1M input tokens, $0.3900/1M output tokens.

Input: $0.12/1M Output: $0.39/1M Context: 33K
text function calling

Meta Llama/Llama 3.3 70B Instruct Turbo

Meta Llama/Llama 3.3 70B Instruct Turbo is available via DeepInfra with a 131K context window and up to 131,072 output tokens. Pricing: $0.1300/1M input tokens, $0.3900/1M output tokens.

Input: $0.13/1M Output: $0.39/1M Context: 131K
text function calling

Qwen/Qwen3 Next 80B A3B Instruct

Qwen/Qwen3 Next 80B A3B Instruct is available via DeepInfra with a 262K context window and up to 262,144 output tokens. Pricing: $0.1400/1M input tokens, $1.40/1M output tokens.

Input: $0.14/1M Output: $1.40/1M Context: 262K
text function calling

Qwen/Qwen3 Next 80B A3B Thinking

Qwen/Qwen3 Next 80B A3B Thinking is available via DeepInfra with a 262K context window and up to 262,144 output tokens. Pricing: $0.1400/1M input tokens, $1.40/1M output tokens.

Input: $0.14/1M Output: $1.40/1M Context: 262K
text function calling

Qwen/QwQ 32B

Qwen/QwQ 32B is available via DeepInfra with a 131K context window and up to 131,072 output tokens. Pricing: $0.1500/1M input tokens, $0.4000/1M output tokens.

Input: $0.15/1M Output: $0.40/1M Context: 131K
text function calling

Meta Llama/Llama 4 Maverick 17B 128E Instruct FP8

Meta Llama/Llama 4 Maverick 17B 128E Instruct FP8 is available via DeepInfra with a 1.0M context window and up to 1,048,576 output tokens. Pricing: $0.1500/1M input tokens, $0.6000/1M output tokens.

Input: $0.15/1M Output: $0.60/1M Context: 1.0M
text function calling

Qwen/Qwen3 235B A22B

Qwen/Qwen3 235B A22B is available via DeepInfra with a 41K context window and up to 40,960 output tokens. Pricing: $0.1800/1M input tokens, $0.5400/1M output tokens.

Input: $0.18/1M Output: $0.54/1M Context: 41K
text function calling

Meta Llama/Llama Guard 4 12B

Meta Llama/Llama Guard 4 12B is available via DeepInfra with a 164K context window and up to 163,840 output tokens. Pricing: $0.1800/1M input tokens, $0.1800/1M output tokens.

Input: $0.18/1M Output: $0.18/1M Context: 164K
text

Qwen/Qwen2.5 VL 32B Instruct

Qwen/Qwen2.5 VL 32B Instruct is available via DeepInfra with a 128K context window and up to 128,000 output tokens. Pricing: $0.2000/1M input tokens, $0.6000/1M output tokens.

Input: $0.20/1M Output: $0.60/1M Context: 128K
text vision function calling

Deepseek Ai/DeepSeek R1 Distill Llama 70B

Deepseek Ai/DeepSeek R1 Distill Llama 70B is available via DeepInfra with a 131K context window and up to 131,072 output tokens. Pricing: $0.2000/1M input tokens, $0.6000/1M output tokens.

Input: $0.20/1M Output: $0.60/1M Context: 131K
text

Meta Llama/Llama 3.3 70B Instruct

Meta Llama/Llama 3.3 70B Instruct is available via DeepInfra with a 131K context window and up to 131,072 output tokens. Pricing: $0.2300/1M input tokens, $0.4000/1M output tokens.

Input: $0.23/1M Output: $0.40/1M Context: 131K
text function calling

Deepseek Ai/DeepSeek V3 0324

Deepseek Ai/DeepSeek V3 0324 is available via DeepInfra with a 164K context window and up to 163,840 output tokens. Pricing: $0.2500/1M input tokens, $0.8800/1M output tokens.

Input: $0.25/1M Output: $0.88/1M Context: 164K
text function calling

Allenai/OlmOCR 7B 0725 FP8

Allenai/OlmOCR 7B 0725 FP8 is available via DeepInfra with a 16K context window and up to 16,384 output tokens. Pricing: $0.2700/1M input tokens, $1.50/1M output tokens.

Input: $0.27/1M Output: $1.50/1M Context: 16K
text

Deepseek Ai/DeepSeek R1 Distill Qwen 32B

Deepseek Ai/DeepSeek R1 Distill Qwen 32B is available via DeepInfra with a 131K context window and up to 131,072 output tokens. Pricing: $0.2700/1M input tokens, $0.2700/1M output tokens.

Input: $0.27/1M Output: $0.27/1M Context: 131K
text function calling

Deepseek Ai/DeepSeek V3.1

Deepseek Ai/DeepSeek V3.1 is available via DeepInfra with a 164K context window and up to 163,840 output tokens. Pricing: $0.2700/1M input tokens, $1.00/1M output tokens.

Input: $0.27/1M Output: $1.00/1M Context: 164K
text function calling reasoning

Deepseek Ai/DeepSeek V3.1 Terminus

Deepseek Ai/DeepSeek V3.1 Terminus is available via DeepInfra with a 164K context window and up to 163,840 output tokens. Pricing: $0.2700/1M input tokens, $1.00/1M output tokens.

Input: $0.27/1M Output: $1.00/1M Context: 164K
text function calling

Qwen/Qwen3 Coder 480B A35B Instruct Turbo

Qwen/Qwen3 Coder 480B A35B Instruct Turbo is available via DeepInfra with a 262K context window and up to 262,144 output tokens. Pricing: $0.2900/1M input tokens, $1.20/1M output tokens.

Input: $0.29/1M Output: $1.20/1M Context: 262K
text function calling

NousResearch/Hermes 3 Llama 3.1 70B

NousResearch/Hermes 3 Llama 3.1 70B is available via DeepInfra with a 131K context window and up to 131,072 output tokens. Pricing: $0.3000/1M input tokens, $0.3000/1M output tokens.

Input: $0.30/1M Output: $0.30/1M Context: 131K
text

Qwen/Qwen3 235B A22B Thinking 2507

Qwen/Qwen3 235B A22B Thinking 2507 is available via DeepInfra with a 262K context window and up to 262,144 output tokens. Pricing: $0.3000/1M input tokens, $2.90/1M output tokens.

Input: $0.30/1M Output: $2.90/1M Context: 262K
text function calling

Google/Gemini 2.5 Flash

Google/Gemini 2.5 Flash is available via DeepInfra with a 1M context window and up to 1,000,000 output tokens. Pricing: $0.3000/1M input tokens, $2.50/1M output tokens.

Input: $0.30/1M Output: $2.50/1M Context: 1M
text function calling

Deepseek Ai/DeepSeek V3

Deepseek Ai/DeepSeek V3 is available via DeepInfra with a 164K context window and up to 163,840 output tokens. Pricing: $0.3800/1M input tokens, $0.8900/1M output tokens.

Input: $0.38/1M Output: $0.89/1M Context: 164K
text function calling

Qwen/Qwen3 Coder 480B A35B Instruct

Qwen/Qwen3 Coder 480B A35B Instruct is available via DeepInfra with a 262K context window and up to 262,144 output tokens. Pricing: $0.4000/1M input tokens, $1.60/1M output tokens.

Input: $0.40/1M Output: $1.60/1M Context: 262K
text function calling

Meta Llama/Meta Llama 3.1 70B Instruct

Meta Llama/Meta Llama 3.1 70B Instruct is available via DeepInfra with a 131K context window and up to 131,072 output tokens. Pricing: $0.4000/1M input tokens, $0.4000/1M output tokens.

Input: $0.40/1M Output: $0.40/1M Context: 131K
text function calling

Mistralai/Mixtral 8x7B Instruct V0.1

Mistralai/Mixtral 8x7B Instruct V0.1 is available via DeepInfra with a 33K context window and up to 32,768 output tokens. Pricing: $0.4000/1M input tokens, $0.4000/1M output tokens.

Input: $0.40/1M Output: $0.40/1M Context: 33K
text function calling

Zai Org/GLM 4.5

Zai Org/GLM 4.5 is available via DeepInfra with a 131K context window and up to 131,072 output tokens. Pricing: $0.4000/1M input tokens, $1.60/1M output tokens.

Input: $0.40/1M Output: $1.60/1M Context: 131K
text function calling

Microsoft/WizardLM 2 8x22B

Microsoft/WizardLM 2 8x22B is available via DeepInfra with a 66K context window and up to 65,536 output tokens. Pricing: $0.4800/1M input tokens, $0.4800/1M output tokens.

Input: $0.48/1M Output: $0.48/1M Context: 66K
text

Deepseek Ai/DeepSeek R1 0528

Deepseek Ai/DeepSeek R1 0528 is available via DeepInfra with a 164K context window and up to 163,840 output tokens. Pricing: $0.5000/1M input tokens, $2.15/1M output tokens.

Input: $0.50/1M Output: $2.15/1M Context: 164K
text function calling

Moonshotai/Kimi K2 Instruct

Moonshotai/Kimi K2 Instruct is available via DeepInfra with a 131K context window and up to 131,072 output tokens. Pricing: $0.5000/1M input tokens, $2.00/1M output tokens.

Input: $0.50/1M Output: $2.00/1M Context: 131K
text function calling

Moonshotai/Kimi K2 Instruct 0905

Moonshotai/Kimi K2 Instruct 0905 is available via DeepInfra with a 262K context window and up to 262,144 output tokens. Pricing: $0.5000/1M input tokens, $2.00/1M output tokens.

Input: $0.50/1M Output: $2.00/1M Context: 262K
text function calling

Nvidia/Llama 3.1 Nemotron 70B Instruct

Nvidia/Llama 3.1 Nemotron 70B Instruct is available via DeepInfra with a 131K context window and up to 131,072 output tokens. Pricing: $0.6000/1M input tokens, $0.6000/1M output tokens.

Input: $0.60/1M Output: $0.60/1M Context: 131K
text function calling

Sao10K/L3.1 70B Euryale V2.2

Sao10K/L3.1 70B Euryale V2.2 is available via DeepInfra with a 131K context window and up to 131,072 output tokens. Pricing: $0.6500/1M input tokens, $0.7500/1M output tokens.

Input: $0.65/1M Output: $0.75/1M Context: 131K
text

Sao10K/L3.3 70B Euryale V2.3

Sao10K/L3.3 70B Euryale V2.3 is available via DeepInfra with a 131K context window and up to 131,072 output tokens. Pricing: $0.6500/1M input tokens, $0.7500/1M output tokens.

Input: $0.65/1M Output: $0.75/1M Context: 131K
text

Deepseek Ai/DeepSeek R1

Deepseek Ai/DeepSeek R1 is available via DeepInfra with a 164K context window and up to 163,840 output tokens. Pricing: $0.7000/1M input tokens, $2.40/1M output tokens.

Input: $0.70/1M Output: $2.40/1M Context: 164K
text function calling

NousResearch/Hermes 3 Llama 3.1 405B

NousResearch/Hermes 3 Llama 3.1 405B is available via DeepInfra with a 131K context window and up to 131,072 output tokens. Pricing: $1.00/1M input tokens, $1.00/1M output tokens.

Input: $1.00/1M Output: $1.00/1M Context: 131K
text function calling

Deepseek Ai/DeepSeek R1 0528 Turbo

Deepseek Ai/DeepSeek R1 0528 Turbo is available via DeepInfra with a 33K context window and up to 32,768 output tokens. Pricing: $1.00/1M input tokens, $3.00/1M output tokens.

Input: $1.00/1M Output: $3.00/1M Context: 33K
text function calling

Deepseek Ai/DeepSeek R1 Turbo

Deepseek Ai/DeepSeek R1 Turbo is available via DeepInfra with a 41K context window and up to 40,960 output tokens. Pricing: $1.00/1M input tokens, $3.00/1M output tokens.

Input: $1.00/1M Output: $3.00/1M Context: 41K
text function calling

Google/Gemini 2.5 Pro

Google/Gemini 2.5 Pro is available via DeepInfra with a 1M context window and up to 1,000,000 output tokens. Pricing: $1.25/1M input tokens, $10.00/1M output tokens.

Input: $1.25/1M Output: $10.00/1M Context: 1M
text function calling

Anthropic/Claude 3 7 Sonnet Latest

Anthropic/Claude 3 7 Sonnet Latest is available via DeepInfra with a 200K context window and up to 200,000 output tokens. Pricing: $3.30/1M input tokens, $16.50/1M output tokens.

Input: $3.30/1M Output: $16.50/1M Context: 200K
text function calling

Anthropic/Claude 4 Sonnet

Anthropic/Claude 4 Sonnet is available via DeepInfra with a 200K context window and up to 200,000 output tokens. Pricing: $3.30/1M input tokens, $16.50/1M output tokens.

Input: $3.30/1M Output: $16.50/1M Context: 200K
text function calling

Anthropic/Claude 4 Opus

Anthropic/Claude 4 Opus is available via DeepInfra with a 200K context window and up to 200,000 output tokens. Pricing: $16.50/1M input tokens, $82.50/1M output tokens.

Input: $16.50/1M Output: $82.50/1M Context: 200K
text function calling

Compare DeepInfra model pricing

Use our pricing calculator to find the cheapest DeepInfra model for your workload.

Pricing Calculator Compare Models All Models Directory

Related Reading

OpenAI vs Anthropic vs Google: Which AI API Should You Choose? → Cheapest LLM API in 2026: Complete Pricing Comparison → OpenAI API Pricing Guide 2026 →