Nebius Models

Nebius provides 27 AI models accessible via API.


Models Available: 27

Cheapest Input / 1M: $0.010

Largest Context: 262K

What is Nebius?

Nebius is an AI model provider offering 27 large language models to developers via API. Its cheapest model starts at $0.010 per 1M input tokens, and its largest context window reaches 262K tokens.
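Because prices are quoted per 1M tokens, the cost of a workload is each token count divided by one million and multiplied by the matching rate. The following is a minimal sketch of that arithmetic, not an official calculator: the function name `estimate_cost` and the example workload are hypothetical, and the rates are the ones listed on this page for Qwen/Qwen2.5 Coder 7B.

```python
from decimal import Decimal

TOKENS_PER_UNIT = Decimal(1_000_000)  # prices on this page are quoted per 1M tokens


def estimate_cost(input_tokens: int, output_tokens: int,
                  input_price: str, output_price: str) -> Decimal:
    """Return the USD cost of a workload given per-1M-token prices.

    Prices are passed as strings so Decimal keeps them exact.
    """
    cost_in = Decimal(input_tokens) / TOKENS_PER_UNIT * Decimal(input_price)
    cost_out = Decimal(output_tokens) / TOKENS_PER_UNIT * Decimal(output_price)
    return cost_in + cost_out


# Hypothetical workload: 2M input tokens and 500K output tokens
# on Qwen/Qwen2.5 Coder 7B ($0.010 input, $0.030 output per 1M tokens).
cost = estimate_cost(2_000_000, 500_000, "0.010", "0.030")
print(f"Estimated cost: ${cost}")
```

For this workload the input side contributes $0.020 and the output side $0.015, about $0.035 in total. `Decimal` is used instead of floats so that sub-cent prices do not pick up rounding noise.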


All Nebius Models

Model	Input $/1M	Output $/1M	Context	Max Output	Released
Qwen/Qwen2.5 Coder 7B	$0.010	$0.030	33K	32,768	—
Meta Llama/Llama Guard 3 8B	$0.020	$0.060	128K	128,000	—
Meta Llama/Meta Llama 3.1 8B Instruct	$0.020	$0.060	128K	128,000	—
Qwen/Qwen2 VL 7B Instruct	$0.020	$0.060	131K	131,072	—
Mistralai/Mistral Nemo Instruct 2407	$0.040	$0.12	128K	128,000	—
Google/Gemma 3 27b It	$0.060	$0.20	128K	128,000	—
Qwen/Qwen2.5 32B Instruct	$0.060	$0.20	128K	128,000	—
Qwen/Qwen3 14B	$0.080	$0.24	33K	32,768	—
Qwen/Qwen3 4B	$0.080	$0.24	33K	32,768	—
Nvidia/Llama 3.3 Nemotron Super 49B	$0.10	$0.40	131K	131,072	—
Qwen/Qwen3 32B	$0.10	$0.30	33K	32,768	—
Qwen/Qwen3 30B A3B	$0.10	$0.30	33K	32,768	—
Meta Llama/Llama 3.3 70B Instruct	$0.13	$0.40	128K	128,000	—
Meta Llama/Meta Llama 3.1 70B Instruct	$0.13	$0.40	128K	128,000	—
Qwen/Qwen2.5 72B Instruct	$0.13	$0.40	128K	128,000	—
Qwen/Qwen2.5 VL 72B Instruct	$0.13	$0.40	131K	131,072	—
Qwen/Qwen2 VL 72B Instruct	$0.13	$0.40	131K	131,072	—
Qwen/QwQ 32B	$0.15	$0.45	33K	32,768	—
Qwen/Qwen3 235B A22B	$0.20	$0.60	262K	262,144	—
Deepseek Ai/DeepSeek R1 Distill Llama 70B	$0.25	$0.75	128K	128,000	—
Deepseek Ai/DeepSeek V3	$0.50	$1.50	128K	128,000	—
Deepseek Ai/DeepSeek V3 0324	$0.50	$1.50	128K	128,000	—
Nvidia/Llama 3.1 Nemotron Ultra 253B	$0.60	$1.80	128K	128,000	—
Deepseek Ai/DeepSeek R1	$0.80	$2.40	128K	128,000	—
Deepseek Ai/DeepSeek R1 0528	$0.80	$2.40	164K	164,000	—
Meta Llama/Meta Llama 3.1 405B Instruct	$1.00	$3.00	128K	128,000	—
NousResearch/Hermes 3 Llama 3.1 405B	$1.00	$3.00	128K	128,000	—
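One way to use the table above is to filter by the context window you need and then take the lowest-priced model that remains. The sketch below assumes the rows are available as tab-separated text in the same column order as the table. Only four rows are copied here for brevity, and the names `Model`, `parse_rows`, and `cheapest_with_context` are hypothetical helpers, not part of any Nebius tooling.

```python
from decimal import Decimal
from typing import NamedTuple, Optional


class Model(NamedTuple):
    name: str
    input_price: Decimal   # USD per 1M input tokens
    output_price: Decimal  # USD per 1M output tokens
    context_k: int         # context window label, in thousands of tokens


# A few rows copied from the table (Max Output kept, Released column omitted).
ROWS = """\
Qwen/Qwen2.5 Coder 7B\t$0.010\t$0.030\t33K\t32,768
Meta Llama/Meta Llama 3.1 8B Instruct\t$0.020\t$0.060\t128K\t128,000
Qwen/Qwen2 VL 7B Instruct\t$0.020\t$0.060\t131K\t131,072
Qwen/Qwen3 235B A22B\t$0.20\t$0.60\t262K\t262,144
"""


def parse_rows(text: str) -> list[Model]:
    """Parse tab-separated table rows into Model records."""
    models = []
    for line in text.splitlines():
        if not line.strip():
            continue
        name, inp, out, ctx = line.split("\t")[:4]
        models.append(Model(name,
                            Decimal(inp.lstrip("$")),
                            Decimal(out.lstrip("$")),
                            int(ctx.rstrip("K"))))
    return models


def cheapest_with_context(models: list[Model],
                          min_context_k: int) -> Optional[Model]:
    """Return the lowest-priced model whose context label is at least min_context_k."""
    candidates = [m for m in models if m.context_k >= min_context_k]
    # Compare input price first, then output price; None if nothing qualifies.
    return min(candidates,
               key=lambda m: (m.input_price, m.output_price),
               default=None)


models = parse_rows(ROWS)
pick = cheapest_with_context(models, min_context_k=130)
print(pick.name if pick else "no model meets the context requirement")
```

Ranking by input price first suits prompt-heavy workloads. For generation-heavy workloads, the sort key could weight the output price more heavily instead.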

Model Details

Qwen/Qwen2.5 Coder 7B

Qwen/Qwen2.5 Coder 7B is available via Nebius with a 33K context window and up to 32,768 output tokens. Pricing: $0.010 per 1M input tokens, $0.030 per 1M output tokens.

Capabilities: text, function calling

Meta Llama/Llama Guard 3 8B

Meta Llama/Llama Guard 3 8B is available via Nebius with a 128K context window and up to 128,000 output tokens. Pricing: $0.020 per 1M input tokens, $0.060 per 1M output tokens.

Meta Llama/Meta Llama 3.1 8B Instruct

Meta Llama/Meta Llama 3.1 8B Instruct is available via Nebius with a 128K context window and up to 128,000 output tokens. Pricing: $0.020 per 1M input tokens, $0.060 per 1M output tokens.

Capabilities: text, function calling

Qwen/Qwen2 VL 7B Instruct

Qwen/Qwen2 VL 7B Instruct is available via Nebius with a 131K context window and up to 131,072 output tokens. Pricing: $0.020 per 1M input tokens, $0.060 per 1M output tokens.

Mistralai/Mistral Nemo Instruct 2407

Mistralai/Mistral Nemo Instruct 2407 is available via Nebius with a 128K context window and up to 128,000 output tokens. Pricing: $0.040 per 1M input tokens, $0.12 per 1M output tokens.

Capabilities: text, function calling

Google/Gemma 3 27b It

Google/Gemma 3 27b It is available via Nebius with a 128K context window and up to 128,000 output tokens. Pricing: $0.060 per 1M input tokens, $0.20 per 1M output tokens.

Capabilities: text, vision, function calling

Qwen/Qwen2.5 32B Instruct

Qwen/Qwen2.5 32B Instruct is available via Nebius with a 128K context window and up to 128,000 output tokens. Pricing: $0.060 per 1M input tokens, $0.20 per 1M output tokens.

Capabilities: text, function calling

Qwen/Qwen3 14B

Qwen/Qwen3 14B is available via Nebius with a 33K context window and up to 32,768 output tokens. Pricing: $0.080 per 1M input tokens, $0.24 per 1M output tokens.

Capabilities: text, function calling

Qwen/Qwen3 4B

Qwen/Qwen3 4B is available via Nebius with a 33K context window and up to 32,768 output tokens. Pricing: $0.080 per 1M input tokens, $0.24 per 1M output tokens.

Capabilities: text, function calling

Nvidia/Llama 3.3 Nemotron Super 49B

Nvidia/Llama 3.3 Nemotron Super 49B is available via Nebius with a 131K context window and up to 131,072 output tokens. Pricing: $0.10 per 1M input tokens, $0.40 per 1M output tokens.

Capabilities: text, function calling

Qwen/Qwen3 32B

Qwen/Qwen3 32B is available via Nebius with a 33K context window and up to 32,768 output tokens. Pricing: $0.10 per 1M input tokens, $0.30 per 1M output tokens.

Capabilities: text, function calling

Qwen/Qwen3 30B A3B

Qwen/Qwen3 30B A3B is available via Nebius with a 33K context window and up to 32,768 output tokens. Pricing: $0.10 per 1M input tokens, $0.30 per 1M output tokens.

Capabilities: text, function calling

Meta Llama/Llama 3.3 70B Instruct

Meta Llama/Llama 3.3 70B Instruct is available via Nebius with a 128K context window and up to 128,000 output tokens. Pricing: $0.13 per 1M input tokens, $0.40 per 1M output tokens.

Capabilities: text, function calling

Meta Llama/Meta Llama 3.1 70B Instruct

Meta Llama/Meta Llama 3.1 70B Instruct is available via Nebius with a 128K context window and up to 128,000 output tokens. Pricing: $0.13 per 1M input tokens, $0.40 per 1M output tokens.

Capabilities: text, function calling

Qwen/Qwen2.5 72B Instruct

Qwen/Qwen2.5 72B Instruct is available via Nebius with a 128K context window and up to 128,000 output tokens. Pricing: $0.13 per 1M input tokens, $0.40 per 1M output tokens.

Capabilities: text, function calling

Qwen/Qwen2.5 VL 72B Instruct

Qwen/Qwen2.5 VL 72B Instruct is available via Nebius with a 131K context window and up to 131,072 output tokens. Pricing: $0.13 per 1M input tokens, $0.40 per 1M output tokens.

Capabilities: text, vision, function calling

Qwen/Qwen2 VL 72B Instruct

Qwen/Qwen2 VL 72B Instruct is available via Nebius with a 131K context window and up to 131,072 output tokens. Pricing: $0.13 per 1M input tokens, $0.40 per 1M output tokens.

Capabilities: text, vision, function calling

Qwen/QwQ 32B

Qwen/QwQ 32B is available via Nebius with a 33K context window and up to 32,768 output tokens. Pricing: $0.15 per 1M input tokens, $0.45 per 1M output tokens.

Capabilities: text, function calling, reasoning

Qwen/Qwen3 235B A22B

Qwen/Qwen3 235B A22B is available via Nebius with a 262K context window and up to 262,144 output tokens. Pricing: $0.20 per 1M input tokens, $0.60 per 1M output tokens.

Capabilities: text, function calling

Deepseek Ai/DeepSeek R1 Distill Llama 70B

Deepseek Ai/DeepSeek R1 Distill Llama 70B is available via Nebius with a 128K context window and up to 128,000 output tokens. Pricing: $0.25 per 1M input tokens, $0.75 per 1M output tokens.

Capabilities: text, function calling

Deepseek Ai/DeepSeek V3

Deepseek Ai/DeepSeek V3 is available via Nebius with a 128K context window and up to 128,000 output tokens. Pricing: $0.50 per 1M input tokens, $1.50 per 1M output tokens.

Capabilities: text, function calling

Deepseek Ai/DeepSeek V3 0324

Deepseek Ai/DeepSeek V3 0324 is available via Nebius with a 128K context window and up to 128,000 output tokens. Pricing: $0.50 per 1M input tokens, $1.50 per 1M output tokens.

Capabilities: text, function calling

Nvidia/Llama 3.1 Nemotron Ultra 253B

Nvidia/Llama 3.1 Nemotron Ultra 253B is available via Nebius with a 128K context window and up to 128,000 output tokens. Pricing: $0.60 per 1M input tokens, $1.80 per 1M output tokens.

Capabilities: text, function calling

Deepseek Ai/DeepSeek R1

Deepseek Ai/DeepSeek R1 is available via Nebius with a 128K context window and up to 128,000 output tokens. Pricing: $0.80 per 1M input tokens, $2.40 per 1M output tokens.

Capabilities: text, function calling, reasoning

Deepseek Ai/DeepSeek R1 0528

Deepseek Ai/DeepSeek R1 0528 is available via Nebius with a 164K context window and up to 164,000 output tokens. Pricing: $0.80 per 1M input tokens, $2.40 per 1M output tokens.

Capabilities: text, function calling, reasoning

Meta Llama/Meta Llama 3.1 405B Instruct

Meta Llama/Meta Llama 3.1 405B Instruct is available via Nebius with a 128K context window and up to 128,000 output tokens. Pricing: $1.00 per 1M input tokens, $3.00 per 1M output tokens.

Capabilities: text, function calling

NousResearch/Hermes 3 Llama 3.1 405B

NousResearch/Hermes 3 Llama 3.1 405B is available via Nebius with a 128K context window and up to 128,000 output tokens. Pricing: $1.00 per 1M input tokens, $3.00 per 1M output tokens.

Capabilities: text, function calling

