
Nebius Models

Nebius provides 27 AI models accessible via API.


Models available: 27

Cheapest input price: $0.010 per 1M tokens

Largest context window: 262K

What is Nebius?

Nebius is an AI model provider offering 27 large language models to developers through an API. Pricing starts at $0.010 per 1M input tokens, and the largest context window is 262K tokens.


All Nebius Models

Model | Input $/1M | Output $/1M | Context | Max Output
Qwen/Qwen2.5 Coder 7B | $0.010 | $0.030 | 33K | 32,768
Meta Llama/Llama Guard 3 8B | $0.020 | $0.060 | 128K | 128,000
Meta Llama/Meta Llama 3.1 8B Instruct | $0.020 | $0.060 | 128K | 128,000
Qwen/Qwen2 VL 7B Instruct | $0.020 | $0.060 | 131K | 131,072
Mistralai/Mistral Nemo Instruct 2407 | $0.040 | $0.12 | 128K | 128,000
Google/Gemma 3 27b It | $0.060 | $0.20 | 128K | 128,000
Qwen/Qwen2.5 32B Instruct | $0.060 | $0.20 | 128K | 128,000
Qwen/Qwen3 14B | $0.080 | $0.24 | 33K | 32,768
Qwen/Qwen3 4B | $0.080 | $0.24 | 33K | 32,768
Nvidia/Llama 3.3 Nemotron Super 49B | $0.10 | $0.40 | 131K | 131,072
Qwen/Qwen3 32B | $0.10 | $0.30 | 33K | 32,768
Qwen/Qwen3 30B A3B | $0.10 | $0.30 | 33K | 32,768
Meta Llama/Llama 3.3 70B Instruct | $0.13 | $0.40 | 128K | 128,000
Meta Llama/Meta Llama 3.1 70B Instruct | $0.13 | $0.40 | 128K | 128,000
Qwen/Qwen2.5 72B Instruct | $0.13 | $0.40 | 128K | 128,000
Qwen/Qwen2.5 VL 72B Instruct | $0.13 | $0.40 | 131K | 131,072
Qwen/Qwen2 VL 72B Instruct | $0.13 | $0.40 | 131K | 131,072
Qwen/QwQ 32B | $0.15 | $0.45 | 33K | 32,768
Qwen/Qwen3 235B A22B | $0.20 | $0.60 | 262K | 262,144
Deepseek Ai/DeepSeek R1 Distill Llama 70B | $0.25 | $0.75 | 128K | 128,000
Deepseek Ai/DeepSeek V3 | $0.50 | $1.50 | 128K | 128,000
Deepseek Ai/DeepSeek V3 0324 | $0.50 | $1.50 | 128K | 128,000
Nvidia/Llama 3.1 Nemotron Ultra 253B | $0.60 | $1.80 | 128K | 128,000
Deepseek Ai/DeepSeek R1 | $0.80 | $2.40 | 128K | 128,000
Deepseek Ai/DeepSeek R1 0528 | $0.80 | $2.40 | 164K | 164,000
Meta Llama/Meta Llama 3.1 405B Instruct | $1.00 | $3.00 | 128K | 128,000
NousResearch/Hermes 3 Llama 3.1 405B | $1.00 | $3.00 | 128K | 128,000
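
Because prices are quoted per 1M tokens, the cost of a workload is (input tokens × input price + output tokens × output price) / 1,000,000. The following is a minimal sketch of that arithmetic, not an official tool: the function names `estimate_cost` and `cheapest` are hypothetical, and the `PRICES` dictionary holds only three rows copied from the table above.

```python
# USD per 1M tokens as (input, output), copied from the table above.
# Only a subset of models is included for illustration.
PRICES = {
    "Qwen/Qwen2.5 Coder 7B": (0.010, 0.030),
    "Meta Llama/Llama 3.3 70B Instruct": (0.13, 0.40),
    "Deepseek Ai/DeepSeek R1": (0.80, 2.40),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for one workload on one model."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


def cheapest(input_tokens: int, output_tokens: int) -> str:
    """Return the name of the lowest-cost model in PRICES for this workload."""
    return min(PRICES, key=lambda m: estimate_cost(m, input_tokens, output_tokens))


if __name__ == "__main__":
    # Example workload: 5M input tokens and 1M output tokens.
    for name in PRICES:
        print(f"{name}: ${estimate_cost(name, 5_000_000, 1_000_000):.2f}")
```

Price alone does not settle the choice: a cheaper model may lack the context window or capabilities a workload needs, so a comparison like this is only a first filter.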

Model Details

Qwen/Qwen2.5 Coder 7B

Qwen/Qwen2.5 Coder 7B is available via Nebius with a 33K context window and up to 32,768 output tokens. Pricing: $0.0100/1M input tokens, $0.0300/1M output tokens.

Input: $0.010/1M Output: $0.030/1M Context: 33K
Capabilities: text, function calling

Meta Llama/Llama Guard 3 8B

Meta Llama/Llama Guard 3 8B is available via Nebius with a 128K context window and up to 128,000 output tokens. Pricing: $0.0200/1M input tokens, $0.0600/1M output tokens.

Input: $0.020/1M Output: $0.060/1M Context: 128K
Capabilities: text

Meta Llama/Meta Llama 3.1 8B Instruct

Meta Llama/Meta Llama 3.1 8B Instruct is available via Nebius with a 128K context window and up to 128,000 output tokens. Pricing: $0.0200/1M input tokens, $0.0600/1M output tokens.

Input: $0.020/1M Output: $0.060/1M Context: 128K
Capabilities: text, function calling

Qwen/Qwen2 VL 7B Instruct

Qwen/Qwen2 VL 7B Instruct is available via Nebius with a 131K context window and up to 131,072 output tokens. Pricing: $0.0200/1M input tokens, $0.0600/1M output tokens.

Input: $0.020/1M Output: $0.060/1M Context: 131K
Capabilities: text, vision

Mistralai/Mistral Nemo Instruct 2407

Mistralai/Mistral Nemo Instruct 2407 is available via Nebius with a 128K context window and up to 128,000 output tokens. Pricing: $0.0400/1M input tokens, $0.1200/1M output tokens.

Input: $0.040/1M Output: $0.12/1M Context: 128K
Capabilities: text, function calling

Google/Gemma 3 27b It

Google/Gemma 3 27b It is available via Nebius with a 128K context window and up to 128,000 output tokens. Pricing: $0.0600/1M input tokens, $0.2000/1M output tokens.

Input: $0.060/1M Output: $0.20/1M Context: 128K
Capabilities: text, vision, function calling

Qwen/Qwen2.5 32B Instruct

Qwen/Qwen2.5 32B Instruct is available via Nebius with a 128K context window and up to 128,000 output tokens. Pricing: $0.0600/1M input tokens, $0.2000/1M output tokens.

Input: $0.060/1M Output: $0.20/1M Context: 128K
Capabilities: text, function calling

Qwen/Qwen3 14B

Qwen/Qwen3 14B is available via Nebius with a 33K context window and up to 32,768 output tokens. Pricing: $0.0800/1M input tokens, $0.2400/1M output tokens.

Input: $0.080/1M Output: $0.24/1M Context: 33K
Capabilities: text, function calling

Qwen/Qwen3 4B

Qwen/Qwen3 4B is available via Nebius with a 33K context window and up to 32,768 output tokens. Pricing: $0.0800/1M input tokens, $0.2400/1M output tokens.

Input: $0.080/1M Output: $0.24/1M Context: 33K
Capabilities: text, function calling

Nvidia/Llama 3.3 Nemotron Super 49B

Nvidia/Llama 3.3 Nemotron Super 49B is available via Nebius with a 131K context window and up to 131,072 output tokens. Pricing: $0.1000/1M input tokens, $0.4000/1M output tokens.

Input: $0.10/1M Output: $0.40/1M Context: 131K
Capabilities: text, function calling

Qwen/Qwen3 32B

Qwen/Qwen3 32B is available via Nebius with a 33K context window and up to 32,768 output tokens. Pricing: $0.1000/1M input tokens, $0.3000/1M output tokens.

Input: $0.10/1M Output: $0.30/1M Context: 33K
Capabilities: text, function calling

Qwen/Qwen3 30B A3B

Qwen/Qwen3 30B A3B is available via Nebius with a 33K context window and up to 32,768 output tokens. Pricing: $0.1000/1M input tokens, $0.3000/1M output tokens.

Input: $0.10/1M Output: $0.30/1M Context: 33K
Capabilities: text, function calling

Meta Llama/Llama 3.3 70B Instruct

Meta Llama/Llama 3.3 70B Instruct is available via Nebius with a 128K context window and up to 128,000 output tokens. Pricing: $0.1300/1M input tokens, $0.4000/1M output tokens.

Input: $0.13/1M Output: $0.40/1M Context: 128K
Capabilities: text, function calling

Meta Llama/Meta Llama 3.1 70B Instruct

Meta Llama/Meta Llama 3.1 70B Instruct is available via Nebius with a 128K context window and up to 128,000 output tokens. Pricing: $0.1300/1M input tokens, $0.4000/1M output tokens.

Input: $0.13/1M Output: $0.40/1M Context: 128K
Capabilities: text, function calling

Qwen/Qwen2.5 72B Instruct

Qwen/Qwen2.5 72B Instruct is available via Nebius with a 128K context window and up to 128,000 output tokens. Pricing: $0.1300/1M input tokens, $0.4000/1M output tokens.

Input: $0.13/1M Output: $0.40/1M Context: 128K
Capabilities: text, function calling

Qwen/Qwen2.5 VL 72B Instruct

Qwen/Qwen2.5 VL 72B Instruct is available via Nebius with a 131K context window and up to 131,072 output tokens. Pricing: $0.1300/1M input tokens, $0.4000/1M output tokens.

Input: $0.13/1M Output: $0.40/1M Context: 131K
Capabilities: text, vision, function calling

Qwen/Qwen2 VL 72B Instruct

Qwen/Qwen2 VL 72B Instruct is available via Nebius with a 131K context window and up to 131,072 output tokens. Pricing: $0.1300/1M input tokens, $0.4000/1M output tokens.

Input: $0.13/1M Output: $0.40/1M Context: 131K
Capabilities: text, vision, function calling

Qwen/QwQ 32B

Qwen/QwQ 32B is available via Nebius with a 33K context window and up to 32,768 output tokens. Pricing: $0.1500/1M input tokens, $0.4500/1M output tokens.

Input: $0.15/1M Output: $0.45/1M Context: 33K
Capabilities: text, function calling, reasoning

Qwen/Qwen3 235B A22B

Qwen/Qwen3 235B A22B is available via Nebius with a 262K context window and up to 262,144 output tokens. Pricing: $0.2000/1M input tokens, $0.6000/1M output tokens.

Input: $0.20/1M Output: $0.60/1M Context: 262K
Capabilities: text, function calling

Deepseek Ai/DeepSeek R1 Distill Llama 70B

Deepseek Ai/DeepSeek R1 Distill Llama 70B is available via Nebius with a 128K context window and up to 128,000 output tokens. Pricing: $0.2500/1M input tokens, $0.7500/1M output tokens.

Input: $0.25/1M Output: $0.75/1M Context: 128K
Capabilities: text, function calling

Deepseek Ai/DeepSeek V3

Deepseek Ai/DeepSeek V3 is available via Nebius with a 128K context window and up to 128,000 output tokens. Pricing: $0.5000/1M input tokens, $1.50/1M output tokens.

Input: $0.50/1M Output: $1.50/1M Context: 128K
Capabilities: text, function calling

Deepseek Ai/DeepSeek V3 0324

Deepseek Ai/DeepSeek V3 0324 is available via Nebius with a 128K context window and up to 128,000 output tokens. Pricing: $0.5000/1M input tokens, $1.50/1M output tokens.

Input: $0.50/1M Output: $1.50/1M Context: 128K
Capabilities: text, function calling

Nvidia/Llama 3.1 Nemotron Ultra 253B

Nvidia/Llama 3.1 Nemotron Ultra 253B is available via Nebius with a 128K context window and up to 128,000 output tokens. Pricing: $0.6000/1M input tokens, $1.80/1M output tokens.

Input: $0.60/1M Output: $1.80/1M Context: 128K
Capabilities: text, function calling

Deepseek Ai/DeepSeek R1

Deepseek Ai/DeepSeek R1 is available via Nebius with a 128K context window and up to 128,000 output tokens. Pricing: $0.8000/1M input tokens, $2.40/1M output tokens.

Input: $0.80/1M Output: $2.40/1M Context: 128K
Capabilities: text, function calling, reasoning

Deepseek Ai/DeepSeek R1 0528

Deepseek Ai/DeepSeek R1 0528 is available via Nebius with a 164K context window and up to 164,000 output tokens. Pricing: $0.8000/1M input tokens, $2.40/1M output tokens.

Input: $0.80/1M Output: $2.40/1M Context: 164K
Capabilities: text, function calling, reasoning

Meta Llama/Meta Llama 3.1 405B Instruct

Meta Llama/Meta Llama 3.1 405B Instruct is available via Nebius with a 128K context window and up to 128,000 output tokens. Pricing: $1.00/1M input tokens, $3.00/1M output tokens.

Input: $1.00/1M Output: $3.00/1M Context: 128K
Capabilities: text, function calling

NousResearch/Hermes 3 Llama 3.1 405B

NousResearch/Hermes 3 Llama 3.1 405B is available via Nebius with a 128K context window and up to 128,000 output tokens. Pricing: $1.00/1M input tokens, $3.00/1M output tokens.

Input: $1.00/1M Output: $3.00/1M Context: 128K
Capabilities: text, function calling
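
The capability tags and context sizes listed under each model can be used to shortlist candidates before comparing prices. The sketch below is illustrative only: the `Model` class, the `find_models` helper, and the `context_k` field (context window in thousands of tokens, as listed) are hypothetical names, and the data covers just four of the models described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    context_k: int           # context window in thousands of tokens, as listed
    capabilities: frozenset  # capability tags, as listed under each model


# A small subset of the models above, with values copied from their entries.
MODELS = [
    Model("Qwen/Qwen2 VL 7B Instruct", 131, frozenset({"text", "vision"})),
    Model("Google/Gemma 3 27b It", 128,
          frozenset({"text", "vision", "function calling"})),
    Model("Qwen/Qwen2.5 VL 72B Instruct", 131,
          frozenset({"text", "vision", "function calling"})),
    Model("Qwen/QwQ 32B", 33,
          frozenset({"text", "function calling", "reasoning"})),
]


def find_models(required, min_context_k=0):
    """Return names of models that have every required tag and enough context."""
    required = set(required)
    return [
        m.name
        for m in MODELS
        if required <= m.capabilities and m.context_k >= min_context_k
    ]


if __name__ == "__main__":
    # Models that accept images and also support function calling.
    print(find_models({"vision", "function calling"}))
```

Results come back in the order the models appear in `MODELS`, so keeping that list sorted by price turns the first match into the cheapest one.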
