Models Available: 27
Cheapest Input / 1M: $0.010
Largest Context: 262K
What is Nebius?
Nebius is an AI model provider offering 27 large language models to developers via API. Its cheapest model starts at $0.010 per 1M input tokens, and its largest context window reaches 262K tokens.
All Nebius Models
| Model | Input $/1M | Output $/1M | Context | Max Output | Released |
|---|---|---|---|---|---|
| Qwen/Qwen2.5 Coder 7B | $0.010 | $0.030 | 33K | 32,768 | — |
| Meta Llama/Llama Guard 3 8B | $0.020 | $0.060 | 128K | 128,000 | — |
| Meta Llama/Meta Llama 3.1 8B Instruct | $0.020 | $0.060 | 128K | 128,000 | — |
| Qwen/Qwen2 VL 7B Instruct | $0.020 | $0.060 | 131K | 131,072 | — |
| Mistralai/Mistral Nemo Instruct 2407 | $0.040 | $0.12 | 128K | 128,000 | — |
| Google/Gemma 3 27b It | $0.060 | $0.20 | 128K | 128,000 | — |
| Qwen/Qwen2.5 32B Instruct | $0.060 | $0.20 | 128K | 128,000 | — |
| Qwen/Qwen3 14B | $0.080 | $0.24 | 33K | 32,768 | — |
| Qwen/Qwen3 4B | $0.080 | $0.24 | 33K | 32,768 | — |
| Nvidia/Llama 3.3 Nemotron Super 49B | $0.10 | $0.40 | 131K | 131,072 | — |
| Qwen/Qwen3 32B | $0.10 | $0.30 | 33K | 32,768 | — |
| Qwen/Qwen3 30B A3B | $0.10 | $0.30 | 33K | 32,768 | — |
| Meta Llama/Llama 3.3 70B Instruct | $0.13 | $0.40 | 128K | 128,000 | — |
| Meta Llama/Meta Llama 3.1 70B Instruct | $0.13 | $0.40 | 128K | 128,000 | — |
| Qwen/Qwen2.5 72B Instruct | $0.13 | $0.40 | 128K | 128,000 | — |
| Qwen/Qwen2.5 VL 72B Instruct | $0.13 | $0.40 | 131K | 131,072 | — |
| Qwen/Qwen2 VL 72B Instruct | $0.13 | $0.40 | 131K | 131,072 | — |
| Qwen/QwQ 32B | $0.15 | $0.45 | 33K | 32,768 | — |
| Qwen/Qwen3 235B A22B | $0.20 | $0.60 | 262K | 262,144 | — |
| Deepseek Ai/DeepSeek R1 Distill Llama 70B | $0.25 | $0.75 | 128K | 128,000 | — |
| Deepseek Ai/DeepSeek V3 | $0.50 | $1.50 | 128K | 128,000 | — |
| Deepseek Ai/DeepSeek V3 0324 | $0.50 | $1.50 | 128K | 128,000 | — |
| Nvidia/Llama 3.1 Nemotron Ultra 253B | $0.60 | $1.80 | 128K | 128,000 | — |
| Deepseek Ai/DeepSeek R1 | $0.80 | $2.40 | 128K | 128,000 | — |
| Deepseek Ai/DeepSeek R1 0528 | $0.80 | $2.40 | 164K | 164,000 | — |
| Meta Llama/Meta Llama 3.1 405B Instruct | $1.00 | $3.00 | 128K | 128,000 | — |
| NousResearch/Hermes 3 Llama 3.1 405B | $1.00 | $3.00 | 128K | 128,000 | — |
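The per-1M-token prices in the table convert to a workload cost by simple proportion. Below is a minimal sketch using two rows copied from the table; the helper name `estimate_cost` and the token counts are illustrative assumptions, not part of any Nebius tooling.

```python
# Prices in USD per 1M tokens (input, output), copied from the table above.
PRICES = {
    "Qwen/Qwen2.5 Coder 7B": (0.010, 0.030),
    "Deepseek Ai/DeepSeek R1": (0.80, 2.40),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of a workload (hypothetical helper)."""
    input_price, output_price = PRICES[model]
    # Each price applies per 1,000,000 tokens, so scale the total down.
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


if __name__ == "__main__":
    # Illustrative workload: 2M input tokens and 500K output tokens.
    for name in PRICES:
        print(f"{name}: ${estimate_cost(name, 2_000_000, 500_000):.4f}")
```

For this example workload the two listed models come out at roughly $0.035 and $2.80, which shows how far apart the cheapest and the more expensive rows are for the same traffic.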
Model Details
Qwen/Qwen2.5 Coder 7B
Qwen/Qwen2.5 Coder 7B is available via Nebius with a 33K context window and up to 32,768 output tokens. Pricing: $0.0100/1M input tokens, $0.0300/1M output tokens.
Meta Llama/Llama Guard 3 8B
Meta Llama/Llama Guard 3 8B is available via Nebius with a 128K context window and up to 128,000 output tokens. Pricing: $0.0200/1M input tokens, $0.0600/1M output tokens.
Meta Llama/Meta Llama 3.1 8B Instruct
Meta Llama/Meta Llama 3.1 8B Instruct is available via Nebius with a 128K context window and up to 128,000 output tokens. Pricing: $0.0200/1M input tokens, $0.0600/1M output tokens.
Qwen/Qwen2 VL 7B Instruct
Qwen/Qwen2 VL 7B Instruct is available via Nebius with a 131K context window and up to 131,072 output tokens. Pricing: $0.0200/1M input tokens, $0.0600/1M output tokens.
Mistralai/Mistral Nemo Instruct 2407
Mistralai/Mistral Nemo Instruct 2407 is available via Nebius with a 128K context window and up to 128,000 output tokens. Pricing: $0.0400/1M input tokens, $0.1200/1M output tokens.
Google/Gemma 3 27b It
Google/Gemma 3 27b It is available via Nebius with a 128K context window and up to 128,000 output tokens. Pricing: $0.0600/1M input tokens, $0.2000/1M output tokens.
Qwen/Qwen2.5 32B Instruct
Qwen/Qwen2.5 32B Instruct is available via Nebius with a 128K context window and up to 128,000 output tokens. Pricing: $0.0600/1M input tokens, $0.2000/1M output tokens.
Qwen/Qwen3 14B
Qwen/Qwen3 14B is available via Nebius with a 33K context window and up to 32,768 output tokens. Pricing: $0.0800/1M input tokens, $0.2400/1M output tokens.
Qwen/Qwen3 4B
Qwen/Qwen3 4B is available via Nebius with a 33K context window and up to 32,768 output tokens. Pricing: $0.0800/1M input tokens, $0.2400/1M output tokens.
Nvidia/Llama 3.3 Nemotron Super 49B
Nvidia/Llama 3.3 Nemotron Super 49B is available via Nebius with a 131K context window and up to 131,072 output tokens. Pricing: $0.1000/1M input tokens, $0.4000/1M output tokens.
Qwen/Qwen3 32B
Qwen/Qwen3 32B is available via Nebius with a 33K context window and up to 32,768 output tokens. Pricing: $0.1000/1M input tokens, $0.3000/1M output tokens.
Qwen/Qwen3 30B A3B
Qwen/Qwen3 30B A3B is available via Nebius with a 33K context window and up to 32,768 output tokens. Pricing: $0.1000/1M input tokens, $0.3000/1M output tokens.
Meta Llama/Llama 3.3 70B Instruct
Meta Llama/Llama 3.3 70B Instruct is available via Nebius with a 128K context window and up to 128,000 output tokens. Pricing: $0.1300/1M input tokens, $0.4000/1M output tokens.
Meta Llama/Meta Llama 3.1 70B Instruct
Meta Llama/Meta Llama 3.1 70B Instruct is available via Nebius with a 128K context window and up to 128,000 output tokens. Pricing: $0.1300/1M input tokens, $0.4000/1M output tokens.
Qwen/Qwen2.5 72B Instruct
Qwen/Qwen2.5 72B Instruct is available via Nebius with a 128K context window and up to 128,000 output tokens. Pricing: $0.1300/1M input tokens, $0.4000/1M output tokens.
Qwen/Qwen2.5 VL 72B Instruct
Qwen/Qwen2.5 VL 72B Instruct is available via Nebius with a 131K context window and up to 131,072 output tokens. Pricing: $0.1300/1M input tokens, $0.4000/1M output tokens.
Qwen/Qwen2 VL 72B Instruct
Qwen/Qwen2 VL 72B Instruct is available via Nebius with a 131K context window and up to 131,072 output tokens. Pricing: $0.1300/1M input tokens, $0.4000/1M output tokens.
Qwen/QwQ 32B
Qwen/QwQ 32B is available via Nebius with a 33K context window and up to 32,768 output tokens. Pricing: $0.1500/1M input tokens, $0.4500/1M output tokens.
Qwen/Qwen3 235B A22B
Qwen/Qwen3 235B A22B is available via Nebius with a 262K context window and up to 262,144 output tokens. Pricing: $0.2000/1M input tokens, $0.6000/1M output tokens.
Deepseek Ai/DeepSeek R1 Distill Llama 70B
Deepseek Ai/DeepSeek R1 Distill Llama 70B is available via Nebius with a 128K context window and up to 128,000 output tokens. Pricing: $0.2500/1M input tokens, $0.7500/1M output tokens.
Deepseek Ai/DeepSeek V3
Deepseek Ai/DeepSeek V3 is available via Nebius with a 128K context window and up to 128,000 output tokens. Pricing: $0.5000/1M input tokens, $1.50/1M output tokens.
Deepseek Ai/DeepSeek V3 0324
Deepseek Ai/DeepSeek V3 0324 is available via Nebius with a 128K context window and up to 128,000 output tokens. Pricing: $0.5000/1M input tokens, $1.50/1M output tokens.
Nvidia/Llama 3.1 Nemotron Ultra 253B
Nvidia/Llama 3.1 Nemotron Ultra 253B is available via Nebius with a 128K context window and up to 128,000 output tokens. Pricing: $0.6000/1M input tokens, $1.80/1M output tokens.
Deepseek Ai/DeepSeek R1
Deepseek Ai/DeepSeek R1 is available via Nebius with a 128K context window and up to 128,000 output tokens. Pricing: $0.8000/1M input tokens, $2.40/1M output tokens.
Deepseek Ai/DeepSeek R1 0528
Deepseek Ai/DeepSeek R1 0528 is available via Nebius with a 164K context window and up to 164,000 output tokens. Pricing: $0.8000/1M input tokens, $2.40/1M output tokens.
Meta Llama/Meta Llama 3.1 405B Instruct
Meta Llama/Meta Llama 3.1 405B Instruct is available via Nebius with a 128K context window and up to 128,000 output tokens. Pricing: $1.00/1M input tokens, $3.00/1M output tokens.
NousResearch/Hermes 3 Llama 3.1 405B
NousResearch/Hermes 3 Llama 3.1 405B is available via Nebius with a 128K context window and up to 128,000 output tokens. Pricing: $1.00/1M input tokens, $3.00/1M output tokens.
Compare Nebius model pricing
Use our pricing calculator to find the cheapest Nebius model for your workload.
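Finding the cheapest model for a workload comes down to filtering by the context window you need and ranking the rest by estimated cost. Below is a rough sketch over a handful of rows from the model table; `cheapest_model`, the `Model` tuple, and the workload numbers are hypothetical, and it does not represent how the site's calculator works.

```python
from typing import NamedTuple, Optional


class Model(NamedTuple):
    name: str
    input_price: float   # USD per 1M input tokens
    output_price: float  # USD per 1M output tokens
    context_k: int       # context window in thousands of tokens, as listed


# A subset of rows from the model table (not the full catalog).
MODELS = [
    Model("Qwen/Qwen2.5 Coder 7B", 0.010, 0.030, 33),
    Model("Meta Llama/Meta Llama 3.1 8B Instruct", 0.020, 0.060, 128),
    Model("Qwen/Qwen3 235B A22B", 0.20, 0.60, 262),
    Model("Deepseek Ai/DeepSeek R1 0528", 0.80, 2.40, 164),
]


def cheapest_model(min_context_k: int, input_tokens: int,
                   output_tokens: int) -> Optional[Model]:
    """Return the lowest-cost model whose context window is large enough."""
    candidates = [m for m in MODELS if m.context_k >= min_context_k]
    if not candidates:
        return None  # no listed model has a large enough context window
    return min(
        candidates,
        key=lambda m: input_tokens * m.input_price + output_tokens * m.output_price,
    )


if __name__ == "__main__":
    # Illustrative requirement: at least a 100K context window.
    choice = cheapest_model(100, 1_000_000, 1_000_000)
    print(choice.name if choice else "no model fits")
```

Ranking by total cost rather than input price alone matters for output-heavy workloads, since output tokens are priced about three times higher than input tokens across this table.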