
RAG Pipeline Cost Calculator

Calculate the full cost of a RAG pipeline: document embedding, vector database storage, retrieval, and LLM generation. See cost breakdowns per stage and per query.

Free · No Signup · No Server Uploads · Zero Tracking

Documents → Chunking → Embedding → Vector DB → Retrieval → Generation

Total Chunks: 6,000
Total Tokens: 3,000,000
Monthly Queries: 3,000
Monthly Cost: $73.10
Cost / Query: $0.0244

Monthly Cost Breakdown

Stage              | Monthly Cost | % of Total
Document Embedding | $0.0600      | 0.1%
Vector DB          | $70.00       | 95.8%
Query Embedding    | $0.0030      | 0.0%
Generation (LLM)   | $3.04        | 4.2%
Total              | $73.10       | 100%

Embedding cost assumes re-indexing 1,000 documents monthly. Vector DB cost is a fixed monthly fee. Generation cost is based on 100 queries/day with 5 retrieved chunks of 500 tokens each. Actual costs may vary based on provider billing and volume discounts.
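The model behind this breakdown can be sketched in a few lines. All rates and token counts marked "assumed" below are illustrative (an embedding rate of $0.02 per million tokens, LLM input/output rates of $0.15/$0.60 per million, and 50-token queries with 300-token answers); they are not the calculator's actual pricing, so the total comes out slightly different from the $73.10 shown above.

```python
# Sketch of the per-stage cost model. Values marked "assumed" are
# illustrative, not the calculator's actual rates.
EMBED_PRICE_PER_M = 0.02   # $ per 1M embedding tokens (assumed)
INPUT_PRICE_PER_M = 0.15   # $ per 1M LLM input tokens (assumed)
OUTPUT_PRICE_PER_M = 0.60  # $ per 1M LLM output tokens (assumed)
VECTOR_DB_MONTHLY = 70.00  # fixed managed-tier fee (from the table above)

total_chunks = 6_000       # from the calculator
chunk_tokens = 500
queries_per_month = 3_000  # 100 queries/day
top_k = 5
query_tokens = 50          # average query length (assumed)
answer_tokens = 300        # average answer length (assumed)

doc_embedding = total_chunks * chunk_tokens / 1e6 * EMBED_PRICE_PER_M
query_embedding = queries_per_month * query_tokens / 1e6 * EMBED_PRICE_PER_M
generation = queries_per_month * (
    (top_k * chunk_tokens + query_tokens) / 1e6 * INPUT_PRICE_PER_M
    + answer_tokens / 1e6 * OUTPUT_PRICE_PER_M
)
monthly_total = doc_embedding + VECTOR_DB_MONTHLY + query_embedding + generation
cost_per_query = monthly_total / queries_per_month

print(f"doc embedding   ${doc_embedding:.4f}")
print(f"vector DB       ${VECTOR_DB_MONTHLY:.2f}")
print(f"query embedding ${query_embedding:.6f}")
print(f"generation      ${generation:.2f}")
print(f"total           ${monthly_total:.2f} (${cost_per_query:.4f}/query)")
```

Note how the fixed vector DB fee dominates at this scale: the usage-based stages together stay under $2/month with these assumed rates.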


How to Use RAG Pipeline Cost Calculator

  1. Configure your documents

     Enter the number of documents, average length, chunk size, and overlap to estimate your embedding volume.

  2. Select your models

     Choose an embedding model, vector database, and generation model from the dropdowns.

  3. Set query volume

     Enter how many queries per day your pipeline will handle and the top-K retrieval count.

  4. Review the breakdown

     See per-stage costs, the visual bar chart, total monthly cost, and cost per query.
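Step 1's chunk estimate can be sketched as a sliding window: consecutive chunks share `overlap` tokens, so each new chunk advances by `chunk_size - overlap` tokens. The 3,000-token average document length used below is an assumption for illustration (3M total tokens spread over 1,000 documents).

```python
import math

def chunks_per_document(doc_tokens: int, chunk_size: int, overlap: int) -> int:
    """Chunks produced by a sliding window that advances
    chunk_size - overlap tokens per step."""
    if doc_tokens <= chunk_size:
        return 1  # the whole document fits in one chunk
    stride = chunk_size - overlap
    return math.ceil((doc_tokens - overlap) / stride)

# 1,000 docs of ~3,000 tokens each, 500-token chunks, no overlap:
# 6 chunks per document, 6,000 total.
print(1_000 * chunks_per_document(3_000, 500, 0))
# A 50-token overlap raises this to 7 chunks per document,
# and therefore raises embedding volume and cost proportionally.
print(chunks_per_document(3_000, 500, 50))
```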

Frequently Asked Questions

What are the main costs in a RAG pipeline?

A RAG pipeline has four main cost stages: (1) embedding your documents into vectors, (2) storing vectors in a database, (3) embedding each user query, and (4) generating answers with an LLM using retrieved context. This calculator estimates all four.

How is embedding cost calculated?

Embedding cost = (total_chunks × chunk_size_tokens / 1,000,000) × price_per_million_tokens. We assume re-indexing once per month. Query embedding cost is calculated separately.
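That formula as a one-liner; the $0.02-per-million rate in the example call is an assumed price, not a specific provider's:

```python
def embedding_cost(total_chunks: int, chunk_size_tokens: int,
                   price_per_million: float) -> float:
    """(total_chunks * chunk_size_tokens / 1,000,000) * price_per_million."""
    return total_chunks * chunk_size_tokens / 1e6 * price_per_million

# 6,000 chunks of 500 tokens at an assumed $0.02 / 1M-token rate
# reproduces the $0.06 document-embedding line in the table above.
print(embedding_cost(6_000, 500, 0.02))
```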

How accurate are the vector database costs?

We use approximate monthly costs for managed tiers. Actual costs vary by plan, index size, and query volume. Self-hosted options show $0/month but still carry infrastructure costs.

Which stage costs the most?

The generation model is typically the largest cost driver, since each query sends the retrieved chunks (top-K × chunk_size tokens) plus the query itself to the LLM. Reducing chunk size or top-K lowers this cost.
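The effect of shrinking top-K and chunk size can be sketched as follows; the per-million prices and the 300-token answer length are assumptions, not the calculator's rates:

```python
def generation_cost_per_query(top_k: int, chunk_tokens: int, query_tokens: int,
                              answer_tokens: int,
                              price_in_per_m: float = 0.15,   # assumed
                              price_out_per_m: float = 0.60   # assumed
                              ) -> float:
    """Input = retrieved context (top_k * chunk_tokens) plus the query itself;
    output = the generated answer."""
    input_tokens = top_k * chunk_tokens + query_tokens
    return (input_tokens / 1e6 * price_in_per_m
            + answer_tokens / 1e6 * price_out_per_m)

base = generation_cost_per_query(5, 500, 50, 300)     # top-5, 500-token chunks
smaller = generation_cost_per_query(3, 250, 50, 300)  # top-3, 250-token chunks
print(f"${base:.6f}/query vs ${smaller:.6f}/query")
```

With these assumptions, cutting from top-5 × 500-token chunks to top-3 × 250-token chunks shrinks the retrieved context from 2,500 to 750 input tokens per query, so the input side of the bill drops proportionally while the fixed answer cost stays the same.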