Pedro Bertoluchi

Choosing a vector store for .NET RAG: AI Search vs Cosmos vs Postgres

A technical decision that locks in cost, latency and portability for years, framed by the three Azure-native options worth shortlisting.


There is no universal vector store for .NET RAG on Azure. Three options cover almost every serious workload, and each one wins a different fight. Pick wrong and you either pay too much, ship a weak ranker, or trap the corpus inside a cloud you cannot leave. The decision is structural, not cosmetic, and reversing it later costs a quarter.

Azure AI Search is the default when hybrid search and semantic ranking matter and the team has no appetite to build a ranker. BM25, vector search and a semantic reranker arrive in one product, with filters, facets and synonyms that behave. The price is real: semantic ranker is available from Basic up, but the bill climbs fast from S1 onward, because every replica and every partition adds billable search units and the ranker itself is a metered add-on on top. The internals are also opaque, which makes tuning a matter of trial and error rather than theory.
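
The cost shape is easier to reason about as arithmetic. What follows is a minimal sketch, assuming the service bills by search unit (replicas multiplied by partitions) and charges for semantic ranker requests beyond a monthly allowance. The names `SearchPricing` and `monthly_cost` are hypothetical, and every price and quota is a placeholder parameter, not a quoted Azure rate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SearchPricing:
    # All three values are placeholders; look up real rates for your tier and region.
    price_per_search_unit: float   # monthly price of one search unit
    ranker_free_requests: int      # ranker requests assumed to be included per month
    ranker_price_per_1000: float   # price per 1,000 ranker requests beyond the allowance


def monthly_cost(replicas: int, partitions: int, ranker_requests: int,
                 pricing: SearchPricing) -> float:
    """Estimate a monthly bill: search units plus metered semantic ranker usage."""
    search_units = replicas * partitions
    billable_requests = max(0, ranker_requests - pricing.ranker_free_requests)
    ranker_cost = billable_requests / 1000 * pricing.ranker_price_per_1000
    return search_units * pricing.price_per_search_unit + ranker_cost
```

Plugging in real rates for two or three plausible topologies (for example 2 replicas by 1 partition versus 3 by 2) shows how quickly the multiplication dominates the bill, before ranker usage is even counted.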

Cosmos DB for NoSQL with the integrated vector index wins when the embedding lives next to the operational document and multi-region latency is the constraint. Co-locating the chunk, the metadata and the source record kills a join and a network hop, which matters at p95 under heavy write load. The weak spot is filtering: composite filters combined with vector search still do not match what AI Search ships, and the DiskANN index can surprise you on skewed partitions.
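
To make the co-location idea concrete, here is a rough sketch of what a single item might look like, plus a quick check for partition skew. The field names (`tenantId`, `source`, `embedding`), the choice of tenant as partition key, and the helper names are all assumptions for illustration; none of this calls the Cosmos DB SDK.

```python
from collections import Counter
from typing import Any, Dict, Iterable, List


def make_chunk_item(tenant_id: str, source_id: str, chunk_no: int, text: str,
                    embedding: List[float], source: Dict[str, Any],
                    metadata: Dict[str, Any]) -> Dict[str, Any]:
    """Build one item that carries the chunk, its metadata and the source record together."""
    return {
        "id": f"{source_id}:{chunk_no}",
        "tenantId": tenant_id,   # assumed partition key
        "text": text,
        "metadata": metadata,
        "source": source,        # operational fields stored alongside the chunk
        "embedding": embedding,  # vector indexed by the container's vector policy
    }


def partition_skew(partition_keys: Iterable[str]) -> float:
    """Ratio of the largest partition's item count to the mean count (1.0 means even)."""
    counts = Counter(partition_keys)
    if not counts:
        return 0.0
    mean = sum(counts.values()) / len(counts)
    return max(counts.values()) / mean
```

Running `partition_skew` over the partition keys of a sample of items is a cheap way to spot a hot tenant before it shows up as uneven vector search latency.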

Postgres with pgvector on Azure Database for PostgreSQL Flexible Server is the choice when control, predictable cost and exit options outweigh convenience. HNSW lets you tune the trade-off between recall and latency, you own the ranking with a SQL expression, and the bill is compute plus storage instead of a metered search unit. You also keep the door open to leave Azure without rewriting retrieval. The cost is operational: index build time, vacuum tuning, and the fact that hybrid search is something you assemble from tsvector plus vector, not a feature you toggle.
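
One common way to assemble hybrid search from two ranked lists is reciprocal rank fusion (RRF). The post does not prescribe a fusion method, so treat this as one option rather than the recommended one. The sketch assumes you have already run a tsvector query and a pgvector distance query and hold each result as an ordered list of chunk ids; the function name and the default `k = 60` are illustrative choices.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Fuse several ranked id lists: each id scores sum(1 / (k + rank)) over the lists."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id so the order is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


keyword_hits = ["a", "b", "c"]   # e.g. ids ordered by full-text rank
vector_hits = ["b", "d", "a"]    # e.g. ids ordered by embedding distance
fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
```

Because the fusion works on ranks and not raw scores, it avoids having to normalize a full-text rank against a vector distance. The same logic can be written as a SQL expression over two ranked subqueries, which is where owning the ranker pays off.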

The decision criteria are concrete. Volume under five million chunks with rich filters and a small team points to AI Search. Tens of millions of chunks already in Cosmos with strict multi-region SLAs point to Cosmos vector. Anything where the bill needs to be flat, the ranker needs to be inspectable, or the corpus may move clouds points to pgvector. Multi-tenancy changes the math: AI Search wants an index per tenant or a filter pattern that hurts recall, Cosmos partitions naturally by tenant key, and Postgres handles it with schemas or a tenant column and partial indexes.
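
The criteria above can be written down as a small rule function, which also makes a handy artifact for the architecture record. This is a sketch under stated assumptions: the `Workload` fields and `recommend_store` are hypothetical names, "tens of millions" is read as ten million or more, and the rule order (pgvector triggers first) is one interpretation of "anything where", not something the criteria state explicitly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    chunks: int
    rich_filters: bool = False
    small_team: bool = False
    data_already_in_cosmos: bool = False
    strict_multi_region_sla: bool = False
    needs_flat_bill: bool = False
    needs_inspectable_ranker: bool = False
    may_move_clouds: bool = False


def recommend_store(w: Workload) -> str:
    """Map a workload to a shortlist candidate using the criteria from the text."""
    # Assumed precedence: cost, transparency or portability needs override the rest.
    if w.needs_flat_bill or w.needs_inspectable_ranker or w.may_move_clouds:
        return "pgvector"
    if w.chunks >= 10_000_000 and w.data_already_in_cosmos and w.strict_multi_region_sla:
        return "cosmos"
    if w.chunks < 5_000_000 and w.rich_filters and w.small_team:
        return "ai-search"
    # Workloads outside the stated criteria need a closer look, not a default.
    return "undecided"
```

For example, `recommend_store(Workload(chunks=2_000_000, rich_filters=True, small_team=True))` returns `"ai-search"`, while adding `may_move_clouds=True` to the same workload flips the answer to `"pgvector"`.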

Lock-in is the quiet variable. AI Search and Cosmos are excellent products and complete dead ends outside Azure. Postgres is portable to RDS, to Aurora, to a VM in a closet. For a one-year contract with an Azure-committed customer, that does not change the answer. For a five-year platform that may be acquired, it changes everything. Decide with that horizon in mind and write the reason down in the architecture record so the next engineer does not relitigate it.

Tags

  • #azure
  • #rag
  • #applied-ai
  • #architecture
