Official (Vendor) · Slow · 1mo · ★ 4.3 · by Zilliz

Milvus

Zilliz/Milvus official MCP. Lets agents store and query vectors against the Milvus engine for production-scale RAG.


Safety & Trust

Delv Safety Grade: B

Score 72/100 · assessed 2026-04-28

Maintainer: 85

Permissions: 75

Supply chain: 45

Transparency: 80

Incidents: 100

Milvus MCP server is officially maintained by Zilliz, the commercial entity behind the open-source Milvus vector database. This gives it strong organisational backing and production-grade engineering standards. The server provides scoped database operations for vector storage and retrieval, which is reasonably safe for a database connector. However, the supply chain score suffers because there's no npm or PyPI package; users must clone the repository and run via uv, which bypasses standard package verification. The server requires network access to a Milvus instance and can read/write vectors plus metadata, but permissions are appropriately scoped to database operations only. Transparency is good with open source code and Zilliz's established documentation practices. No known security incidents. The main concern is the manual installation process and the need to handle authentication tokens securely.

Lethal Trifecta (prompt-injection exposure)

ONE OF THREE

Private data: Yes

Reads secrets, credentials, private files

Untrusted input: No

Ingests web pages, PRs, issues, emails

External comms: No

Can send data outbound


Green flags

Official vendor (Zilliz) with production database pedigree
Scoped to database operations only, no filesystem or shell access
Open source with active Milvus community backing
Clear documentation of required credentials and connection parameters
Production-tested codebase from established vector DB vendor

Red flags

No package registry distribution; requires a manual git clone and `uv run`
Requires MILVUS_TOKEN in the environment, which creates a credential exposure risk
Requires network access to an external database, a potential data exfiltration vector
No version pinning or signed releases in the install instructions

Permissions requested

Outbound network · Access secrets · DB read · DB write

Assessed by Delv Editorial using public metadata. Grades are advisory and update as the ecosystem changes. They do not replace your own review of permissions and code before granting an agent access to sensitive systems.

Install

uv run src/mcp_server_milvus/server.py --milvus-uri http://localhost:19530

Env vars needed: MILVUS_URI, MILVUS_TOKEN

Review

Milvus is Zilliz's official MCP server that plugs Claude or Cursor into a production-grade vector database. If you're building an agent that needs to remember thousands of documents or handle semantic search at scale, this is the real thing, not a toy. The server exposes collection management, vector insertion, and hybrid search (dense vectors plus metadata filters) through tool calls. I'd reach for this when prototyping a RAG pipeline that might actually ship, because Milvus handles billions of vectors in production environments.

The workflow is straightforward: create a collection with your embedding dimension, insert vectors with metadata, then query by similarity or filter by fields. The server supports both Milvus Lite (local, no dependencies) and full Milvus clusters via URI and token. Hybrid search is where it shines: you can filter by date ranges, categories, or custom fields before running the vector search, which tends to beat unfiltered vector similarity for real-world RAG tasks.

Quirks: you need to know your embedding dimension upfront, and the server assumes you're bringing your own embeddings (it doesn't generate them). The install command uses `uv run` directly on the source file, which is fine for local testing but means you'll want to wrap it in a proper service for production. The repo is sparse on examples, so expect to read the Milvus docs to understand index types and search parameters.

If you're already running Milvus elsewhere, this is a no-brainer. If you're not, Milvus Lite makes it easy to start local and scale later. Who shouldn't bother: anyone happy with a simpler vector store like Chroma, or anyone who can get by with Claude's native context. Milvus is overkill if you're indexing a few dozen documents. But if you're building a multi-tenant agent, need sub-100ms search on millions of vectors, or want to separate your vector store from your LLM provider, this is the correct choice. The official vendor status means it'll track Milvus features as they ship.
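To make the "filter first, then rank by similarity" idea concrete, here is a minimal pure-Python sketch. It is a conceptual illustration only, not how Milvus or the MCP server implements hybrid search. The row layout, the field names (`category`, `created`), and the `hybrid_search` helper are all hypothetical.

```python
import math
from datetime import date


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def hybrid_search(rows, query, predicate, limit=2):
    """Keep rows that pass the metadata predicate, then rank them by similarity."""
    candidates = [r for r in rows if predicate(r)]
    ranked = sorted(candidates, key=lambda r: cosine(r["vector"], query), reverse=True)
    return [r["id"] for r in ranked[:limit]]


# Hypothetical toy data: 2-dimensional vectors with two metadata fields.
rows = [
    {"id": 1, "vector": [1.0, 0.0], "category": "billing", "created": date(2024, 1, 5)},
    {"id": 2, "vector": [0.9, 0.1], "category": "support", "created": date(2024, 3, 1)},
    {"id": 3, "vector": [0.0, 1.0], "category": "support", "created": date(2024, 3, 9)},
    {"id": 4, "vector": [0.8, 0.2], "category": "support", "created": date(2023, 11, 2)},
]


def recent_support(row):
    """Metadata condition: support documents created in 2024 or later."""
    return row["category"] == "support" and row["created"] >= date(2024, 1, 1)


result = hybrid_search(rows, [1.0, 0.0], recent_support)
print(result)
```

Row 1 is the closest match to the query but never reaches the ranking step, because the metadata filter removes it first. That is the behaviour you get from a filtered vector search.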

Verdict

Install this if you're building RAG that needs to scale beyond a few thousand documents or you already run Milvus in production. Skip it if you're prototyping with small datasets or don't want to manage a vector database. The official backing and hybrid search make it the best MCP option for serious vector workloads.

Good at

Official Zilliz integration means it tracks Milvus features and won't go stale.
Hybrid search with metadata filters beats pure vector similarity for most real-world RAG tasks.
Supports both local Milvus Lite and production clusters without code changes.
Handles billions of vectors in production, so you won't outgrow it.
Exposes collection management, indexing, and search through clean tool calls.

Watch out

No embedding generation built in, so you need a separate pipeline to turn text into vectors.
Requires understanding Milvus concepts like index types and search parameters, which adds learning overhead.
Sparse documentation in the repo itself, so you'll lean on Milvus docs for advanced features.
Install command runs source directly, not a packaged binary, so production deployments need wrapping.
Overkill for small datasets; simpler vector stores are faster to set up for prototypes.
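Because embeddings come from a separate pipeline and the collection dimension is fixed upfront, a small pre-insert check can catch mismatches early. The sketch below is an assumption-laden illustration: `check_rows` is a hypothetical helper, and the `vector` field name is a placeholder for whatever your collection schema actually uses.

```python
def check_rows(rows, dimension, vector_field="vector"):
    """Return a list of problems; an empty list means every row has a vector of the expected size."""
    problems = []
    for i, row in enumerate(rows):
        vec = row.get(vector_field)
        if vec is None:
            problems.append(f"row {i}: missing '{vector_field}'")
        elif len(vec) != dimension:
            problems.append(f"row {i}: expected {dimension} dims, got {len(vec)}")
    return problems


# Hypothetical batch: one good row, one with the wrong dimension, one with no vector.
batch = [
    {"vector": [0.1, 0.2, 0.3, 0.4], "title": "ok"},
    {"vector": [0.1, 0.2, 0.3], "title": "too short"},
    {"title": "no vector"},
]
for problem in check_rows(batch, dimension=4):
    print(problem)
```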

Use cases

Production-grade vector store for an agent
Semantic search across documents
Hybrid search with metadata filters
Custom embedding pipelines

Getting started

1. Install with `uv run src/mcp_server_milvus/server.py --milvus-uri http://localhost:19530` (or point to a remote Milvus cluster). For local testing, install Milvus Lite via pip first.
2. Add the server to your Claude Desktop or Cursor config with `MILVUS_URI` and `MILVUS_TOKEN` environment variables. Use `http://localhost:19530` for local or your cluster endpoint for production.
3. In Claude, ask it to create a collection with a specific dimension (e.g., 1536 for OpenAI embeddings). Verify by listing collections.
4. Insert a few test vectors with metadata, then run a similarity search. Check that hybrid filters work by querying with a metadata condition.
5. Watch out: the server doesn't generate embeddings, so you'll need another tool or API to convert text to vectors before insertion.
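The config step can be sketched as a small script that prints a server entry to paste into your client config. This assumes the `mcpServers` layout that Claude Desktop reads from its JSON config file. The entry name `milvus`, the checkout path `/path/to/mcp-server-milvus`, and the use of uv's `--directory` option to run from the cloned repo are illustrative assumptions, so check them against the repo's own instructions.

```python
import json


def milvus_server_entry(repo_dir, uri="http://localhost:19530", token=None):
    """Build one MCP server entry; repo_dir is a placeholder for your local clone."""
    env = {"MILVUS_URI": uri}
    if token:
        # Only include the token when one is supplied; avoid committing it to version control.
        env["MILVUS_TOKEN"] = token
    return {
        "command": "uv",
        "args": [
            "--directory", repo_dir,
            "run", "src/mcp_server_milvus/server.py",
            "--milvus-uri", uri,
        ],
        "env": env,
    }


config = {"mcpServers": {"milvus": milvus_server_entry("/path/to/mcp-server-milvus")}}
print(json.dumps(config, indent=2))
```

A config file that contains `MILVUS_TOKEN` holds a credential, so restrict its permissions accordingly.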

Works with

Claude Desktop · Cursor