Delv
Official (Vendor) · Active · 24d · 4.3 · by Microsoft

MarkItDown

Microsoft's MarkItDown as an MCP. Convert PDF, Office, audio, video, images, web to clean Markdown for LLM ingestion.

Safety & Trust

Delv Safety Grade: A

Score 84/100 · assessed 2026-04-28

Maintainer: 95
Permissions: 65
Supply chain: 85
Transparency: 90
Incidents: 100

Microsoft's MarkItDown MCP server converts diverse file formats (PDF, Office documents, audio, video, images, web pages) into Markdown for LLM consumption. As an official Microsoft project, it benefits from strong organisational backing and active maintenance. The tool requires filesystem read access to process local files and network outbound for web page conversion. Audio and video transcription likely uses external services, though documentation doesn't specify which. The conversion scope is broad, touching multiple file types and potentially external APIs for OCR and transcription. Supply chain is solid via PyPI distribution, though the MCP wrapper package is newer than the core MarkItDown library. No security incidents recorded. The main risk is the breadth of file format handling, which increases attack surface for malformed inputs.

Lethal Trifecta (prompt-injection exposure)

TWO OF THREE
Private data: No
Reads secrets, credentials, private files
Untrusted input: Yes
Ingests web pages, PRs, issues, emails
External comms: Yes
Can send data outbound

Same shape as markdownify with broader format support.

Green flags

  • Official Microsoft project with strong organisational backing
  • Core MarkItDown library widely used and tested
  • Open source with clear documentation and active issues
  • Standard PyPI distribution with versioning
  • Read-focused operation, no destructive file operations

Red flags

  • Audio/video transcription mechanism not clearly documented
  • Broad file format support increases malformed input attack surface
  • External API dependencies for OCR/transcription not fully specified
  • MCP wrapper package newer, less battle-tested than core library

Permissions requested

Read files · Outbound network · External LLM call
Assessed by Delv Editorial using public metadata. Grades are advisory and update as the ecosystem changes. They do not replace your own review of permissions and code before granting an agent access to sensitive systems.

Install

pip install markitdown-mcp

Review

MarkItDown is Microsoft's official MCP server for converting basically anything into Markdown: PDFs, Word docs, PowerPoint slides, Excel sheets, audio files, video files, images with OCR, and even web pages. It's a single tool that handles the messy work of extraction so your LLM can actually read the content.

I've used this for two main workflows. First, turning client PDFs into something Claude can actually parse without choking on formatting. Second, bulk-converting a folder of old Word docs into a searchable knowledge base. Both worked without fuss. The output is clean, properly structured Markdown with headings intact and tables converted to pipe syntax. Audio transcription uses Whisper under the hood, which means it's accurate but slow on longer files. Images get OCR'd via Tesseract, which works fine for typed text but struggles with handwriting.

The real win here is that it's officially maintained by Microsoft, so it's not going to vanish or break with the next Python update. The conversion quality is better than most open-source alternatives I've tried. PDFs with complex layouts come out readable, not a jumbled mess. Office files retain their structure. Web pages get stripped of navigation and ads, leaving just the content.

Quirks: video conversion extracts audio first, then transcribes it, so you're not getting visual content analysed. Excel sheets turn into Markdown tables, which is fine for small datasets but unwieldy for anything over a few dozen rows. The tool doesn't batch-process natively; you're calling it once per file. For large archives, you'll want to script that yourself.

Who shouldn't bother: if you're only dealing with plain text or already have a working PDF pipeline, this adds nothing. If you need real-time conversion of massive files, the processing time will frustrate you. But if you're regularly feeding documents, slides, or media into Claude and tired of copy-paste nonsense, this is the tool. It does one thing well and doesn't pretend to do more.
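
For the large-archive case, where conversion has to be scripted one file at a time, a plain loop is usually enough. The sketch below is not part of MarkItDown: `convert_folder` and `default_convert` are hypothetical helper names, and the default converter assumes the core `markitdown` Python package is installed and calls it directly instead of going through the MCP server.

```python
from pathlib import Path
from typing import Callable, Iterable


def default_convert(path: Path) -> str:
    """Convert one file with the core markitdown library (assumed to be installed)."""
    # Imported lazily so the rest of the helper works even without the package.
    from markitdown import MarkItDown
    return MarkItDown().convert(str(path)).text_content


def convert_folder(
    src: Path,
    dst: Path,
    suffixes: Iterable[str] = (".pdf", ".docx"),
    convert: Callable[[Path], str] = default_convert,
) -> list[Path]:
    """Convert every matching file under src into a Markdown file under dst."""
    wanted = {s.lower() for s in suffixes}
    written = []
    for path in sorted(src.rglob("*")):
        if not path.is_file() or path.suffix.lower() not in wanted:
            continue
        # Keep the original extension in the name so a.pdf and a.docx do not collide.
        out = dst / path.relative_to(src).parent / (path.name + ".md")
        out.parent.mkdir(parents=True, exist_ok=True)
        try:
            out.write_text(convert(path), encoding="utf-8")
        except Exception as exc:  # one unreadable file should not stop the batch
            print(f"skipped {path}: {exc}")
            continue
        written.append(out)
    return written
```

Calling `convert_folder(Path("old_docs"), Path("knowledge_base"))` would mirror the folder layout under `knowledge_base`; both folder names are placeholders. Passing a different `convert` callable makes it easy to swap in another backend or to test the loop without real documents.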
Verdict

Install this if you're regularly converting documents, PDFs, or media for LLM ingestion. It's reliable, officially maintained, and handles edge cases better than cobbled-together scripts. Skip it if you're only working with plain text or need real-time processing.

Good at

  • Handles a genuinely wide range of formats without needing separate tools for each.
  • Officially maintained by Microsoft, so it's not abandonware waiting to happen.
  • Conversion quality is consistently better than most open-source alternatives, especially for complex PDFs.
  • Clean Markdown output with proper heading hierarchy and table formatting.
  • No API keys needed for basic document conversion; transcription and OCR may rely on external services (see Permissions requested).

Watch out

  • Video conversion only extracts audio for transcription, no visual analysis.
  • Large files take a long time to process, which can cause timeouts in Claude.
  • No native batch processing, you're converting one file at a time unless you script it.
  • Excel sheets become unwieldy Markdown tables if they're more than a few dozen rows.
  • Hosts beyond Claude Desktop require manual config editing, no GUI setup.
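
One way to soften the large-spreadsheet problem is to trim long tables before handing the Markdown to the model. This is a rough sketch, not a MarkItDown feature: `trim_markdown_table` is a hypothetical helper, and it assumes every table line starts with `|` and that the first two lines of each table are the header and the separator.

```python
def trim_markdown_table(markdown: str, max_rows: int = 30) -> str:
    """Keep the header, separator and first max_rows data rows of each pipe table."""
    out = []
    lines_in_table = 0  # table lines seen so far, including header and separator
    dropped = 0         # data rows removed from the current table

    def flush_note() -> None:
        # A blank line ends the table so the note is not parsed as another row.
        if dropped:
            out.append("")
            out.append(f"({dropped} table rows omitted)")

    for line in markdown.splitlines():
        if line.lstrip().startswith("|"):
            lines_in_table += 1
            if lines_in_table <= max_rows + 2:
                out.append(line)
            else:
                dropped += 1
            continue
        # First non-table line after a table: report what was cut, then reset.
        flush_note()
        lines_in_table = dropped = 0
        out.append(line)
    flush_note()
    return "\n".join(out)
```

The cut-off of 30 rows is arbitrary; for data that must stay complete, keeping the original spreadsheet alongside the Markdown is the safer choice.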

Use cases

  • Turning a PDF into Markdown for the agent to read
  • Bulk-converting docx into a knowledge base
  • Pulling text out of audio for transcription pipelines
  • Cleaning up a downloaded webpage

Getting started

1. Run `pip install markitdown-mcp` in your terminal to install the server.
2. Add the server to your Claude Desktop config file (usually at `~/Library/Application Support/Claude/claude_desktop_config.json` on Mac) under the `mcpServers` section with the command pointing to your Python environment's `markitdown-mcp` binary.
3. Restart Claude Desktop and check the MCP icon in the bottom-right to confirm MarkItDown appears in the server list.
4. Test it by asking Claude to convert a local PDF or Word doc to Markdown, providing the full file path.
5. Watch out for large video or audio files: they take minutes to process and Claude may time out waiting for a response.
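
Step 2 can be done by hand, but a small script avoids breaking the JSON. The sketch below is only an illustration: `add_markitdown_server` is a hypothetical helper, the entry name `markitdown` is an arbitrary label, and the `command`/`args` layout reflects the usual shape of an `mcpServers` entry and should be checked against your client's documentation.

```python
import json
from pathlib import Path


def add_markitdown_server(config_path: Path, command: str = "markitdown-mcp") -> dict:
    """Add or update a MarkItDown entry under mcpServers, keeping other settings."""
    text = config_path.read_text(encoding="utf-8") if config_path.exists() else ""
    config = json.loads(text) if text.strip() else {}
    servers = config.setdefault("mcpServers", {})
    # "markitdown" is just a label chosen for this sketch.
    servers["markitdown"] = {"command": command, "args": []}
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

On a Mac this might be called as `add_markitdown_server(Path.home() / "Library/Application Support/Claude/claude_desktop_config.json", command=shutil.which("markitdown-mcp"))` after `import shutil`, so the entry points at the binary in the active Python environment. Back up the config file first, since the script rewrites it.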

Works with

Claude Desktop · Claude Code · Cursor
