MCP Directory

kreuzberg

by kreuzberg-dev · Rust · ★ 8,159

A polyglot document intelligence framework with a Rust core. Extract text, metadata, images, and structured information from PDFs, Office documents, images, and other files across 97+ formats. Available for Rust, Python, Ruby, Java, Go, PHP, Elixir, C#, R, C, and TypeScript (Node/Bun/Wasm/Deno), or usable via the CLI, REST API, or MCP server.

#bun #csharp #document-intelligence #elixir #ffi #golang #java #metadata-extraction #node #pdf-extraction #pdfium #php #python #rag #ruby #rust #table-extraction #tesseract #text-extraction #wasm

Install

cargo install --git https://github.com/kreuzberg-dev/kreuzberg.git

Claude Desktop config

Add this to your claude_desktop_config.json:

{
  "mcpServers": {
    "kreuzberg": {
      "command": "npx",
      "args": [
        "-y",
        "github:kreuzberg-dev/kreuzberg"
      ]
    }
  }
}
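
If claude_desktop_config.json already lists other servers, the entry above needs to be merged into the existing "mcpServers" object rather than pasted over the whole file. A minimal Python sketch of that merge, assuming the file is plain JSON. The helper names add_mcp_server and update_config_file are hypothetical, and the file path is left to the caller because its location depends on the platform:

```python
import json
from pathlib import Path


def add_mcp_server(config: dict, name: str, command: str, args: list[str]) -> dict:
    """Return a copy of config with one MCP server entry added or replaced.

    Hypothetical helper for illustration; other servers and unrelated
    top-level keys are preserved, and the input dict is not mutated.
    """
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers[name] = {"command": command, "args": list(args)}
    merged["mcpServers"] = servers
    return merged


def update_config_file(path: Path) -> None:
    """Merge the kreuzberg entry from this page into the JSON file at path."""
    text = path.read_text(encoding="utf-8") if path.exists() else ""
    config = json.loads(text) if text.strip() else {}
    updated = add_mcp_server(
        config,
        "kreuzberg",
        "npx",
        ["-y", "github:kreuzberg-dev/kreuzberg"],  # values copied from the config above
    )
    path.write_text(json.dumps(updated, indent=2) + "\n", encoding="utf-8")
```

Calling update_config_file(Path("claude_desktop_config.json")) on a copy of the file is a low-risk way to check the result before touching the real one.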

From the README

Extract text, metadata, and code intelligence from 97+ file formats and 305 programming languages at native speeds without needing a GPU.

- **Code intelligence** – Extract functions, classes, imports, symbols, and docstrings from [248 programming languages](https://docs.tree-sitter-language-pack.kreuzberg.dev) via tree-sitter. Results in with semantic chunking
- **Extensible architecture** – Plugin system for custom OCR backends, validators, post-processors, document extractors, and renderers
- **Polyglot** – Native bindings for Rust, Python, TypeScript/Node.js, Ruby, Go, Java, C#, PHP, Elix…
