docling

Get your documents ready for gen AI

open-source, agent-frameworks
56.8k Stars
+1260 Stars/month
10 Releases (6m)

Star Growth

+234 (0.4%) [chart omitted: star count, y-axis 55.5k to 58.0k, x-axis Mar 27 to Apr 1]

Overview

Docling is a document processing library that prepares documents for generative AI workflows. It parses PDF, DOCX, PPTX, XLSX, HTML, audio files (WAV, MP3), WebVTT, images, LaTeX, and plain text. Its standout feature is PDF understanding: page layout analysis, reading order detection, table structure recognition, code extraction, formula processing, and image classification.

Docling converts every input into a unified DoclingDocument representation, so downstream AI pipelines work against one structure regardless of the source format. It also integrates with common generative AI frameworks. The project has over 56,000 GitHub stars and is hosted under the Linux Foundation's LF AI & Data, so it is community-backed rather than tied to a single vendor.

Deep Analysis

Key Differentiator

Unlike LlamaParse (cloud-only, paid) or PyMuPDF (basic extraction), Docling runs fully locally, handles 20+ formats including audio and XML schemas, and produces a unified DoclingDocument representation with advanced PDF layout understanding backed by IBM Research.

Capabilities

  • Parse 20+ document formats including PDF, DOCX, PPTX, XLSX, HTML, images, LaTeX, audio (WAV/MP3), and XML schemas (USPTO, JATS, XBRL)
  • Advanced PDF understanding with page layout analysis, reading order detection, table structure extraction, code blocks, formulas, and image classification
  • Unified DoclingDocument representation format with export to Markdown, HTML, WebVTT, DocTags, and lossless JSON
  • Visual Language Model support via GraniteDocling for enhanced document understanding
  • Extensive OCR support for scanned PDFs and images with multiple OCR backends
  • MCP server for connecting document parsing to any AI agent
  • Local execution for sensitive data and air-gapped environments
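As a rough illustration of the export capability listed above, the sketch below writes one converted document to both Markdown and JSON. It assumes the object passed in behaves like a DoclingDocument, that is, it exposes `export_to_markdown()` and `export_to_dict()`, and that the dictionary it returns is JSON-serializable. The helper name `export_document` and the output layout are hypothetical, chosen only for this example.

```python
import json
from pathlib import Path
from typing import Any


def export_document(document: Any, out_dir: Path, stem: str) -> dict[str, Path]:
    """Write a converted document as Markdown and JSON (hypothetical helper).

    `document` is assumed to expose export_to_markdown() and export_to_dict(),
    as a DoclingDocument does.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = {
        "markdown": out_dir / f"{stem}.md",
        "json": out_dir / f"{stem}.json",
    }
    # Markdown is the lossy, human-readable export.
    paths["markdown"].write_text(document.export_to_markdown(), encoding="utf-8")
    # The dict export is assumed to be JSON-serializable as returned.
    paths["json"].write_text(
        json.dumps(document.export_to_dict(), ensure_ascii=False, indent=2),
        encoding="utf-8",
    )
    return paths
```

With Docling installed, the `document` argument would typically be `result.document` from a `DocumentConverter().convert(...)` call.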

Integrations

LangChain, LlamaIndex, Crew AI, Haystack, Apify

Best For

  • Enterprise document processing pipelines needing high-fidelity PDF parsing with table/formula extraction
  • RAG applications that need to ingest diverse document formats into structured representations for LLM consumption
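For the RAG use case above, converted text usually has to be split into retrievable chunks before indexing. The sketch below is a minimal, pure-Python illustration that splits a Markdown export into one chunk per heading section. It is not Docling's own chunking API; `Chunk` and `chunk_markdown` are hypothetical names for this example, and the splitter does not handle `#` lines inside fenced code blocks.

```python
import re
from dataclasses import dataclass


@dataclass
class Chunk:
    heading: str  # nearest Markdown heading above the text ("" if none)
    text: str     # body text belonging to that heading


def chunk_markdown(markdown: str) -> list[Chunk]:
    """Split Markdown into one chunk per heading section (illustrative only)."""
    chunks: list[Chunk] = []
    heading = ""
    body: list[str] = []
    for line in markdown.splitlines():
        match = re.match(r"^#{1,6}\s+(.*)$", line)
        if match:
            # A new heading closes the previous section, if it had any text.
            text = "\n".join(body).strip()
            if text:
                chunks.append(Chunk(heading, text))
            heading = match.group(1).strip()
            body = []
        else:
            body.append(line)
    # Flush the final section.
    text = "\n".join(body).strip()
    if text:
        chunks.append(Chunk(heading, text))
    return chunks
```

Keeping the heading with each chunk gives the retriever some context about where a passage came from, which is one reason to prefer a structured export over raw extracted text.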

Not Ideal For

  • Simple text extraction from clean HTML — use BeautifulSoup or Firecrawl instead
  • Real-time web scraping workflows — use Firecrawl or ScrapeGraphAI instead

Languages

Python

Deployment

pip install (local), Docker, CLI tool

Pricing Detail

Free: Fully open-source under MIT license
Paid: N/A

Known Limitations

  • Python-only — no JavaScript/TypeScript SDK
  • Chart understanding (bar/pie/line) and complex chemistry parsing still in development
  • Heavy PDF processing can be resource-intensive; GPU recommended for VLM pipelines
  • Requires Python 3.10+ (dropped 3.9 support in v2.70)

Pros

  • Advanced PDF understanding with layout analysis, table structure recognition, and reading order detection
  • Supports a wide variety of document formats including office documents, images, audio, and markup languages
  • Unified DoclingDocument representation simplifies integration with AI workflows and downstream processing

Cons

  • Processing complex documents with advanced features may require significant computational resources
  • Limited information available about performance benchmarks and processing speed for large document batches

Use Cases

  • Converting research papers and technical documents into AI-ready formats for RAG applications
  • Extracting structured data from business documents like invoices, contracts, and reports for automation
  • Preparing diverse document collections for training or fine-tuning language models

Getting Started

1. Install via pip: `pip install docling`
2. Import and create a document converter: `from docling.document_converter import DocumentConverter; converter = DocumentConverter()`
3. Process a document: `result = converter.convert('path/to/document.pdf')`. The structured DoclingDocument output is available as `result.document`.
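The steps above can be put together roughly as follows. This is a minimal sketch that assumes `docling` is installed. The import sits inside the function so the module still loads where it is not. `markdown_path_for` and the `converted` output directory are hypothetical names introduced only for this example, and `path/to/document.pdf` remains a placeholder.

```python
from pathlib import Path


def convert_to_markdown(source: str) -> str:
    """Convert one document with Docling and return it as Markdown."""
    # Lazy import: only needed when a conversion is actually requested.
    from docling.document_converter import DocumentConverter

    converter = DocumentConverter()
    result = converter.convert(source)
    # result.document is the unified DoclingDocument.
    return result.document.export_to_markdown()


def markdown_path_for(source: str, out_dir: str = "converted") -> Path:
    """Hypothetical helper: where to store the Markdown for a given source."""
    return Path(out_dir) / f"{Path(source).stem}.md"
```

Calling `convert_to_markdown('path/to/document.pdf')` would return the document as a Markdown string, which can then be written to `markdown_path_for('path/to/document.pdf')`. The first conversion may take a while, since the layout and table models likely need to be downloaded first.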
