Docling API Reference

Complete reference for classes, methods, configuration options, and constants.

DocumentConverter

Main class for converting documents.

from docling.document_converter import DocumentConverter

converter = DocumentConverter(
    format_options=None,  # Optional dict of per-format configuration
)

Methods

Method                Parameters                     Returns
convert(source)       source: str | Path | URL       ConversionResult
convert_all(sources)  sources: Iterable[str | Path]  Iterator[ConversionResult]
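
A minimal sketch of a typical convert() call, assuming the result exposes result.document as described under Export Formats. The helper names convert_to_markdown and demo, and the path report.pdf, are illustrative and not part of Docling.

```python
def convert_to_markdown(converter, source) -> str:
    """Convert one document with the given converter and return Markdown."""
    result = converter.convert(source)  # a ConversionResult
    return result.document.export_to_markdown()


def demo(source="report.pdf"):
    """Hypothetical entry point; 'report.pdf' is a placeholder path."""
    # Imported lazily so the helper above does not require Docling at import time.
    from docling.document_converter import DocumentConverter

    return convert_to_markdown(DocumentConverter(), source)
```

Passing the converter in as an argument lets one configured instance be reused across many documents.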

PdfPipelineOptions

Configuration for PDF processing pipeline.

from docling.datamodel.pipeline_options import PdfPipelineOptions

options = PdfPipelineOptions()

Parameter                Type                   Default            Description
do_ocr                   bool                   True               Enable OCR for scanned content
do_table_structure       bool                   True               Enable table detection
ocr_options              OcrOptions             RapidOcrOptions()  OCR engine configuration
images_scale             float                  1.0                Scale factor for images
table_structure_options  TableStructureOptions  -                  Table parsing options
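
The options object only takes effect once it is handed to a converter. The sketch below shows one way to do that through format_options, assuming PdfFormatOption from docling.document_converter wraps the pipeline options for InputFormat.PDF. PDF_SETTINGS and build_pdf_converter are hypothetical names, and the values are examples rather than recommendations.

```python
# Illustrative values only; each key is a parameter from the table above.
PDF_SETTINGS = {
    "do_ocr": True,
    "do_table_structure": True,
    "images_scale": 2.0,  # larger page images than the 1.0 default
}


def build_pdf_converter(settings=None):
    """Return a DocumentConverter whose PDF pipeline uses `settings`."""
    # Lazy imports: Docling is only needed when the converter is built.
    from docling.datamodel.base_models import InputFormat
    from docling.datamodel.pipeline_options import PdfPipelineOptions
    from docling.document_converter import DocumentConverter, PdfFormatOption

    options = PdfPipelineOptions(**(settings or PDF_SETTINGS))
    return DocumentConverter(
        format_options={
            InputFormat.PDF: PdfFormatOption(pipeline_options=options)
        }
    )
```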

OCR Engine Options

EasyOcrOptions

Parameter             Type       Default
lang                  list[str]  ["en"]
use_gpu               bool       True
confidence_threshold  float      0.5
force_full_page_ocr   bool       False
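
As a rough sketch, the parameters above can be collected and sanity-checked before constructing EasyOcrOptions. The helper names are hypothetical, and the 0 to 1 range check on confidence_threshold is an assumption rather than a documented constraint.

```python
def easyocr_kwargs(lang=("en",), use_gpu=False, confidence_threshold=0.5):
    """Validate and collect keyword arguments for EasyOcrOptions."""
    if not lang:
        raise ValueError("at least one language code is required")
    # Assumption: the threshold is a probability-like score in [0, 1].
    if not 0.0 <= confidence_threshold <= 1.0:
        raise ValueError("confidence_threshold must be between 0 and 1")
    return {
        "lang": list(lang),
        "use_gpu": use_gpu,
        "confidence_threshold": confidence_threshold,
    }


def build_easyocr_options(**kwargs):
    """Create EasyOcrOptions from validated keyword arguments."""
    # Lazy import so the validation helper works without Docling installed.
    from docling.datamodel.pipeline_options import EasyOcrOptions

    return EasyOcrOptions(**easyocr_kwargs(**kwargs))
```

The returned object would then be assigned to the ocr_options field of PdfPipelineOptions.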

RapidOcrOptions

Parameter            Type  Description
det_model_path       str   Path to detection model
rec_model_path       str   Path to recognition model
cls_model_path       str   Path to classification model
force_full_page_ocr  bool  Force OCR on all pages

TesseractOcrOptions

Parameter            Type  Description
lang                 str   Language code (e.g., "eng")
force_full_page_ocr  bool  Force OCR on all pages

Requires: TESSDATA_PREFIX environment variable
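
Because a missing TESSDATA_PREFIX tends to surface only when OCR actually runs, it can help to check it up front. The sketch below uses only the standard library; tessdata_status is a hypothetical helper name.

```python
import os
from pathlib import Path


def tessdata_status(env=None) -> str:
    """Report whether TESSDATA_PREFIX is 'unset', 'missing', or 'ok'."""
    env = os.environ if env is None else env
    prefix = env.get("TESSDATA_PREFIX")
    if not prefix:
        return "unset"    # variable absent or empty
    if not Path(prefix).is_dir():
        return "missing"  # variable set, but the directory does not exist
    return "ok"
```

Calling tessdata_status() with no arguments inspects the current process environment.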

VLM Model Specifications

Pre-configured model specs for VLM pipeline.

from docling.datamodel import vlm_model_specs

Constant                     Model                 Backend       Best For
SMOLDOCLING_MLX              SmolDocling-256M      MLX           Apple Silicon
SMOLDOCLING_TRANSFORMERS     SmolDocling-256M      Transformers  CPU / CUDA
GRANITEDOCLING_MLX           Granite-Docling-258M  MLX           Apple Silicon
GRANITEDOCLING_VLLM          Granite-Docling-258M  vLLM          NVIDIA GPU (fastest)
GRANITEDOCLING_TRANSFORMERS  Granite-Docling-258M  Transformers  CPU / CUDA
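
The "Best For" column can be turned into a simple selection rule. The sketch below is one possible policy, not Docling's own logic: pick_spec_name and load_spec are hypothetical helpers, and the caller is assumed to know whether a CUDA device is present.

```python
import platform


def pick_spec_name(system=None, machine=None, has_cuda=False) -> str:
    """Choose a vlm_model_specs constant name for the given hardware."""
    system = platform.system() if system is None else system
    machine = platform.machine() if machine is None else machine
    if system == "Darwin" and machine == "arm64":
        return "GRANITEDOCLING_MLX"    # Apple Silicon
    if has_cuda:
        return "GRANITEDOCLING_VLLM"   # NVIDIA GPU
    return "SMOLDOCLING_TRANSFORMERS"  # CPU fallback


def load_spec(name):
    """Look up the named constant; requires Docling to be installed."""
    from docling.datamodel import vlm_model_specs

    return getattr(vlm_model_specs, name)
```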

Model Specifications

Spec             SmolDocling-256M                Granite-Docling-258M
Parameters       256M                            258M
Vision Encoder   SigLIP base (93M)               siglip2-base-patch16-512
Language Model   SmolLM-2 (135M)                 Granite 165M
Inference Speed  ~0.35s/page (A100)              ~0.35s/page (A100)
VRAM Usage       ~489 MB                         ~500 MB
License          Apache 2.0                      Apache 2.0
HuggingFace      ds4sd/SmolDocling-256M-preview  ibm-granite/granite-docling-258M

Export Formats

Methods available on result.document:

Method                Returns  Use Case
export_to_markdown()  str      Human-readable, LLM input
export_to_html()      str      Web display
export_to_dict()      dict     JSON serialization, lossless
export_to_text()      str      Plain text extraction
export_to_doctags()   str      Native Docling format
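
A short sketch of writing several of these exports side by side, assuming the dict returned by export_to_dict() is JSON-serializable as the table suggests. The function names and the default stem "output" are illustrative.

```python
import json
from pathlib import Path


def render_exports(document, stem="output") -> dict:
    """Map output file names to exported content for one document."""
    return {
        f"{stem}.md": document.export_to_markdown(),
        f"{stem}.html": document.export_to_html(),
        f"{stem}.json": json.dumps(document.export_to_dict(), indent=2),
    }


def write_exports(document, out_dir, stem="output"):
    """Write each rendered export into out_dir, creating it if needed."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for name, text in render_exports(document, stem).items():
        (out / name).write_text(text, encoding="utf-8")
```

Here document stands for result.document from a finished conversion.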

Table Export Methods

Method                       Returns
table.export_to_markdown()   str (Markdown table)
table.export_to_dataframe()  pandas.DataFrame
table.export_to_html()       str (HTML table)
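
The table methods are usually called in a loop over the tables of a converted document. The sketch below assumes the document exposes them as a tables attribute and that export_to_dataframe() can be called without arguments, as listed above. tables_to_csv is a hypothetical helper.

```python
def tables_to_csv(document) -> list:
    """Return one CSV string per table found in the document."""
    return [
        # export_to_dataframe() yields a pandas.DataFrame; to_csv() serializes it.
        table.export_to_dataframe().to_csv(index=False)
        for table in document.tables
    ]
```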

Supported Input Formats

from docling.datamodel.base_models import InputFormat

Documents

  • InputFormat.PDF - PDF files
  • InputFormat.DOCX - Word documents
  • InputFormat.PPTX - PowerPoint
  • InputFormat.XLSX - Excel
  • InputFormat.HTML - Web pages

Media

  • InputFormat.IMAGE - PNG, JPG, TIFF
  • InputFormat.AUDIO - Audio such as WAV and MP3 (ASR)
  • InputFormat.VTT - Subtitles
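
One use for these constants is restricting a converter to the formats an application expects. The sketch below maps file suffixes to member names and passes the matching members through the allowed_formats argument of DocumentConverter, which is not covered in the constructor section above. The suffix mapping and helper names are illustrative assumptions and only cover formats listed here.

```python
from pathlib import Path

# Illustrative mapping from file suffix to InputFormat member name.
SUFFIX_TO_FORMAT = {
    ".pdf": "PDF",
    ".docx": "DOCX",
    ".pptx": "PPTX",
    ".xlsx": "XLSX",
    ".html": "HTML",
    ".png": "IMAGE",
    ".jpg": "IMAGE",
    ".tiff": "IMAGE",
}


def format_name_for(path):
    """Return the InputFormat member name for a path, or None if unknown."""
    return SUFFIX_TO_FORMAT.get(Path(path).suffix.lower())


def build_restricted_converter(names):
    """Build a converter that accepts only the named formats."""
    from docling.datamodel.base_models import InputFormat
    from docling.document_converter import DocumentConverter

    return DocumentConverter(allowed_formats=[InputFormat[n] for n in names])
```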

Installation Extras

Command                           Includes
pip install docling               Core + RapidOCR
pip install "docling[vlm]"        + VLM pipeline support
pip install "docling[easyocr]"    + EasyOCR engine
pip install "docling[tesserocr]"  + Tesseract binding
pip install "docling[ocrmac]"     + macOS native OCR
pip install "docling[asr]"        + Audio speech recognition
pip install "docling[cuda]"       + NVIDIA CUDA support
pip install "docling[mac_intel]"  + Intel Mac (PyTorch 2.2.2)

Combine extras: pip install "docling[vlm,easyocr]"
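
For setup scripts, the combined command can be assembled from a list of extras. This is a small illustrative helper, not part of Docling; it does not check that the extra names exist.

```python
def pip_install_command(extras=()) -> str:
    """Build the pip command for Docling with the given extras."""
    names = list(dict.fromkeys(extras))  # drop duplicates, keep order
    if not names:
        return "pip install docling"
    joined = ",".join(names)
    return f'pip install "docling[{joined}]"'
```

For example, pip_install_command(["vlm", "easyocr"]) produces the combined command shown above.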