Polish SOTA · Open Source · R&D

Rys OCR.

State-of-the-art Polish text recognition, fine-tuned for correct handling of Polish diacritics (ą, ć, ę, ł, ń, ó, ś, ź, ż). This is the first release of an ongoing R&D effort.

§ 01 · First fine-tune results

The headline numbers.

CER Reduction: 71.3% (5.58% to 1.60%)
WER Reduction: 46.1% (13.37% to 7.21%)
Training Images: 10k (synthetic Polish documents)

§ 02 · Technical details

How it's built.

Model Architecture

Base Model: PaddleOCR-VL
Parent Base: ERNIE-4.5-0.3B
Method: LoRA (Low-Rank Adaptation)
LoRA Rank: 16
LoRA Alpha: 32
Target Modules: q_proj, k_proj, v_proj, o_proj
VRAM Required: 4-6 GB
License: Apache 2.0
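
For orientation, the LoRA settings above can be read as follows: each adapted projection keeps its frozen weight and learns a low-rank update scaled by alpha / rank, here 32 / 16 = 2. The sketch below is illustrative only. The layer dimensions in the last line are hypothetical placeholders (this page does not list the base model's hidden sizes), and the dictionary keys simply mirror the argument names of PEFT's `LoraConfig`.

```python
# LoRA hyperparameters as listed above. The keys mirror the argument names
# of peft.LoraConfig, so the dict could be passed as LoraConfig(**LORA_SETTINGS).
LORA_SETTINGS = {
    "r": 16,
    "lora_alpha": 32,
    "target_modules": ["q_proj", "k_proj", "v_proj", "o_proj"],
}


def lora_scale(r: int, lora_alpha: int) -> float:
    """Factor applied to the low-rank update B @ A (standard LoRA: alpha / r)."""
    return lora_alpha / r


def lora_extra_params(d_in: int, d_out: int, r: int) -> int:
    """Trainable parameters added to one linear layer: A is (r x d_in), B is (d_out x r)."""
    return r * (d_in + d_out)


scale = lora_scale(LORA_SETTINGS["r"], LORA_SETTINGS["lora_alpha"])
print(f"LoRA scale: {scale}")

# Hypothetical 1024 x 1024 projection, used only to show the arithmetic.
print(lora_extra_params(d_in=1024, d_out=1024, r=LORA_SETTINGS["r"]))
```

Because only the low-rank matrices are trained, the adapter stays small relative to the base model, which is consistent with the modest VRAM figure listed above.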

Training Data

10,000 synthetic Polish document images across 7 categories:

Addresses
Invoice lines
Receipt lines
Dates
Names
Prices
Phrases

Training: 1 epoch, AdamW optimizer, linear LR schedule

Framework: PEFT 0.18.0 + Transformers
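
As a rough illustration of the linear schedule mentioned above, the function below computes the learning rate at a given optimizer step: optional linear warmup, then linear decay to zero, which is the usual shape of a linear schedule with warmup. The base learning rate, batch size, and warmup length are not stated on this page, so the values used here are assumptions.

```python
def linear_lr(step: int, total_steps: int, base_lr: float, warmup_steps: int = 0) -> float:
    """Learning rate at `step` for linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = total_steps - step
    return base_lr * max(0.0, remaining / max(1, total_steps - warmup_steps))


# Assumed values for illustration: 10,000 images, 1 epoch, batch size 8, lr 1e-4.
total_steps = 10_000 // 8
for step in (0, total_steps // 2, total_steps):
    print(step, linear_lr(step, total_steps, base_lr=1e-4))
```

With a single epoch, every image is seen once, so the schedule spends its whole decay over one pass through the 10k samples.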

§ 03 · Benchmark results

Baseline vs fine-tuned.

Metric	Baseline	Fine-tuned	Improvement
Character Error Rate (CER)	5.58%	1.60%	↓ 71.3% (relative)
Word Error Rate (WER)	13.37%	7.21%	↓ 46.1% (relative)
Exact Match	74%	76%	↑ 2 pp

Key improvement: Resolved Polish diacritic confusion (ł, ę, ś, etc.)
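
For readers who want to reproduce this kind of comparison, here is a minimal sketch of the metrics. It assumes the standard definitions (edit distance divided by reference length, over characters for CER and over whitespace-separated words for WER); this page does not state which evaluation tooling was used, so treat the helper names as hypothetical. The last two lines recompute the relative improvements from the table.

```python
def edit_distance(ref, hyp) -> int:
    """Levenshtein distance between two sequences (strings or lists of words)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: character edits per reference character."""
    return edit_distance(reference, hypothesis) / max(1, len(reference))


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word edits per reference word."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / max(1, len(ref_words))


def relative_reduction(baseline: float, tuned: float) -> float:
    """Relative improvement in percent, as reported in the table."""
    return 100.0 * (baseline - tuned) / baseline


# A prediction that dropped every diacritic is penalised heavily by both metrics.
print(cer("Zażółć gęślą jaźń", "Zazolc gesla jazn"))
print(wer("Zażółć gęślą jaźń", "Zazolc gesla jazn"))
print(round(relative_reduction(5.58, 1.60), 1))   # CER row
print(round(relative_reduction(13.37, 7.21), 1))  # WER row
```

Note that the Exact Match change (74% to 76%) is an absolute difference in percentage points, whereas the CER and WER improvements are relative to the baseline.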

§ 04 · Quick start

Inference in Python.

```python
from transformers import AutoModelForCausalLM, AutoProcessor
from peft import PeftModel
from PIL import Image

# Load the base model (it ships custom modeling code, hence trust_remote_code)
base_model = AutoModelForCausalLM.from_pretrained(
    "PaddlePaddle/PaddleOCR-VL",
    trust_remote_code=True,
    torch_dtype="auto",
    device_map="auto",
)

# Attach the Rys OCR LoRA adapter on top of the base weights
model = PeftModel.from_pretrained(base_model, "anon13370/RysOCR")

processor = AutoProcessor.from_pretrained(
    "anon13370/RysOCR",
    trust_remote_code=True,
)

# Run inference
image = Image.open("your_document.png").convert("RGB")
prompt = "OCR: "

inputs = processor(images=image, text=prompt, return_tensors="pt")
inputs = {k: v.to(model.device) for k, v in inputs.items()}

outputs = model.generate(**inputs, max_new_tokens=256)

# generate() returns prompt + completion; decode only the newly generated tokens
new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
text = processor.decode(new_tokens, skip_special_tokens=True)
print(text)
```

§ 05 · Contribute

Help build Polish OCR SOTA.

This is the first fine-tune in ongoing R&D. We need your help to push Polish OCR to the next level.

Contribute Datasets

Real Polish documents needed: invoices, receipts, historical documents, handwritten notes, street signs.

Scanned documents with ground truth
Photos of Polish text in the wild
Historical Polish manuscripts
Specialized domain texts (medical, legal)

Run Benchmarks

Help us evaluate Rys OCR on more Polish-specific benchmarks and compare with other models.

Polish document benchmarks
Diacritic-specific test sets
Cross-model comparisons
Domain-specific evaluations
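
One simple way to build a diacritic-specific evaluation is to ask, for each wrong prediction, whether it would have been correct if diacritics were ignored. The sketch below does that with an explicit mapping for the Polish alphabet (an explicit table is used because ł has no Unicode decomposition). The function names and the sample pairs are hypothetical and not part of any existing Rys OCR benchmark.

```python
# Polish letters with diacritics mapped to their unaccented look-alikes.
_STRIP = str.maketrans("ąćęłńóśźżĄĆĘŁŃÓŚŹŻ", "acelnoszzACELNOSZZ")


def strip_diacritics(text: str) -> str:
    """Replace Polish diacritic letters with their base Latin letters."""
    return text.translate(_STRIP)


def is_diacritic_only_error(reference: str, hypothesis: str) -> bool:
    """True when the prediction is wrong only because diacritics were lost or confused."""
    return (reference != hypothesis
            and strip_diacritics(reference) == strip_diacritics(hypothesis))


def diacritic_error_share(pairs) -> float:
    """Fraction of wrong predictions that differ from the reference only in diacritics."""
    wrong = [(ref, hyp) for ref, hyp in pairs if ref != hyp]
    if not wrong:
        return 0.0
    return sum(is_diacritic_only_error(ref, hyp) for ref, hyp in wrong) / len(wrong)


# Made-up (reference, prediction) pairs for illustration.
samples = [
    ("Łódź", "Lodz"),                    # diacritics dropped
    ("ul. Żurawia 5", "ul. Źurawia 5"),  # ż confused with ź
    ("12,50 zł", "12,50 zł"),            # correct
    ("Kraków", "Krakow 1"),              # extra characters, not diacritic-only
]
print(diacritic_error_share(samples))
```

Reporting this share alongside CER and WER for each model makes cross-model comparisons more informative, since two models with similar CER can differ a lot in how many of their errors are diacritic confusions.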

Join R&D

Collaborate on next iterations: architecture experiments, training strategies, deployment optimization.

Model architecture research
Training pipeline improvements
Edge deployment optimization
Multi-language expansion

§ 06 · Roadmap

What's next.

v0.1 - First Fine-Tune (completed)

10k synthetic images, LoRA on PaddleOCR-VL. 71% CER reduction.

v0.2 - Real Data

Train on real Polish documents. Expand domain coverage.

v0.3 - Handwriting

Add handwritten Polish text recognition capability.

v1.0 - Production Ready

Full benchmark coverage, optimized inference, API deployment.

§ 07 · Limitations

Known limitations.

Optimized for printed Polish text; accuracy on handwritten text may be lower (handwriting support is planned for v0.3)
Best results on clean document scans
Requires loading both base model and LoRA weights for inference
Trained on synthetic data only (v0.1)
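
The need to load the base model and the adapter separately can in principle be worked around by merging the LoRA weights into the base model once and saving the result. The sketch below uses PEFT's `merge_and_unload()` for this. It is an untested assumption that the merged checkpoint saves and reloads cleanly with PaddleOCR-VL's custom remote code, and the function name and output directory are hypothetical.

```python
def merge_adapter(base_id: str, adapter_id: str, output_dir: str) -> None:
    """Fold LoRA weights into the base model and save a standalone checkpoint."""
    # Heavy dependencies are imported lazily so this module imports without them.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoProcessor

    base = AutoModelForCausalLM.from_pretrained(
        base_id, trust_remote_code=True, torch_dtype="auto"
    )
    model = PeftModel.from_pretrained(base, adapter_id)

    # merge_and_unload() applies the low-rank updates to the base weights
    # and returns the plain base model without the PEFT wrappers.
    merged = model.merge_and_unload()
    merged.save_pretrained(output_dir)

    # Save the processor alongside so the directory is self-contained.
    processor = AutoProcessor.from_pretrained(adapter_id, trust_remote_code=True)
    processor.save_pretrained(output_dir)


# Example call (downloads both sets of weights):
# merge_adapter("PaddlePaddle/PaddleOCR-VL", "anon13370/RysOCR", "rys-ocr-merged")
```

After merging, inference would load only the saved directory, at the cost of storing a full copy of the model weights instead of a small adapter.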
