CodeSOTA Polish

1,000 synthetic and real Polish text images spanning 5 degradation levels, from clean to severe. The benchmark tests character-level OCR accuracy on Polish diacritics and includes synthetic categories designed to resist training-data contamination. Categories: synth_random (random character sequences, testing pure character recognition), synth_words (Markov-generated words), real_corpus (Pan Tadeusz and official documents), wikipedia (a baseline for potential contamination).
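The page does not say which metric the benchmark uses, so the following is only a sketch of how character-level accuracy on diacritics could be measured. It assumes character error rate (CER, edit distance divided by reference length) as the score. The helper names `levenshtein`, `cer`, and `strip_diacritics` are hypothetical and are not part of any CodeSOTA tooling. Comparing CER on the raw text with CER after stripping diacritics gives a rough estimate of how many errors come from diacritics alone.

```python
import unicodedata


def levenshtein(ref: str, hyp: str) -> int:
    """Edit distance (insertions, deletions, substitutions) between two strings."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate, assumed here as the scoring metric."""
    # Normalize to NFC so precomposed and decomposed forms compare equal.
    ref = unicodedata.normalize("NFC", ref)
    hyp = unicodedata.normalize("NFC", hyp)
    if not ref:
        return 0.0 if not hyp else 1.0
    return levenshtein(ref, hyp) / len(ref)


def strip_diacritics(text: str) -> str:
    """Map Polish letters to their base ASCII letters."""
    # The letter l-with-stroke has no Unicode decomposition, so handle it explicitly.
    text = text.replace("ł", "l").replace("Ł", "L")
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


reference = "żółć"
ocr_output = "zolc"  # hypothetical OCR output that dropped every diacritic

raw_cer = cer(reference, ocr_output)
base_cer = cer(strip_diacritics(reference), strip_diacritics(ocr_output))
print(f"CER: {raw_cer:.2f}, CER ignoring diacritics: {base_cer:.2f}")
```

A large gap between the two numbers suggests the model reads the base letters correctly but loses the diacritics, which is the failure mode this benchmark is described as targeting.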

Benchmark Stats

Models: 0
Papers: 0
Metrics: 0

SOTA History

Not enough data to show trend.

No results yet on this benchmark

Help build the community leaderboard by submitting your model results.

Check back soon as we continue collecting data.
