Polish Text Understanding

Evaluating language models on understanding Polish text: sentiment, implicatures, phraseology, tricky questions, and hallucination resistance.

1
Datasets
0
Results
average
Canonical metric
Canonical Benchmark

CPTU-Bench

Evaluates LLMs on understanding Polish text across 4 dimensions: sentiment analysis, language understanding (implicatures, author intent), phraseology (idioms, phraseological compounds), and tricky questions (logic, ambiguity, hallucination resistance). Score range 0-5 per category. 378 hand-written examples. Created by SpeakLeash/Spichlerz.

Primary metric: average
View full leaderboard

Top 10

Leading models on CPTU-Bench.

No results yet. Be the first to contribute.

All datasets

1 dataset tracked for this task.

Related tasks

Other tasks in Natural Language Processing.