Llama-3.3-70B-Instruct.

meta-llama · open-source · 70.6B params

§ 01 · Benchmarks

Every benchmark Llama-3.3-70B-Instruct has a recorded score for.

#	Benchmark	Area · Task	Metric	Value	Rank	Date
01	Open PL LLM Leaderboard	Natural Language Processing · Polish LLM General	belebele	92.6%	#4/490	—
02	Open PL LLM Leaderboard	Natural Language Processing · Polish LLM General	average	66.4%	#8/491	—
03	Open PL LLM Leaderboard	Natural Language Processing · Polish LLM General	dyk	71.5%	#8/489	—
04	Open PL LLM Leaderboard	Natural Language Processing · Polish LLM General	polemo2-in	87.1%	#12/490	—
05	CPTU-Bench	Natural Language Processing · Polish Text Understanding	sentiment	4.3%	#13/93	—
06	Polish EQ-Bench	Natural Language Processing · Polish Emotional Intelligence	eq-score	70.7%	#18/101	—
07	Open PL LLM Leaderboard	Natural Language Processing · Polish LLM General	ppc	79.9%	#19/490	—
08	Open PL LLM Leaderboard	Natural Language Processing · Polish LLM General	eq-bench	61.8%	#21/299	—
09	HumanEval	Computer Code · Code Generation	pass@1	88.4%	#22/42	2024-12-01
10	CPTU-Bench	Natural Language Processing · Polish Text Understanding	language-understanding	3.9%	#28/93	—
11	CPTU-Bench	Natural Language Processing · Polish Text Understanding	tricky-questions	3.4%	#30/93	—
12	CPTU-Bench	Natural Language Processing · Polish Text Understanding	average	3.6%	#31/93	—
13	Open PL LLM Leaderboard	Natural Language Processing · Polish LLM General	cbd	36.7%	#33/490	—
14	CPTU-Bench	Natural Language Processing · Polish Text Understanding	phraseology	3.0%	#62/93	—
15	Open PL LLM Leaderboard	Natural Language Processing · Polish LLM General	polqa-open-book	88.6%	#92/489	—

The Rank column shows this model's position among all models scored on the same benchmark and metric; the number after the slash counts the competitors. A rank of #1, highlighted in red, marks the current state of the art (SOTA). Rows are sorted by rank, then by newest result.
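The rank notation and sort order described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `Row`, `parse_rank`, and `sort_rows` are hypothetical names, not part of any leaderboard API, and rows without a date are assumed to sort after dated rows when ranks tie.

```python
from typing import List, NamedTuple, Optional, Tuple


class Row(NamedTuple):
    """Hypothetical container for one table row (illustration only)."""
    metric: str
    rank: int
    total: int
    date: Optional[str]  # ISO date string, or None when the table shows no date


def parse_rank(text: str) -> Tuple[int, int]:
    """Split a rank cell such as '#4/490' into (position, number after the slash)."""
    position, total = text.lstrip("#").split("/")
    return int(position), int(total)


def sort_rows(rows: List[Row]) -> List[Row]:
    """Sort by rank ascending, then newest date first.

    Assumption: undated rows come last among rows with the same rank.
    Two stable passes: order by date descending, then by rank ascending.
    """
    by_date = sorted(rows, key=lambda r: r.date or "", reverse=True)
    return sorted(by_date, key=lambda r: r.rank)


if __name__ == "__main__":
    rows = [
        Row("pass@1", *parse_rank("#22/42"), "2024-12-01"),
        Row("belebele", *parse_rank("#4/490"), None),
        Row("dyk", *parse_rank("#8/489"), None),
    ]
    for row in sort_rows(rows):
        print(f"{row.metric}: #{row.rank} of {row.total}")
```

ISO-formatted dates compare correctly as plain strings, which is why no date parsing is needed for the tie-break.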

§ 02 · Strengths by area