
Audio Classification

Audio classification identifies what is happening in a sound — music genre, environmental sounds, speaker emotion, language identification — and underpins everything from content moderation to smart home devices. The Audio Spectrogram Transformer (AST) and BEATs brought ImageNet-style transfer learning to audio by treating spectrograms as images, reaching roughly 0.5 mAP on AudioSet's 527-class evaluation set. The paradigm then shifted to audio foundation models such as CLAP (contrastive language-audio pretraining) and Whisper's encoder, which provide general-purpose audio representations that transfer to downstream tasks with minimal fine-tuning. The hard problems that remain are fine-grained classification in noisy real-world conditions, rare sound event detection from few examples, and efficient on-device inference for always-listening applications.
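The spectrogram-as-image idea above can be sketched in a few lines: a log-magnitude STFT turns a 1-D waveform into a 2-D array that a vision-style transformer can patchify. This is a minimal dependency-free sketch; real AST/BEATs pipelines apply a mel filterbank and normalization on top, which are omitted here.

```python
import numpy as np

def log_spectrogram(signal, n_fft=512, hop=256):
    """Log-magnitude STFT spectrogram: the 2-D 'image' that
    spectrogram-based transformers take as input. (Real pipelines
    add a mel filterbank; omitted to stay dependency-free.)"""
    window = np.hanning(n_fft)
    frames = [
        signal[start:start + n_fft] * window
        for start in range(0, len(signal) - n_fft + 1, hop)
    ]
    stft = np.fft.rfft(np.stack(frames), axis=1)  # (frames, freq_bins)
    power = np.abs(stft) ** 2
    return np.log(power + 1e-10).T                # (freq_bins, frames)

# One second of a 440 Hz tone at 16 kHz.
sr = 16000
t = np.arange(sr) / sr
spec = log_spectrogram(np.sin(2 * np.pi * 440 * t))
print(spec.shape)  # (257, 61)
```

The resulting array has frequency on one axis and time on the other, so patch embedding and attention apply exactly as they do for images.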

2 datasets · 8 results · canonical metric: mAP

Canonical Benchmark

AudioSet

2M+ human-labeled 10-second YouTube video clips covering 632 audio event classes.

Primary metric: mAP (mean average precision)

Top 10

Leading models on AudioSet.

| Rank | Model  | mAP   | Year | Source |
|------|--------|-------|------|--------|
| 1    | BEATs  | 0.506 | 2023 | paper  |
| 2    | AST    | 0.485 | 2021 | paper  |
| 3    | HTS-AT | 0.471 | 2022 | paper  |
| 4    | CLAP   | 0.428 | 2023 | paper  |

All datasets

2 datasets tracked for this task.

Related tasks

Other tasks in Audio.

Run Inference

Looking to run a model? HuggingFace hosts inference for this task type.
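A minimal sketch of running inference with the transformers audio-classification pipeline. The checkpoint name is an assumption (an AST model fine-tuned on AudioSet, hosted on the Hub), and the example feeds a silent waveform purely for illustration; pass a real file path or array at the model's sample rate in practice.

```python
import numpy as np
from transformers import pipeline

# Assumed checkpoint: an AST model fine-tuned on AudioSet, hosted on the Hub.
classifier = pipeline(
    "audio-classification",
    model="MIT/ast-finetuned-audioset-10-10-0.4593",
)

# The pipeline accepts a file path or a raw waveform array at the
# model's expected sample rate (16 kHz here); silence used as a stand-in.
waveform = np.zeros(16000, dtype=np.float32)
predictions = classifier(waveform, top_k=5)
for p in predictions:
    print(f"{p['label']}: {p['score']:.3f}")
```

Each prediction is a dict with a `label` (an AudioSet class name) and a `score` (the model's probability for that class).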
