Audio Captioning

Generating text descriptions of audio content.

Datasets: 1
Results: 5
Canonical metric: SPIDEr

Canonical Benchmark

AudioCaps

Audio generation quality evaluated on AudioCaps captions

Primary metric: SPIDEr

Top 10

Leading models on AudioCaps.

Rank  Model                  FAD   Year  Source
1     AudioLDM               4.48  2023  paper
2     AudioLDM 2-Full-Large  1.86  2024  paper
3     AudioLDM 2-Full        1.78  2024  paper
4     TANGO                  1.73  2023  paper
5     AudioLDM 2-AC-Large    1.42  2024  paper

All datasets

1 dataset tracked for this task.
