Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Audio-visual Question Answering on MUSIC-AVQA

Metric: Acc (higher is better)
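
As a rough illustration of the metric, the sketch below assumes Acc is the percentage of questions whose predicted answer exactly matches the ground-truth answer. The function name `accuracy` and the sample answers are hypothetical and are not taken from any leaderboard entry or from the benchmark's official evaluation script.

```python
def accuracy(predictions, references):
    """Percentage of predictions that exactly match their reference answer.

    Assumption: answers are compared by exact string equality; the official
    evaluation protocol may normalize answers differently.
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(p == r for p, r in zip(predictions, references))
    return 100.0 * correct / len(references)


# Hypothetical answers for four questions; three of four match.
preds = ["two", "yes", "piano", "left"]
refs = ["two", "no", "piano", "left"]
print(accuracy(preds, refs))  # 75.0
```

Under this reading, a score of 80.7 would mean that roughly four out of five questions were answered correctly.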

Results

| # | Model | Acc | Extra Data | Paper | Date | Code |
|---|-------|-----|------------|-------|------|------|
| 1 | VAST | 80.7 | Yes | VAST: A Vision-Audio-Subtitle-Text Omni-Modality... | 2023-05-29 | Code |
| 2 | CoQo(Internvideo2) | 79.6 | No | - | - | - |
| 3 | VALOR | 78.9 | Yes | VALOR: Vision-Audio-Language Omni-Perception Pre... | 2023-04-17 | Code |
| 4 | CAD | 78.26 | No | CAD -- Contextual Multi-modal Alignment for Dyna... | 2023-10-25 | - |
| 5 | LAVISH | 77.08 | No | Vision Transformers are Parameter-Efficient Audi... | 2022-12-15 | Code |
| 6 | ST-AVQA | 71.52 | No | Learning to Answer Questions in Dynamic Audio-Vi... | 2022-03-26 | Code |