Multimodal Text and Image Classification on Food-101

Metric: Accuracy (%) (higher is better)

Leaderboard
#  Model                              Accuracy (%)  Extra Data  Paper  Date  Code
1  Early Fusion (Bert + InceptionV3)  92.5          No          -      -     Code
2  Late Fusion (Bert + InceptionV3)   84.59         No          -      -     Code