Grounded Situation Recognition

Sarah Pratt, Mark Yatskar, Luca Weihs, Ali Farhadi, Aniruddha Kembhavi

2020-03-26 | ECCV 2020 | Grounded Situation Recognition, Retrieval, Image Retrieval

Abstract

We introduce Grounded Situation Recognition (GSR), a task that requires producing structured semantic summaries of images describing: the primary activity, entities engaged in the activity with their roles (e.g. agent, tool), and bounding-box groundings of entities. GSR presents important technical challenges: identifying semantic saliency, categorizing and localizing a large and diverse set of entities, overcoming semantic sparsity, and disambiguating roles. Moreover, unlike in captioning, GSR is straightforward to evaluate. To study this new task we create the Situations With Groundings (SWiG) dataset, which adds 278,336 bounding-box groundings to the 11,538 entity classes in the imSitu dataset. We propose a Joint Situation Localizer and find that jointly predicting situations and groundings with end-to-end training handily outperforms independent training on the entire grounding metric suite, with relative gains between 8% and 32%. Finally, we show initial findings on three exciting future directions enabled by our models: conditional querying, visual chaining, and grounded semantic-aware image retrieval. Code and data are available at https://prior.allenai.org/projects/gsr.
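To make the structured output described in the abstract concrete, here is a minimal Python sketch of what a GSR prediction might look like: a verb, a set of roles, and for each role a noun with an optional bounding box. The class names, the `(x1, y1, x2, y2)` box convention, the example frame, and the 0.5 IoU threshold are all illustrative assumptions and are not taken from the authors' code.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

# Hypothetical box convention: (x1, y1, x2, y2) in pixels.
Box = Tuple[float, float, float, float]


@dataclass
class GroundedEntity:
    noun: str
    box: Optional[Box] = None  # None when the entity has no grounding


@dataclass
class GroundedSituation:
    verb: str  # the primary activity
    roles: Dict[str, GroundedEntity] = field(default_factory=dict)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def grounded_value_correct(pred: GroundedEntity, gold: GroundedEntity,
                           iou_threshold: float = 0.5) -> bool:
    """Assumed rule: the noun must match, and the box must overlap enough.

    If either side has no box, both must have no box.
    """
    if pred.noun != gold.noun:
        return False
    if pred.box is None or gold.box is None:
        return pred.box is None and gold.box is None
    return iou(pred.box, gold.box) >= iou_threshold


# Illustrative frame (made up for this sketch, not from SWiG).
prediction = GroundedSituation(
    verb="slicing",
    roles={
        "agent": GroundedEntity("person", (10.0, 20.0, 200.0, 300.0)),
        "tool": GroundedEntity("knife", (120.0, 180.0, 180.0, 220.0)),
        "place": GroundedEntity("kitchen", None),
    },
)
```

Keeping the box optional lets one structure cover both plain situation recognition (nouns only) and the grounded variant.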

Results

Task	Dataset	Metric	Value	Model
Situation Recognition	imSitu	Top-1 Verb	39.94	JSL
Situation Recognition	imSitu	Top-1 Verb & Value	31.44	JSL
Situation Recognition	imSitu	Top-5 Verbs	67.6	JSL
Situation Recognition	imSitu	Top-5 Verbs & Value	51.88	JSL
Situation Recognition	imSitu	Top-1 Verb	39.36	ISL
Situation Recognition	imSitu	Top-1 Verb & Value	30.09	ISL
Situation Recognition	imSitu	Top-5 Verbs	65.51	ISL
Situation Recognition	imSitu	Top-5 Verbs & Value	50.16	ISL
Situation Recognition	SWiG	Top-1 Verb	39.94	JSL
Situation Recognition	SWiG	Top-1 Verb & Grounded-Value	24.86	JSL
Situation Recognition	SWiG	Top-1 Verb & Value	31.44	JSL
Situation Recognition	SWiG	Top-5 Verbs	67.6	JSL
Situation Recognition	SWiG	Top-5 Verbs & Grounded-Value	40.6	JSL
Situation Recognition	SWiG	Top-5 Verbs & Value	51.88	JSL
Situation Recognition	SWiG	Top-1 Verb	39.36	ISL
Situation Recognition	SWiG	Top-1 Verb & Grounded-Value	22.73	ISL
Situation Recognition	SWiG	Top-1 Verb & Value	30.09	ISL
Situation Recognition	SWiG	Top-5 Verbs	65.51	ISL
Situation Recognition	SWiG	Top-5 Verbs & Grounded-Value	36.6	ISL
Situation Recognition	SWiG	Top-5 Verbs & Value	50.16	ISL
Grounded Situation Recognition	SWiG	Top-1 Verb	39.94	JSL
Grounded Situation Recognition	SWiG	Top-1 Verb & Grounded-Value	24.86	JSL
Grounded Situation Recognition	SWiG	Top-1 Verb & Value	31.44	JSL
Grounded Situation Recognition	SWiG	Top-5 Verbs	67.6	JSL
Grounded Situation Recognition	SWiG	Top-5 Verbs & Grounded-Value	40.6	JSL
Grounded Situation Recognition	SWiG	Top-5 Verbs & Value	51.88	JSL
Grounded Situation Recognition	SWiG	Top-1 Verb	39.36	ISL
Grounded Situation Recognition	SWiG	Top-1 Verb & Grounded-Value	22.73	ISL
Grounded Situation Recognition	SWiG	Top-1 Verb & Value	30.09	ISL
Grounded Situation Recognition	SWiG	Top-5 Verbs	65.51	ISL
Grounded Situation Recognition	SWiG	Top-5 Verbs & Grounded-Value	36.6	ISL
Grounded Situation Recognition	SWiG	Top-5 Verbs & Value	50.16	ISL
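As a quick check on the table above, the relative gain of JSL over ISL on the two grounded metrics listed can be computed directly. This is a small sketch using only the SWiG scores shown in the table; the abstract's 8% to 32% range refers to the full grounding metric suite, of which the table shows only two entries.

```python
# Scores copied from the results table (SWiG, grounded metrics only).
scores = {
    "Top-1 Verb & Grounded-Value": {"JSL": 24.86, "ISL": 22.73},
    "Top-5 Verbs & Grounded-Value": {"JSL": 40.6, "ISL": 36.6},
}


def relative_gain(new: float, base: float) -> float:
    """Relative improvement of `new` over `base`, in percent."""
    return 100.0 * (new - base) / base


gains = {
    metric: relative_gain(s["JSL"], s["ISL"]) for metric, s in scores.items()
}

for metric, gain in gains.items():
    print(f"{metric}: {gain:.1f}% relative gain")
# Top-1 Verb & Grounded-Value: 9.4% relative gain
# Top-5 Verbs & Grounded-Value: 10.9% relative gain
```

Both values fall inside the range quoted in the abstract.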
