Adam Santoro, David Raposo, David G. T. Barrett, Mateusz Malinowski, Razvan Pascanu, Peter Battaglia, Timothy Lillicrap
Relational reasoning is a central component of generally intelligent behavior, but has proven difficult for neural networks to learn. In this paper we describe how to use Relation Networks (RNs) as a simple plug-and-play module to solve problems that fundamentally hinge on relational reasoning. We tested RN-augmented networks on three tasks: visual question answering using a challenging dataset called CLEVR, on which we achieve state-of-the-art, super-human performance; text-based question answering using the bAbI suite of tasks; and complex reasoning about dynamic physical systems. Then, using a curated dataset called Sort-of-CLEVR, we show that powerful convolutional networks do not have a general capacity to solve relational questions, but can gain this capacity when augmented with RNs. Our work shows how a deep learning architecture equipped with an RN module can implicitly discover and learn to reason about entities and their relations.
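To make the "plug-and-play module" idea concrete, here is a minimal sketch of the composite form usually given for an RN, RN(O) = f(Σ over pairs (i, j) of g(o_i, o_j)). The abstract itself does not spell out this formula, so treat the sketch as an illustration under assumptions rather than the authors' implementation: the single-layer `g` and `f`, the layer sizes, the random untrained weights, and all helper names (`linear`, `relu`, `init_layer`, `relation_network`) are hypothetical.

```python
import itertools
import math
import random


def linear(weights, bias, x):
    """Affine map: weights is a list of rows, one row per output unit."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(x):
    return [max(0.0, v) for v in x]


def init_layer(rng, n_in, n_out):
    """Random weights and zero biases for an illustrative, untrained layer."""
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def relation_network(objects, g_params, f_params):
    """Apply g to every ordered pair of objects, sum the results, then apply f."""
    g_w, g_b = g_params
    f_w, f_b = f_params
    pooled = [0.0] * len(g_b)
    for o_i, o_j in itertools.product(objects, repeat=2):
        # g sees the two object vectors concatenated (list + list).
        relation = relu(linear(g_w, g_b, o_i + o_j))
        pooled = [p + r for p, r in zip(pooled, relation)]
    return linear(f_w, f_b, pooled)


# Hypothetical sizes: 4 objects of dimension 3, 8 relation units, 2 outputs.
rng = random.Random(0)
obj_dim, hidden_dim, out_dim = 3, 8, 2
g_params = init_layer(rng, 2 * obj_dim, hidden_dim)
f_params = init_layer(rng, hidden_dim, out_dim)
objects = [[rng.uniform(-1.0, 1.0) for _ in range(obj_dim)] for _ in range(4)]

output = relation_network(objects, g_params, f_params)
shuffled = relation_network(list(reversed(objects)), g_params, f_params)
```

Because the pairwise outputs are summed before `f` is applied, reordering the input objects leaves the result unchanged up to floating-point rounding, which is why `output` and `shuffled` agree. In an RN-augmented network the objects would come from an upstream encoder such as a CNN or LSTM rather than from random vectors.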
| Task | Dataset | Metric | Value (%) | Model |
|---|---|---|---|---|
| Visual Question Answering (VQA) | CLEVR | Accuracy | 95.5 | CNN + LSTM + RN |
| Image Retrieval with Multi-Modal Query | Fashion200k | Recall@1 | 13 | Relationship |
| Image Retrieval with Multi-Modal Query | Fashion200k | Recall@10 | 40.5 | Relationship |
| Image Retrieval with Multi-Modal Query | Fashion200k | Recall@50 | 62.4 | Relationship |