Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Learning Object Placements For Relational Instructions by Hallucinating Scene Representations

Oier Mees, Alp Emek, Johan Vertens, Wolfram Burgard

Date: 2020-01-23
Tasks: Robotic Grasping, Scene Generation, Spatial Relation Recognition

Abstract

Robots coexisting with humans in their environment and performing services for them need the ability to interact with them. One particular requirement is that such robots understand spatial relations and can place objects in accordance with the spatial relations expressed by their user. In this work, we present a convolutional neural network for estimating pixelwise object placement probabilities for a set of spatial relations from a single input image. During training, our network receives the learning signal by classifying hallucinated high-level scene representations as an auxiliary task. Unlike previous approaches, our method does not require ground truth data for the pixelwise relational probabilities or 3D models of the objects, which significantly broadens its practical applicability. Our results obtained using real-world data and human-robot experiments demonstrate the effectiveness of our method in reasoning about the best way to place objects to reproduce a spatial relation. Videos of our experiments can be found at https://youtu.be/zaZkHTWFMKM
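
To make the idea of pixelwise placement probabilities concrete, the sketch below shows one way such an output could be consumed: given one probability grid per spatial relation, pick the pixel with the highest probability for the requested relation. This is an illustration only, not the authors' code. The function name `best_placement`, the relation labels `"left"` and `"right"`, and the toy 3x3 grids are all assumptions made for the example; the network that would produce the grids is not shown.

```python
from typing import Dict, List, Tuple

# Hypothetical type: one H x W grid of placement probabilities per relation.
ProbMap = List[List[float]]


def best_placement(maps: Dict[str, ProbMap], relation: str) -> Tuple[int, int, float]:
    """Return (row, col, probability) of the most likely placement pixel.

    Ties are resolved in favour of the first pixel in row-major order.
    """
    grid = maps[relation]
    best = (0, 0, grid[0][0])
    for r, row in enumerate(grid):
        for c, p in enumerate(row):
            if p > best[2]:
                best = (r, c, p)
    return best


# Toy 3x3 grids standing in for network output (made-up numbers,
# hypothetical relation names).
toy_maps = {
    "left": [[0.1, 0.0, 0.0], [0.7, 0.1, 0.0], [0.1, 0.0, 0.0]],
    "right": [[0.0, 0.0, 0.1], [0.0, 0.1, 0.6], [0.0, 0.0, 0.2]],
}

row, col, prob = best_placement(toy_maps, "left")
print(f"place at pixel ({row}, {col}) with probability {prob}")
```

A real system would likely also need to map the chosen pixel to a reachable pose for the robot; that step is outside what the abstract describes and is left out here.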

Related Papers

World Model-Based End-to-End Scene Generation for Accident Anticipation in Autonomous Driving (2025-07-17)
MTF-Grasp: A Multi-tier Federated Learning Approach for Robotic Grasping (2025-07-14)
$I^{2}$-World: Intra-Inter Tokenization for Efficient Dynamic 4D Scene Forecasting (2025-07-12)
Acquiring and Adapting Priors for Novel Tasks via Neural Meta-Architectures (2025-07-07)
Voyaging into Unbounded Dynamic Scenes from a Single View (2025-07-05)
XVerse: Consistent Multi-Subject Control of Identity and Semantic Attributes via DiT Modulation (2025-06-26)
From 2D to 3D Cognition: A Brief Survey of General World Models (2025-06-25)
WonderFree: Enhancing Novel View Quality and Cross-View Consistency for 3D Scene Exploration (2025-06-25)