Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Energy-Based Models for Deep Probabilistic Regression

Fredrik K. Gustafsson, Martin Danelljan, Goutam Bhat, Thomas B. Schön

2019-09-26 | ECCV 2020 (August)
Tasks: Visual Object Tracking, Visual Tracking, regression, Pose Estimation, object-detection, Head Pose Estimation, Object Detection

Abstract

While deep learning-based classification is generally tackled using standardized approaches, a wide variety of techniques are employed for regression. In computer vision, one particularly popular such technique is that of confidence-based regression, which entails predicting a confidence value for each input-target pair (x,y). While this approach has demonstrated impressive results, it requires important task-dependent design choices, and the predicted confidences lack a natural probabilistic meaning. We address these issues by proposing a general and conceptually simple regression method with a clear probabilistic interpretation. In our proposed approach, we create an energy-based model of the conditional target density p(y|x), using a deep neural network to predict the un-normalized density from (x,y). This model of p(y|x) is trained by directly minimizing the associated negative log-likelihood, approximated using Monte Carlo sampling. We perform comprehensive experiments on four computer vision regression tasks. Our approach outperforms direct regression, as well as other probabilistic and confidence-based methods. Notably, our model achieves a 2.2% AP improvement over Faster-RCNN for object detection on the COCO dataset, and sets a new state-of-the-art on visual tracking when applied for bounding box estimation. In contrast to confidence-based methods, our approach is also shown to be directly applicable to more general tasks such as age and head-pose estimation. Code is available at https://github.com/fregu856/ebms_regression.
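The training objective described in the abstract can be illustrated with a small sketch. It assumes the energy-based model has the form p(y|x) = exp(f(x,y)) / Z(x) and that the normalizer Z(x) is estimated by importance sampling from a single Gaussian proposal centered on the label. The names `toy_energy` and `mc_nll`, the proposal choice, and the sample count are illustrative assumptions, and the toy energy is a hypothetical stand-in for the deep network; see the linked repository for the authors' actual implementation.

```python
import numpy as np

def toy_energy(x, y):
    # Hypothetical stand-in for the network output f(x, y):
    # an un-normalized log-density that peaks at y = 2x.
    return -0.5 * (y - 2.0 * x) ** 2

def mc_nll(x, y_true, energy, num_samples=1024, sigma=1.0, rng=None):
    """Monte Carlo estimate of -log p(y_true | x) for p = exp(f) / Z."""
    rng = np.random.default_rng(0) if rng is None else rng
    # Assumed proposal: q(y) = N(y_true, sigma^2).
    samples = y_true + sigma * rng.standard_normal(num_samples)
    log_q = (-0.5 * ((samples - y_true) / sigma) ** 2
             - np.log(sigma * np.sqrt(2.0 * np.pi)))
    # Importance weights exp(f(x, y_k)) / q(y_k), kept in log space.
    log_w = energy(x, samples) - log_q
    # log Z(x) ~= log mean_k w_k, computed with a stable log-mean-exp.
    shift = log_w.max()
    log_z = shift + np.log(np.mean(np.exp(log_w - shift)))
    # NLL = log Z(x) - f(x, y_true).
    return log_z - energy(x, y_true)

if __name__ == "__main__":
    print("NLL at the energy peak:", mc_nll(1.0, 2.0, toy_energy))
    print("NLL away from the peak:", mc_nll(1.0, 3.0, toy_energy))
```

In a real training loop the energy would be a differentiable network and this quantity would be averaged over a mini-batch and minimized with a gradient-based optimizer; the NumPy version here only shows how the estimate is formed.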

Results

Task | Dataset | Metric | Value | Model
Object Tracking | UAV123 | AUC | 0.672 | ATOM(Resnet18)+EnergyRegression
Object Tracking | TrackingNet | Normalized Precision | 80.1 | ATOM(Resnet18)+EnergyRegression
Object Tracking | TrackingNet | Precision | 69.7 | ATOM(Resnet18)+EnergyRegression
Object Tracking | TrackingNet | Success Rate | 74.5 | ATOM(Resnet18)+EnergyRegression
Visual Object Tracking | UAV123 | AUC | 0.672 | ATOM(Resnet18)+EnergyRegression
Visual Object Tracking | TrackingNet | Normalized Precision | 80.1 | ATOM(Resnet18)+EnergyRegression
Visual Object Tracking | TrackingNet | Precision | 69.7 | ATOM(Resnet18)+EnergyRegression
Visual Object Tracking | TrackingNet | Success Rate | 74.5 | ATOM(Resnet18)+EnergyRegression
