Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Disentangling Multiple Features in Video Sequences using Gaussian Processes in Variational Autoencoders

Sarthak Bhagat, Shagun Uppal, Zhuyun Yin, Nengli Lim

2020-01-08 | ECCV 2020 8 | Video Prediction | Gaussian Processes

Abstract

We introduce MGP-VAE (Multi-disentangled-features Gaussian Processes Variational AutoEncoder), a variational autoencoder which uses Gaussian processes (GP) to model the latent space for the unsupervised learning of disentangled representations in video sequences. We improve upon previous work by establishing a framework by which multiple features, static or dynamic, can be disentangled. Specifically, we use fractional Brownian motions (fBM) and Brownian bridges (BB) to enforce an inter-frame correlation structure in each independent channel, and show that varying this structure enables one to capture different factors of variation in the data. We demonstrate the quality of our representations with experiments on three publicly available datasets, and also quantify the improvement using a video prediction task. Moreover, we introduce a novel geodesic loss function which takes into account the curvature of the data manifold to improve learning. Our experiments show that the combination of the improved representations with the novel loss function enables MGP-VAE to outperform the baselines in video prediction.
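
The abstract does not spell out the covariance structures it refers to, so the following is only a minimal sketch using the textbook covariance functions of fractional Brownian motion, Cov(s, t) = ½(|s|^2H + |t|^2H − |t − s|^2H), and of a Brownian bridge pinned at 0 and T, Cov(s, t) = min(s, t) − st/T. All function names, the frame times, and the Hurst values are assumptions made for illustration; this is not the authors' implementation. It shows how changing the prior on a latent channel changes the correlation between frames.

```python
import math
import random


def fbm_cov(times, hurst):
    """Covariance matrix of fractional Brownian motion at the given times."""
    h2 = 2.0 * hurst
    return [[0.5 * (abs(s) ** h2 + abs(t) ** h2 - abs(t - s) ** h2)
             for t in times] for s in times]


def bridge_cov(times, horizon):
    """Covariance matrix of a Brownian bridge pinned to 0 at times 0 and `horizon`."""
    return [[min(s, t) - s * t / horizon for t in times] for s in times]


def cholesky(a, jitter=1e-9):
    """Lower-triangular Cholesky factor; a small jitter guards against round-off."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            acc = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(max(acc + jitter, 0.0))
            else:
                low[i][j] = acc / low[j][j]
    return low


def sample_path(cov, rng):
    """Draw one zero-mean Gaussian path with the given covariance matrix."""
    low = cholesky(cov)
    z = [rng.gauss(0.0, 1.0) for _ in cov]
    return [sum(low[i][k] * z[k] for k in range(i + 1)) for i in range(len(cov))]


if __name__ == "__main__":
    rng = random.Random(0)
    frames = [1, 2, 3, 4, 5, 6, 7, 8]  # hypothetical frame times
    # One latent channel per prior; different priors give different frame-to-frame behaviour.
    smooth = sample_path(fbm_cov(frames, hurst=0.9), rng)   # positively correlated increments
    rough = sample_path(fbm_cov(frames, hurst=0.1), rng)    # negatively correlated increments
    pinned = sample_path(bridge_cov(frames, horizon=9), rng)  # pulled back towards 0 at time 9
    for name, path in [("fBM H=0.9", smooth), ("fBM H=0.1", rough), ("bridge", pinned)]:
        print(name, [round(x, 2) for x in path])
```

With a Hurst parameter of 0.5 the fBM covariance reduces to min(s, t), that of standard Brownian motion; values above 0.5 give positively correlated increments and values below 0.5 give negatively correlated ones, which is one way to vary the inter-frame correlation structure per channel.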

Results

| Task | Dataset | Metric | Value | Model |
| --- | --- | --- | --- | --- |
| Video | Sprites | MSE | 61.6 | MGP-VAE (with geodesic loss) |
| Video | Colored dSprites | MSE | 4.5 | MGP-VAE (with geodesic loss) |
| Video Prediction | Sprites | MSE | 61.6 | MGP-VAE (with geodesic loss) |
| Video Prediction | Colored dSprites | MSE | 4.5 | MGP-VAE (with geodesic loss) |
