


Learning Deep Kernels for Non-Parametric Two-Sample Tests

Feng Liu, Wenkai Xu, Jie Lu, Guangquan Zhang, Arthur Gretton, Danica J. Sutherland

Date: 2020-02-21
Venue: ICML 2020
Tasks: Two-sample testing, Vocal Bursts Valence Prediction

Abstract

We propose a class of kernel-based two-sample tests, which aim to determine whether two sets of samples are drawn from the same distribution. Our tests are constructed from kernels parameterized by deep neural nets, trained to maximize test power. These tests adapt to variations in distribution smoothness and shape over space, and are especially suited to high dimensions and complex data. By contrast, the simpler kernels used in prior kernel testing work are spatially homogeneous, and adaptive only in lengthscale. We explain how this scheme includes popular classifier-based two-sample tests as a special case, but improves on them in general. We provide the first proof of consistency for the proposed adaptation method, which applies both to kernels on deep features and to simpler radial basis kernels or multiple kernel learning. In experiments, we establish the superior performance of our deep kernels in hypothesis testing on benchmark and real-world data. The code of our deep-kernel-based two-sample tests is available at https://github.com/fengliu90/DK-for-TST.
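
To make the testing setup in the abstract concrete, here is a minimal sketch of a kernel two-sample test based on the maximum mean discrepancy (MMD) with a permutation-based p-value. It is not the authors' implementation: it uses a fixed-bandwidth Gaussian kernel, which is the kind of simpler, spatially homogeneous kernel the abstract contrasts with the learned deep kernels, and it does no training to maximize test power. The function names, the default bandwidth of 1.0, and the number of permutations are all assumptions chosen for illustration.

```python
import numpy as np


def gaussian_kernel(a, b, bandwidth):
    """Gaussian (RBF) kernel matrix between rows of a and rows of b."""
    sq_dists = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * bandwidth**2))


def mmd2_unbiased(kxx, kyy, kxy):
    """Unbiased estimate of squared MMD from precomputed kernel blocks."""
    n, m = kxx.shape[0], kyy.shape[0]
    # Exclude diagonal terms k(x_i, x_i) from the within-sample averages.
    term_x = (kxx.sum() - np.trace(kxx)) / (n * (n - 1))
    term_y = (kyy.sum() - np.trace(kyy)) / (m * (m - 1))
    return term_x + term_y - 2.0 * kxy.mean()


def mmd_permutation_test(x, y, bandwidth=1.0, n_perms=200, seed=0):
    """Return (observed MMD^2, permutation p-value) for samples x and y.

    Hypothetical helper for illustration; the kernel is fixed, not learned.
    """
    rng = np.random.default_rng(seed)
    pooled = np.vstack([x, y])
    n, total = x.shape[0], pooled.shape[0]
    # Compute the pooled kernel matrix once and re-index it per permutation.
    k = gaussian_kernel(pooled, pooled, bandwidth)

    def statistic(idx):
        ix, iy = idx[:n], idx[n:]
        return mmd2_unbiased(
            k[np.ix_(ix, ix)], k[np.ix_(iy, iy)], k[np.ix_(ix, iy)]
        )

    observed = statistic(np.arange(total))
    null = np.array([statistic(rng.permutation(total)) for _ in range(n_perms)])
    # The +1 terms keep the p-value strictly positive and the test valid.
    p_value = (1 + np.sum(null >= observed)) / (1 + n_perms)
    return observed, p_value


# Toy usage: two 2-D Gaussians whose means differ by 2 in each coordinate.
demo_rng = np.random.default_rng(1)
x = demo_rng.normal(0.0, 1.0, size=(100, 2))
y_shifted = demo_rng.normal(2.0, 1.0, size=(100, 2))
stat, p = mmd_permutation_test(x, y_shifted)
print(f"MMD^2 = {stat:.3f}, p = {p:.3f}")
```

A small p-value is evidence that the two samples come from different distributions. In the approach the abstract describes, the fixed Gaussian kernel here would be replaced by a kernel on deep network features whose parameters are trained to maximize test power; for the actual method, see the official repository linked above.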

Results

Task               | Dataset                                | Metric       | Value | Model
Two-sample testing | HDGM (d=10, N=4000)                    | Avg accuracy | 65.9  | MMD-D
Two-sample testing | HIGGS Data Set                         | Avg accuracy | 57.9  | MMD-D
Two-sample testing | CIFAR-10 vs CIFAR-10.1 (1000 samples)  | Avg accuracy | 74.4  | MMD-D
Two-sample testing | MNIST vs Fake MNIST                    | Avg accuracy | 91    | MMD-D
Two-sample testing | Blob (9 modes, 40 for each)            | Avg accuracy | 98.5  | MMD-D

Related Papers

Leveraging Optimal Transport for Distributed Two-Sample Testing: An Integrated Transportation Distance-based Framework (2025-06-19)
Signature Maximum Mean Discrepancy Two-Sample Statistical Tests (2025-06-02)
From Two Sample Testing to Singular Gaussian Discrimination (2025-05-07)
Advanced Tutorial: Label-Efficient Two-Sample Tests (2025-01-07)
Optimal Algorithms for Augmented Testing of Discrete Distributions (2024-12-01)
A Unified Data Representation Learning for Non-parametric Two-sample Testing (2024-11-30)
Minimax Optimal Two-Sample Testing under Local Differential Privacy (2024-11-13)
Model Equality Testing: Which Model Is This API Serving? (2024-10-26)