Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


NoW

Noise of Web

Images | Texts | cc-by-nc-4.0 | Introduced 2024-08-02

Noise of Web (NoW) is a challenging noisy correspondence learning (NCL) benchmark for robust image-text matching and retrieval models. It contains 100K image-text pairs consisting of website pages and multilingual website meta-descriptions (98,000 pairs for training, 1,000 for validation, and 1,000 for testing). NoW has two main characteristics: it has no human annotations, and its noisy pairs are captured naturally. The source images of NoW are screenshots taken while accessing web pages through a mobile user interface (MUI) at 720 × 1280 resolution, and we parse the meta-description field in the HTML source code as the captions. In NCR (the predecessor of NCL), every image in every dataset was preprocessed with the Faster-RCNN detector provided by the Bottom-up Attention Model to generate 36 region proposals, and each proposal was encoded as a 2048-dimensional feature. Thus, following NCR, we release the features instead of the raw images for fair comparison. However, we cannot simply use detection methods such as Faster-RCNN to extract image features, since that detector is trained on the real-world animals and objects in MS-COCO. To tackle this, we adapt APT as the detection model, since it is trained on MUI data. We then capture the 768-dimensional features of the top 36 objects in each image. Because the data collection process is automated and not curated by humans, the noise in NoW is highly authentic and intrinsic. The estimated noise ratio of this dataset is nearly 70%.
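The feature layout described above (36 regions per screenshot, 768 dimensions per region, and a 98,000 / 1,000 / 1,000 split) can be sanity-checked before training. The sketch below is only an illustration: the names `SPLIT_SIZES`, `expected_shape`, and `check_shape` are hypothetical, and it assumes the released features are loaded as arrays shaped (pairs, regions, dim), which this page does not confirm.

```python
from typing import Tuple

# Numbers taken from the dataset description above.
SPLIT_SIZES = {"train": 98_000, "val": 1_000, "test": 1_000}
NUM_REGIONS = 36   # top-36 detected objects per screenshot
FEATURE_DIM = 768  # per-region feature size from the APT-based detector


def expected_shape(split: str) -> Tuple[int, int, int]:
    """Return the (pairs, regions, dim) shape assumed for a split."""
    return (SPLIT_SIZES[split], NUM_REGIONS, FEATURE_DIM)


def check_shape(shape: Tuple[int, ...], split: str) -> None:
    """Raise ValueError if a loaded array's shape differs from the assumed layout.

    ASSUMPTION: features are stored as (pairs, regions, dim); adjust this
    check if the released files use a different axis order.
    """
    want = expected_shape(split)
    if tuple(shape) != want:
        raise ValueError(f"{split}: expected shape {want}, got {tuple(shape)}")


if __name__ == "__main__":
    for name in SPLIT_SIZES:
        print(name, expected_shape(name))
```

With NumPy, for example, the check would be called as `check_shape(features.shape, "train")` right after loading an array, so a file extracted with the 2048-dimensional Faster-RCNN layout used by NCR would be rejected early.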

Related Benchmarks

NoW Benchmark/3D/Mean Reconstruction Error (mm)
NoW Benchmark/3D/Median Reconstruction Error
NoW Benchmark/3D/Stdev Reconstruction Error (mm)
NoW Benchmark/3D Face Modelling/Mean Reconstruction Error (mm)
NoW Benchmark/3D Face Modelling/Median Reconstruction Error
NoW Benchmark/3D Face Modelling/Stdev Reconstruction Error (mm)
NoW Benchmark/3D Face Reconstruction/Mean Reconstruction Error (mm)
NoW Benchmark/3D Face Reconstruction/Median Reconstruction Error
NoW Benchmark/3D Face Reconstruction/Stdev Reconstruction Error (mm)
NoW Benchmark/Face Reconstruction/Mean Reconstruction Error (mm)
NoW Benchmark/Face Reconstruction/Median Reconstruction Error
NoW Benchmark/Face Reconstruction/Stdev Reconstruction Error (mm)
NoW Benchmark/Facial Recognition and Modelling/Mean Reconstruction Error (mm)
NoW Benchmark/Facial Recognition and Modelling/Median Reconstruction Error
NoW Benchmark/Facial Recognition and Modelling/Stdev Reconstruction Error (mm)
Now You're Cooking!/Recipe Generation/Perplexity

Statistics

Papers: 2
Benchmarks: 0

Links

Homepage

Tasks

Cross-modal retrieval with noisy correspondence
Image-text Retrieval