Papers With Code

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


CoVA

Context-aware Visual Attention-based (CoVA) webpage object detection pipeline

Computer Vision · Introduced 2021 · 2 papers
Source Paper

Description

The Context-Aware Visual Attention-based end-to-end pipeline for Webpage Object Detection (CoVA) aims to learn a function f that predicts labels y = [y_1, y_2, ..., y_N] for a webpage containing N elements. The input to CoVA consists of:

  1. a screenshot of a webpage,
  2. a list of bounding boxes [x, y, w, h] of the web elements, and
  3. neighborhood information for each element obtained from the DOM tree.
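A minimal sketch of how this input might be packaged, assuming hypothetical container and field names (they are illustrative, not identifiers from the paper or its code release):

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical container for one CoVA example; field names are assumptions.
@dataclass
class WebElement:
    bbox: Tuple[float, float, float, float]  # (x, y, w, h) in screenshot pixels
    neighbor_ids: List[int]                  # DOM-derived neighbor indices (N_i)

@dataclass
class CoVAInput:
    screenshot_path: str                     # rendered screenshot of the page
    elements: List[WebElement]               # the N web elements to classify

page = CoVAInput(
    screenshot_path="page.png",
    elements=[
        WebElement(bbox=(10, 20, 300, 40), neighbor_ids=[1]),
        WebElement(bbox=(10, 70, 300, 40), neighbor_ids=[0]),
    ],
)
print(len(page.elements))  # 2
```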

This information is processed in four stages:

  1. the graph representation extraction for the webpage,
  2. the Representation Network (RN),
  3. the Graph Attention Network (GAT), and
  4. a fully connected (FC) layer.
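The four stages above can be sketched as a simple composition. Each function below is a toy stand-in (random features, uniform attention), not the trained networks; the shapes and names are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
N, D = 5, 8                       # N web elements, D-dim representations (toy sizes)

def graph_extraction(n, k=2):
    """Stage 1: K neighboring elements per element (toy ring graph, not the DOM)."""
    return {i: [(i + j) % n for j in range(1, k + 1)] for i in range(n)}

def representation_network(n, d):
    """Stage 2: visual representation v_i per element (random stand-in for CNN+positional encoder)."""
    return rng.standard_normal((n, d))

def graph_attention(V, neighbors):
    """Stage 3: contextual representation c_i (uniform mean over neighbors,
    standing in for the learned GAT attention weights)."""
    return np.stack([V[neighbors[i]].mean(axis=0) for i in range(len(V))])

def fc_layer(X, n_classes=3):
    """Stage 4: linear classifier on the concatenated [v_i ; c_i]."""
    W = rng.standard_normal((X.shape[1], n_classes))
    return X @ W

neighbors = graph_extraction(N)
V = representation_network(N, D)
C = graph_attention(V, neighbors)
logits = fc_layer(np.concatenate([V, C], axis=1))
print(logits.shape)               # (5, 3): one score per class per element
```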

The graph representation extraction computes, for every web element i, its set of K neighboring web elements N_i. The RN consists of a Convolutional Neural Network (CNN) and a positional encoder that learn a visual representation v_i for each web element i ∈ {1, ..., N}. The GAT combines the visual representation v_i of the web element to be classified with those of its neighbors, v_k for all k ∈ N_i, to compute a contextual representation c_i for web element i. Finally, the visual and contextual representations of the web element are concatenated and passed through the FC layer to obtain the classification output.
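The neighbor aggregation can be illustrated with a single-head GAT-style attention step; this is a generic sketch of stage 3, and CoVA's exact attention parameterization may differ:

```python
import numpy as np

rng = np.random.default_rng(1)

def gat_context(V, neighbors, W, a):
    """Compute c_i from v_k, k in N_i, via attention (single-head sketch):
       e_ik     = LeakyReLU(a . [W v_i ; W v_k])
       alpha_ik = softmax_k(e_ik)
       c_i      = sum_k alpha_ik * (W v_k)
    """
    H = V @ W                                           # projected representations
    C = np.zeros_like(H)
    for i, nbrs in neighbors.items():
        scores = np.array([np.concatenate([H[i], H[k]]) @ a for k in nbrs])
        scores = np.where(scores > 0, scores, 0.2 * scores)  # LeakyReLU
        alpha = np.exp(scores - scores.max())
        alpha /= alpha.sum()                            # softmax over neighbors
        C[i] = (alpha[:, None] * H[nbrs]).sum(axis=0)   # attention-weighted sum
    return C

N, D = 4, 6                                             # toy sizes
V = rng.standard_normal((N, D))                         # visual representations v_i
W = rng.standard_normal((D, D))                         # shared projection
a = rng.standard_normal(2 * D)                          # attention vector
neighbors = {i: [(i + 1) % N, (i + 2) % N] for i in range(N)}
C = gat_context(V, neighbors, W, a)
print(C.shape)   # (4, 6): one contextual representation per element
```

The concatenation [v_i ; c_i] fed to the FC layer lets the classifier weigh an element's own appearance against the context its neighbors provide.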

Papers Using This Method

  - CoVA: Exploiting Compressed-Domain Analysis to Accelerate Video Analytics (2022-07-02)
  - CoVA: Context-aware Visual Attention for Webpage Information Extraction (2021-10-24)