Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Conversational Web Navigation on WebLINX

Metric: Element (IoU) (higher is better)
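
This page does not define the metric. As a rough illustration only, the sketch below assumes that Element (IoU) compares the bounding box of the element a model selects against the bounding box of the reference element, using the standard intersection-over-union formula. The `Box` type and `box_iou` function are hypothetical names introduced here, not part of the WebLINX code, and the leaderboard values are presumably this kind of score averaged over turns and scaled to 0-100.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Axis-aligned bounding box (hypothetical helper type, pixel coordinates)."""
    left: float
    top: float
    right: float
    bottom: float


def box_iou(a: Box, b: Box) -> float:
    """Standard intersection-over-union of two axis-aligned boxes, in [0, 1]."""
    # Width and height of the overlap region; clamp to 0 when the boxes are disjoint.
    inter_w = max(0.0, min(a.right, b.right) - max(a.left, b.left))
    inter_h = max(0.0, min(a.bottom, b.bottom) - max(a.top, b.top))
    inter = inter_w * inter_h

    area_a = (a.right - a.left) * (a.bottom - a.top)
    area_b = (b.right - b.left) * (b.bottom - b.top)
    union = area_a + area_b - inter

    # Degenerate (zero-area) boxes give a zero union; report no overlap.
    return inter / union if union > 0 else 0.0
```

Under this assumption, a prediction that selects exactly the reference element scores 1.0, a disjoint element scores 0.0, and a partially overlapping element falls in between.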

Results

| # | Model | Element (IoU) | Extra Data | Paper | Date | Code |
|---|-------|---------------|------------|-------|------|------|
| 1 | Llama-2-13B | 22.82 | No | WebLINX: Real-World Website Navigation with Mult... | 2024-02-08 | Code |
| 2 | S-LLaMA-2.7B | 22.6 | No | WebLINX: Real-World Website Navigation with Mult... | 2024-02-08 | Code |
| 3 | Llama-2-7B | 22.26 | No | WebLINX: Real-World Website Navigation with Mult... | 2024-02-08 | Code |
| 4 | S-LLaMA-1.3B | 20.54 | No | WebLINX: Real-World Website Navigation with Mult... | 2024-02-08 | Code |
| 5 | Flan-T5-3B | 20.31 | No | WebLINX: Real-World Website Navigation with Mult... | 2024-02-08 | Code |
| 6 | GPT-3.5F | 18.64 | No | WebLINX: Real-World Website Navigation with Mult... | 2024-02-08 | Code |
| 7 | MindAct-3B | 16.5 | No | WebLINX: Real-World Website Navigation with Mult... | 2024-02-08 | Code |
| 8 | Fuyu-8B | 15.7 | No | WebLINX: Real-World Website Navigation with Mult... | 2024-02-08 | Code |
| 9 | Flan-T5-780M | 15.36 | No | WebLINX: Real-World Website Navigation with Mult... | 2024-02-08 | Code |
| 10 | Flan-T5-250M | 14.86 | No | WebLINX: Real-World Website Navigation with Mult... | 2024-02-08 | Code |
| 11 | MindAct-780M | 13.39 | No | WebLINX: Real-World Website Navigation with Mult... | 2024-02-08 | Code |
| 12 | MindAct-250M | 12.05 | No | WebLINX: Real-World Website Navigation with Mult... | 2024-02-08 | Code |
| 13 | GPT-4V (Zero-Shot) | 10.91 | No | WebLINX: Real-World Website Navigation with Mult... | 2024-02-08 | Code |
| 14 | GPT-4T (Zero-Shot) | 10.85 | No | WebLINX: Real-World Website Navigation with Mult... | 2024-02-08 | Code |
| 15 | GPT-3.5T (Zero-Shot) | 8.62 | No | WebLINX: Real-World Website Navigation with Mult... | 2024-02-08 | Code |
| 16 | Pix2Act-1.3B | 8.28 | No | WebLINX: Real-World Website Navigation with Mult... | 2024-02-08 | Code |
| 17 | Pix2Act-282M | 6.2 | No | WebLINX: Real-World Website Navigation with Mult... | 2024-02-08 | Code |