Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Conversational Web Navigation on WebLINX

Metric: Overall score (higher is better)

Results

| #  | Model                 | Overall score | Extra Data | Paper                                               | Date       | Code |
|----|-----------------------|---------------|------------|-----------------------------------------------------|------------|------|
| 1  | Llama-2-13B           | 25.21         | No         | WebLINX: Real-World Website Navigation with Mult... | 2024-02-08 | Code |
| 2  | S-LLaMA-2.7B          | 25.02         | No         | WebLINX: Real-World Website Navigation with Mult... | 2024-02-08 | Code |
| 3  | Llama-2-7B            | 24.57         | No         | WebLINX: Real-World Website Navigation with Mult... | 2024-02-08 | Code |
| 4  | Flan-T5-3B            | 23.77         | No         | WebLINX: Real-World Website Navigation with Mult... | 2024-02-08 | Code |
| 5  | S-LLaMA-1.3B          | 23.73         | No         | WebLINX: Real-World Website Navigation with Mult... | 2024-02-08 | Code |
| 6  | GPT-3.5F              | 21.22         | No         | WebLINX: Real-World Website Navigation with Mult... | 2024-02-08 | Code |
| 7  | MindAct-3B            | 20.94         | No         | WebLINX: Real-World Website Navigation with Mult... | 2024-02-08 | Code |
| 8  | Fuyu-8B               | 19.97         | No         | WebLINX: Real-World Website Navigation with Mult... | 2024-02-08 | Code |
| 9  | Flan-T5-780M          | 17.27         | No         | WebLINX: Real-World Website Navigation with Mult... | 2024-02-08 | Code |
| 10 | Pix2Act-1.3B          | 16.88         | No         | WebLINX: Real-World Website Navigation with Mult... | 2024-02-08 | Code |
| 11 | MindAct-780M          | 15.13         | No         | WebLINX: Real-World Website Navigation with Mult... | 2024-02-08 | Code |
| 12 | Flan-T5-250M          | 14.99         | No         | WebLINX: Real-World Website Navigation with Mult... | 2024-02-08 | Code |
| 13 | MindAct-250M          | 12.63         | No         | WebLINX: Real-World Website Navigation with Mult... | 2024-02-08 | Code |
| 14 | Pix2Act-282M          | 12.51         | No         | WebLINX: Real-World Website Navigation with Mult... | 2024-02-08 | Code |
| 15 | GPT-4T (Zero-Shot)    | 10.72         | No         | WebLINX: Real-World Website Navigation with Mult... | 2024-02-08 | Code |
| 16 | GPT-4V (Zero-Shot)    | 10.45         | No         | WebLINX: Real-World Website Navigation with Mult... | 2024-02-08 | Code |
| 17 | GPT-3.5T (Zero-Shot)  | 8.51          | No         | WebLINX: Real-World Website Navigation with Mult... | 2024-02-08 | Code |