Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Conversational Web Navigation on WebLINX

Metric: Text (F1) (higher is better)

Results

| # | Model | Text (F1) | Extra Data | Paper | Date | Code |
|---|---|---|---|---|---|---|
| 1 | S-LLaMA-2.7B | 27.17 | No | WebLINX: Real-World Website Navigation with Mult... | 2024-02-08 | Code |
| 2 | Llama-2-13B | 26.6 | No | WebLINX: Real-World Website Navigation with Mult... | 2024-02-08 | Code |
| 3 | Llama-2-7B | 26.5 | No | WebLINX: Real-World Website Navigation with Mult... | 2024-02-08 | Code |
| 4 | S-LLaMA-1.3B | 25.85 | No | WebLINX: Real-World Website Navigation with Mult... | 2024-02-08 | Code |
| 5 | Flan-T5-3B | 25.75 | No | WebLINX: Real-World Website Navigation with Mult... | 2024-02-08 | Code |
| 6 | Pix2Act-1.3B | 25.21 | No | WebLINX: Real-World Website Navigation with Mult... | 2024-02-08 | Code |
| 7 | MindAct-3B | 23.16 | No | WebLINX: Real-World Website Navigation with Mult... | 2024-02-08 | Code |
| 8 | GPT-3.5F | 22.39 | No | WebLINX: Real-World Website Navigation with Mult... | 2024-02-08 | Code |
| 9 | Fuyu-8B | 22.3 | No | WebLINX: Real-World Website Navigation with Mult... | 2024-02-08 | Code |
| 10 | Pix2Act-282M | 16.4 | No | WebLINX: Real-World Website Navigation with Mult... | 2024-02-08 | Code |
| 11 | Flan-T5-780M | 14.05 | No | WebLINX: Real-World Website Navigation with Mult... | 2024-02-08 | Code |
| 12 | MindAct-780M | 13.58 | No | WebLINX: Real-World Website Navigation with Mult... | 2024-02-08 | Code |
| 13 | Flan-T5-250M | 9.21 | No | WebLINX: Real-World Website Navigation with Mult... | 2024-02-08 | Code |
| 14 | MindAct-250M | 7.67 | No | WebLINX: Real-World Website Navigation with Mult... | 2024-02-08 | Code |
| 15 | GPT-4T (Zero-Shot) | 6.75 | No | WebLINX: Real-World Website Navigation with Mult... | 2024-02-08 | Code |
| 16 | GPT-4V (Zero-Shot) | 6.21 | No | WebLINX: Real-World Website Navigation with Mult... | 2024-02-08 | Code |
| 17 | GPT-3.5T (Zero-Shot) | 3.45 | No | WebLINX: Real-World Website Navigation with Mult... | 2024-02-08 | Code |