Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Conversational Web Navigation on WebLINX

Metric: Intent Match (higher is better)
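
As a rough illustration of how a score like this could be computed, the sketch below treats Intent Match as the percentage of turns where the predicted action has the same intent (for example `click` or `say`) as the reference action. The action string format and the helper names `intent_of` and `intent_match` are assumptions made for this example, not the benchmark's official evaluation code.

```python
from typing import Sequence


def intent_of(action: str) -> str:
    """Return the intent name of an action string such as 'click(uid="a1")'.

    Assumption: the intent is the text before the first opening parenthesis.
    """
    return action.split("(", 1)[0].strip().lower()


def intent_match(predicted: Sequence[str], reference: Sequence[str]) -> float:
    """Percentage of turns whose predicted intent equals the reference intent."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if not predicted:
        return 0.0
    # Each turn scores 1 when the intents agree and 0 otherwise.
    hits = sum(intent_of(p) == intent_of(r) for p, r in zip(predicted, reference))
    return 100.0 * hits / len(predicted)


if __name__ == "__main__":
    preds = ['click(uid="a1")', 'say(utterance="Done")']
    refs = ['click(uid="b7")', 'load(url="https://example.com")']
    print(intent_match(preds, refs))
```

Note that under this reading only the intent is compared, so a `click` on the wrong element still counts as a match.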

Results

| # | Model | Intent Match | Extra Data | Paper | Date | Code |
|---|-------|--------------|------------|-------|------|------|
| 1 | S-LLaMA-2.7B | 84 | No | WebLINX: Real-World Website Navigation with Mult... | 2024-02-08 | Code |
| 2 | S-LLaMA-1.3B | 83.32 | No | WebLINX: Real-World Website Navigation with Mult... | 2024-02-08 | Code |
| 3 | Llama-2-7B | 82.64 | No | WebLINX: Real-World Website Navigation with Mult... | 2024-02-08 | Code |
| 4 | Llama-2-13B | 81.91 | No | WebLINX: Real-World Website Navigation with Mult... | 2024-02-08 | Code |
| 5 | Pix2Act-1.3B | 81.8 | No | WebLINX: Real-World Website Navigation with Mult... | 2024-02-08 | Code |
| 6 | Flan-T5-3B | 81.14 | No | WebLINX: Real-World Website Navigation with Mult... | 2024-02-08 | Code |
| 7 | Fuyu-8B | 80.07 | No | WebLINX: Real-World Website Navigation with Mult... | 2024-02-08 | Code |
| 8 | Flan-T5-780M | 80.02 | No | WebLINX: Real-World Website Navigation with Mult... | 2024-02-08 | Code |
| 9 | MindAct-3B | 79.89 | No | WebLINX: Real-World Website Navigation with Mult... | 2024-02-08 | Code |
| 10 | Pix2Act-282M | 79.71 | No | WebLINX: Real-World Website Navigation with Mult... | 2024-02-08 | Code |
| 11 | Flan-T5-250M | 79.69 | No | WebLINX: Real-World Website Navigation with Mult... | 2024-02-08 | Code |
| 12 | GPT-3.5F | 77.56 | No | WebLINX: Real-World Website Navigation with Mult... | 2024-02-08 | Code |
| 13 | MindAct-780M | 75.87 | No | WebLINX: Real-World Website Navigation with Mult... | 2024-02-08 | Code |
| 14 | MindAct-250M | 74.25 | No | WebLINX: Real-World Website Navigation with Mult... | 2024-02-08 | Code |
| 15 | GPT-3.5T (Zero-Shot) | 42.77 | No | WebLINX: Real-World Website Navigation with Mult... | 2024-02-08 | Code |
| 16 | GPT-4V (Zero-Shot) | 42.36 | No | WebLINX: Real-World Website Navigation with Mult... | 2024-02-08 | Code |
| 17 | GPT-4T (Zero-Shot) | 41.66 | No | WebLINX: Real-World Website Navigation with Mult... | 2024-02-08 | Code |