TIME
TimE: A Multi-level Benchmark for Temporal Reasoning of LLMs in Real-World Scenario
Texts | CC BY | Introduced 2025-05-19
Related Benchmarks
- Time Series Prediction Benchmarks / 2D Semantic Segmentation / 1:3 Accuracy
- TimeBank / Information Extraction / F1 score
- TimeBank / Temporal Information Extraction / F1 score
- TimeBank / Temporal Processing / F1 score
- TimeBank / Temporal Processing / F1-Score
- TimeBankPT / Information Extraction / F1
- TimeBankPT / Temporal Information Extraction / F1
- TimeBankPT / Temporal Processing / F1
- TimeQuestions / Question Answering / P@1
- TimeTravel / Music Auto-Tagging / 0..5sec
- Timers and Such / Dialogue / Accuracy (%)
- Timers and Such / Dialogue Understanding / Accuracy (%)
- Timers and Such / Spoken Language Understanding / Accuracy (%)