Large Language Model on Noun Phrase Canonicalization

Metric: Base Dataset (higher is better)

LeaderboardDataset