Code Lingua

Introduced 2023-08-06

Code Lingua is a benchmark for comparing how well language models understand what code implements in a source language and reproduce the same semantics in a target language. It comprises 1,700 code samples across five programming languages, over 10,000 tests, 43,000 translated code snippets, 1,748 manually labeled bugs, and 1,365 bug-fix pairs.
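
To make "same semantics" concrete, here is a minimal sketch of a test-based check, assuming a translation counts as correct when it passes the same input/output tests as the original program. This is an illustration only, not the benchmark's actual harness: the functions `source_impl`, `translated_impl`, `buggy_translation`, and `passes_tests` and the test cases are all hypothetical, and both "languages" are stood in for by Python.

```python
from typing import Callable, List, Tuple

# Hypothetical test cases: (input, expected output) pairs.
TestCase = Tuple[List[int], int]
TESTS: List[TestCase] = [
    ([1, 2, 3, 4], 6),
    ([], 0),
    ([-2, 5], -2),
]


def source_impl(values: List[int]) -> int:
    """Stand-in for the original program: sum of the even numbers."""
    return sum(v for v in values if v % 2 == 0)


def translated_impl(values: List[int]) -> int:
    """Stand-in for a faithful translation, written in a different style."""
    total = 0
    for v in values:
        if v % 2 == 0:
            total += v
    return total


def buggy_translation(values: List[int]) -> int:
    """Stand-in for a translation with a semantic bug: sums odd numbers."""
    total = 0
    for v in values:
        if v % 2 == 1:
            total += v
    return total


def passes_tests(candidate: Callable[[List[int]], int],
                 tests: List[TestCase]) -> bool:
    """Return True only if the candidate matches every expected output."""
    return all(candidate(inp) == expected for inp, expected in tests)


if __name__ == "__main__":
    for name, fn in [("source", source_impl),
                     ("translated", translated_impl),
                     ("buggy", buggy_translation)]:
        print(name, passes_tests(fn, TESTS))
```

Under this assumption, a translation that compiles and runs but changes behavior, like `buggy_translation`, is caught by the tests rather than by inspecting the code's surface form.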