opencl-llmperf

Tabular · Texts · CC BY · Introduced 2024-08-19

A collection of datasets and benchmarks for large-scale Performance Modeling with LLMs.

This collection includes the following datasets:

  1. github-200K: the first version of the dataset, containing execution times for ~1,300 kernels, with input size - global size correlation and imbalanced data.
  2. github-350k: the second version, continued from the first, which fixes the imbalanced data problem.
  3. github-400k: the third version, continued from the second, which relaxes the input size - global size correlation.
  4. github-600K: the fourth version, containing execution times for ~6,000 kernels, with input size - global size correlation and balanced data.
  5. benchmark-[]: benchmarks of LLM performance on the Performance Modeling task.