Description
ENIGMA is an evaluation framework for dialog systems based on Pearson's correlation and Spearman's rank correlation between the estimated rewards and the true rewards. ENIGMA only requires a small amount of pre-collected experience data, and therefore does not involve human interaction with the target policy during evaluation, making automatic evaluation feasible. More importantly, ENIGMA is model-free and agnostic to the behavior policies used to collect the experience data (see details in Section 2), which significantly alleviates the technical difficulty of modeling complex dialogue environments and human behaviors.
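Since ENIGMA judges an estimator by how well its estimated rewards correlate with the true rewards, the two correlation measures can be sketched in pure Python. This is an illustrative sketch, not code from the paper; the reward values are made up, and the simple ranking helper does not handle ties.

```python
from statistics import mean

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman's rho: Pearson correlation applied to the ranks.

    Note: this simple ranking does not average ranks for ties.
    """
    def ranks(v):
        order = sorted(range(len(v)), key=lambda i: v[i])
        r = [0.0] * len(v)
        for rank, i in enumerate(order, start=1):
            r[i] = float(rank)
        return r
    return pearson(ranks(x), ranks(y))

# Illustrative values: true (e.g. human-judged) rewards for five target
# policies vs. their off-policy estimates.
true_rewards = [0.2, 0.5, 0.9, 0.4, 0.7]
estimated_rewards = [0.25, 0.45, 0.85, 0.5, 0.65]

print(pearson(true_rewards, estimated_rewards))
print(spearman(true_rewards, estimated_rewards))
```

A higher correlation means the estimator preserves the true ranking of target policies, which is what matters when using the estimates to select among dialog systems.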
Papers Using This Method
- MizAR 60 for Mizar 50 (2023-03-12)
- Multi-site benchmark classification of major depressive disorder using machine learning on cortical and subcortical measures (2022-06-16)
- The Isabelle ENIGMA (2022-05-04)
- Learning Theorem Proving Components (2021-07-21)
- Fast and Slow Enigmas and Parental Guidance (2021-07-14)
- Improving ENIGMA-Style Clause Selection While Learning From History (2021-02-26)
- Towards Automatic Evaluation of Dialog Systems: A Model-Free Off-Policy Evaluation Approach (2021-02-20)