Abstract
While the importance of personalized policymaking is widely recognized, fully personalized implementation remains rare in practice. We study the problem of policy targeting for a regret-averse planner when the training data contain a rich set of observable characteristics but the assignment rule can depend only on a subset of them. Grounded in decision theory, our regret-averse criterion reflects a planner's concern about regret inequality across the population. It generally leads to a fractional optimal rule, because treatment effects remain heterogeneous beyond the average treatment effects conditional on the subset of characteristics. We propose a debiased empirical risk minimization approach to learn the optimal rule from data. Viewing our debiased criterion as a weighted least squares problem, we establish new upper and lower bounds for the excess risk, indicating a convergence rate of 1/n and asymptotic efficiency in certain cases. We apply our approach to the National JTPA Study and the International Stroke Trial.