Several metrics have been proposed in the literature to assess the fairness of ML models. Here we recall some of the most widely used ones.
Disparate Impact (DI) is rooted in the desire for different sensitive demographic groups to experience similar rates of positive decision outcomes ($\hat{y} = 1$). Given the ML model, $\hat{y}$ represents the predicted class. DI compares two groups of the population based on a sensitive feature: the privileged ($\text{priv}$) and the unprivileged ($\text{unp}$) groups. For instance, if race is the sensitive feature, white people can be assigned to the privileged group and non-white people to the unprivileged group. DI is commonly computed as the ratio of the two positive prediction rates, $P(\hat{y} = 1 \mid \text{unp}) / P(\hat{y} = 1 \mid \text{priv})$.
Equal Opportunity (EO) requires different sensitive groups to achieve similar error rates in decision outcomes. It is computed as the difference in recall scores ($\frac{TP_g}{TP_g + FN_g}$, where $TP_g$ is the number of true positives and $FN_g$ is the number of false negatives for a particular group $g$) between the unprivileged and privileged groups.
Demographic Parity (DP) is the difference in predicted positive rates between the unprivileged and privileged groups.
Equal Accuracy (EA) is the difference in accuracy scores ($\frac{TP_g + TN_g}{TP_g + TN_g + FP_g + FN_g}$, where $TN_g$ is the number of true negatives for a particular group $g$) between the unprivileged and privileged groups.
Predictive Equality (PE) is defined as the difference in false positive rates ($\frac{FP_g}{FP_g + TN_g}$, where $FP_g$ is the number of false positives for a particular group $g$) between the unprivileged and privileged groups.
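To make the five definitions concrete, the following is a minimal sketch of how they could be computed from binary labels and predictions. It rests on several assumptions that are not fixed by the text: the positive class is encoded as 1, the group encoding (0 for unprivileged, 1 for privileged) is arbitrary, differences are taken as unprivileged minus privileged, and DI is taken as the ratio of positive prediction rates. The helper names (`GroupCounts`, `group_counts`, `fairness_metrics`) are hypothetical, and the sketch does not guard against empty groups or zero denominators.

```python
from dataclasses import dataclass


@dataclass
class GroupCounts:
    """Confusion-matrix counts for one group (hypothetical helper)."""
    tp: int
    fp: int
    tn: int
    fn: int


def group_counts(y_true, y_pred, group, value):
    """Count TP/FP/TN/FN over the samples whose group label equals `value`."""
    tp = fp = tn = fn = 0
    for t, p, g in zip(y_true, y_pred, group):
        if g != value:
            continue
        if p == 1 and t == 1:
            tp += 1
        elif p == 1 and t == 0:
            fp += 1
        elif p == 0 and t == 0:
            tn += 1
        else:
            fn += 1
    return GroupCounts(tp, fp, tn, fn)


def positive_rate(c):
    """Fraction of the group that receives a positive prediction."""
    return (c.tp + c.fp) / (c.tp + c.fp + c.tn + c.fn)


def recall(c):
    return c.tp / (c.tp + c.fn)


def accuracy(c):
    return (c.tp + c.tn) / (c.tp + c.fp + c.tn + c.fn)


def false_positive_rate(c):
    return c.fp / (c.fp + c.tn)


def fairness_metrics(y_true, y_pred, group, unprivileged=0, privileged=1):
    """Return DI, DP, EO, EA and PE.

    Assumption: differences are unprivileged minus privileged, and DI is the
    unprivileged positive rate divided by the privileged positive rate.
    """
    u = group_counts(y_true, y_pred, group, unprivileged)
    p = group_counts(y_true, y_pred, group, privileged)
    return {
        "DI": positive_rate(u) / positive_rate(p),
        "DP": positive_rate(u) - positive_rate(p),
        "EO": recall(u) - recall(p),
        "EA": accuracy(u) - accuracy(p),
        "PE": false_positive_rate(u) - false_positive_rate(p),
    }


# Toy data (made up for illustration): four samples per group.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
group = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = privileged, 0 = unprivileged

metrics = fairness_metrics(y_true, y_pred, group)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Under these conventions, a DI of 1 and a value of 0 for the four difference-based metrics indicate that the two groups are treated identically with respect to the corresponding rate. In the toy data the two groups have the same accuracy (EA is 0) but differ on the other four metrics, which illustrates that the metrics capture different notions of fairness.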