Fairness Metrics

Several metrics have been proposed in the literature to assess the fairness of ML models. Here we recall some of the most widely used ones.

Disparate Impact (DI) is rooted in the desire for different sensitive demographic groups to experience similar rates of positive decision outcomes ( $\hat{y} = pos$ ), where $\hat{y}$ denotes the class predicted by the ML model. It compares two groups of the population defined by a sensitive feature $D$: the privileged ( $priv$ ) and the unprivileged ( $unp$ ) groups. For instance, if race is taken as the sensitive feature, white people can be designated the privileged group and non-white people the unprivileged group. DI is the ratio of the positive prediction rates of the two groups, so a value of 1 indicates parity.

$DI=\frac{P(\hat{y}=pos|D=unp)}{P(\hat{y}=pos|D=priv)}$
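As a minimal sketch of how this ratio could be estimated from data, the snippet below replaces each probability with a sample proportion. The function names, the group labels `"unp"` and `"priv"`, and the toy predictions are illustrative assumptions, not part of the original text.

```python
import numpy as np

def positive_rate(y_pred, group, value, pos=1):
    """Estimate P(y_hat = pos | D = value) as a sample proportion."""
    y_pred = np.asarray(y_pred)
    mask = np.asarray(group) == value
    if not mask.any():
        raise ValueError(f"no samples for group {value!r}")
    return float(np.mean(y_pred[mask] == pos))

def disparate_impact(y_pred, group, unp="unp", priv="priv", pos=1):
    """Ratio of positive-prediction rates: unprivileged over privileged."""
    rate_priv = positive_rate(y_pred, group, priv, pos)
    if rate_priv == 0:
        raise ZeroDivisionError("privileged group has no positive predictions")
    return positive_rate(y_pred, group, unp, pos) / rate_priv

# Hypothetical toy data: 4 unprivileged and 4 privileged individuals.
y_pred = [1, 0, 0, 0, 1, 1, 1, 0]
group = ["unp"] * 4 + ["priv"] * 4
di = disparate_impact(y_pred, group)  # 0.25 / 0.75
```

In this toy case the unprivileged group receives the positive outcome a third as often as the privileged group.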

Equal Opportunity (EO) requires different sensitive groups to achieve similar true positive rates, that is, actual positives should be correctly identified at similar rates in each group. It is computed as the difference in recall ( $\frac{TP_i}{TP_i+FN_i}$ , where $TP_i$ is the number of true positives and $FN_i$ the number of false negatives for a particular group $i$ ) between the unprivileged and privileged groups.

$EO=\frac{TP_{unp}}{TP_{unp}+FN_{unp}}-\frac{TP_{priv}}{TP_{priv}+FN_{priv}}$
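The recall difference can be sketched as follows, assuming binary labels and a group-membership array. The helper names and the toy data are hypothetical and serve only to illustrate the formula.

```python
import numpy as np

def recall(y_true, y_pred, pos=1):
    """TP / (TP + FN) for one group."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_true == pos) & (y_pred == pos))
    fn = np.sum((y_true == pos) & (y_pred != pos))
    if tp + fn == 0:
        raise ValueError("group has no actual positives")
    return float(tp / (tp + fn))

def equal_opportunity(y_true, y_pred, group, unp="unp", priv="priv", pos=1):
    """Recall of the unprivileged group minus recall of the privileged group."""
    y_true, y_pred, group = map(np.asarray, (y_true, y_pred, group))
    m_unp, m_priv = group == unp, group == priv
    return (recall(y_true[m_unp], y_pred[m_unp], pos)
            - recall(y_true[m_priv], y_pred[m_priv], pos))

# Hypothetical toy data: first four rows unprivileged, last four privileged.
y_true = [1, 1, 0, 0, 1, 1, 1, 0]
y_pred = [1, 0, 0, 1, 1, 1, 1, 0]
group = ["unp"] * 4 + ["priv"] * 4
eo = equal_opportunity(y_true, y_pred, group)  # 0.5 - 1.0
```

A negative value means the model recovers fewer of the actual positives in the unprivileged group.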

Demographic Parity (DP) is the difference in predicted positive rates between the unprivileged and privileged groups.

$DP=P(\hat{y}=pos|D=unp)-P(\hat{y}=pos|D=priv)$
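As a worked illustration with hypothetical rates, suppose 30% of the unprivileged group and 50% of the privileged group receive a positive prediction. Then

$DP = 0.30 - 0.50 = -0.20$

whereas the ratio of the same two rates gives $DI = 0.30 / 0.50 = 0.60$. Both metrics are built from the same quantities: parity corresponds to $DP = 0$ and to $DI = 1$.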

Equal Accuracy (EA) is the difference in accuracy ( $\frac{TP_i+TN_i}{P_i+N_i}$ , where $TN_i$ is the number of true negatives and $P_i$ and $N_i$ are the numbers of actual positive and negative instances in a particular group $i$ ) between the unprivileged and privileged groups.

$EA=\frac{TP_{unp}+TN_{unp}}{P_{unp}+N_{unp}}-\frac{TP_{priv}+TN_{priv}}{P_{priv}+N_{priv}}$
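A minimal sketch of this difference, assuming label and group arrays of equal length: since $TP_i + TN_i$ is the number of correct predictions in group $i$ and $P_i + N_i$ is the group size, per-group accuracy reduces to the mean of a correctness indicator. The function name and toy data are illustrative assumptions.

```python
import numpy as np

def equal_accuracy(y_true, y_pred, group, unp="unp", priv="priv"):
    """Accuracy of the unprivileged group minus accuracy of the privileged group."""
    y_true, y_pred, group = map(np.asarray, (y_true, y_pred, group))
    correct = y_true == y_pred  # True for every TP or TN
    m_unp, m_priv = group == unp, group == priv
    if not m_unp.any() or not m_priv.any():
        raise ValueError("both groups must contain at least one sample")
    return float(correct[m_unp].mean() - correct[m_priv].mean())

# Hypothetical toy data: first four rows unprivileged, last four privileged.
y_true = [1, 0, 0, 1, 1, 0, 1, 0]
y_pred = [1, 0, 1, 1, 1, 1, 0, 0]
group = ["unp"] * 4 + ["priv"] * 4
ea = equal_accuracy(y_true, y_pred, group)  # 0.75 - 0.50
```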

Predictive Equality (PE) is defined as the difference in false positive rates ( $\frac{FP_i}{FP_i+TN_i}$ , where $FP_i$ is the number of false positives and $TN_i$ the number of true negatives for a particular group $i$ ) between the unprivileged and privileged groups.

$PE=\frac{FP_{unp}}{FP_{unp}+TN_{unp}}-\frac{FP_{priv}}{FP_{priv}+TN_{priv}}$
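The sketch below computes the difference in false positive rates, taking the false positive rate in its standard form, false positives over all actual negatives ( $\frac{FP_i}{FP_i+TN_i}$ ). The function names and the toy data are hypothetical.

```python
import numpy as np

def false_positive_rate(y_true, y_pred, pos=1):
    """FP / (FP + TN): share of actual negatives predicted as positive."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    fp = np.sum((y_true != pos) & (y_pred == pos))
    tn = np.sum((y_true != pos) & (y_pred != pos))
    if fp + tn == 0:
        raise ValueError("group has no actual negatives")
    return float(fp / (fp + tn))

def predictive_equality(y_true, y_pred, group, unp="unp", priv="priv", pos=1):
    """FPR of the unprivileged group minus FPR of the privileged group."""
    y_true, y_pred, group = map(np.asarray, (y_true, y_pred, group))
    m_unp, m_priv = group == unp, group == priv
    return (false_positive_rate(y_true[m_unp], y_pred[m_unp], pos)
            - false_positive_rate(y_true[m_priv], y_pred[m_priv], pos))

# Hypothetical toy data: first four rows unprivileged, last four privileged.
y_true = [0, 0, 0, 1, 0, 0, 1, 1]
y_pred = [1, 1, 0, 1, 1, 0, 1, 0]
group = ["unp"] * 4 + ["priv"] * 4
pe = predictive_equality(y_true, y_pred, group)  # 2/3 - 1/2
```

A positive value indicates that actual negatives in the unprivileged group are wrongly assigned the positive class more often than those in the privileged group.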