Does an admission test discriminate against women?
Bucerius Law School in Hamburg asked this question. The school has used a law school aptitude test in its selection process for many years and commissioned us to conduct a fairness analysis.
For a comprehensive fairness analysis, we analyzed the anonymized final LL.B. grades of 579 students and assessed fairness according to the models of Cleary (1968) and Lawshe (1983). Both models treat a test as fair if it neither systematically underestimates nor overestimates the academic success of a group: individuals from two groups with the same test scores should, on average, also achieve the same final grades. This can be tested statistically in different ways, e.g. by comparing the regression lines of the groups or by comparing the differences between standardised predictor and criterion scores for the subgroups.
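The comparison of regression lines can be sketched as follows. This is a minimal illustration, not the analysis code used in the study: the data are synthetic, the function names (`ols_rss`, `cleary_check`) and the 0/1 group coding are hypothetical, and the criterion is assumed to be scaled so that higher values mean better performance. The sketch fits one common regression line and one model with group-specific intercept and slope, then compares the two with an F statistic.

```python
import numpy as np

def ols_rss(X, y):
    """Fit ordinary least squares; return coefficients and residual sum of squares."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return beta, float(resid @ resid)

def cleary_check(test, criterion, group):
    """Compare a common regression line with group-specific lines.

    group is a 0/1 indicator (hypothetical coding, e.g. 1 = women).
    Test scores are centered, so 'gap_at_mean' is the difference in
    predicted criterion between the groups at the average test score.
    """
    test = np.asarray(test, dtype=float)
    criterion = np.asarray(criterion, dtype=float)
    group = np.asarray(group, dtype=float)
    n = len(test)
    centered = test - test.mean()
    ones = np.ones(n)

    # Restricted model: one line for everyone.
    X_common = np.column_stack([ones, centered])
    # Full model: group may shift the intercept and change the slope.
    X_full = np.column_stack([ones, centered, group, centered * group])

    _, rss_common = ols_rss(X_common, criterion)
    beta_full, rss_full = ols_rss(X_full, criterion)

    # F statistic for the two added group terms (2 and n - 4 degrees of freedom).
    f_stat = ((rss_common - rss_full) / 2) / (rss_full / (n - 4))
    return {
        "gap_at_mean": float(beta_full[2]),
        "slope_gap": float(beta_full[3]),
        "f_stat": float(f_stat),
    }

# Synthetic example data (an assumption for illustration, not the study data).
rng = np.random.default_rng(0)
n = 600
group = rng.integers(0, 2, n)
test = rng.normal(100, 10, n)
fair_criterion = 0.05 * test + rng.normal(0, 0.5, n)
# Here group 1 performs better than the common line predicts,
# i.e. the test underestimates that group's success.
biased_criterion = fair_criterion + 0.4 * group

fair_result = cleary_check(test, fair_criterion, group)
biased_result = cleary_check(test, biased_criterion, group)
print(fair_result)
print(biased_result)
```

A large F statistic, judged against the F distribution with 2 and n - 4 degrees of freedom, indicates that a single regression line does not describe both groups equally well. A positive `gap_at_mean` for a group means that the common line underestimates that group's criterion performance.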
The result: both the selection test and the A-level grades were fair, and both predicted final grades well. However, we looked not only at the overall results but also at the five modules of the test. Two modules made no significant additional contribution to prediction beyond the other three, and they also underestimated women’s academic success. The effect was slight, but statistically significant in at least one of the two calculations.
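Whether a module adds predictive power beyond the others can be checked by comparing the explained variance of two nested regression models. The following is a rough sketch under stated assumptions: the data are synthetic, the helper names (`r_squared`, `incremental_r2`) are hypothetical, and the two extra modules are constructed to be unrelated to the outcome. It does not reproduce the study's actual calculations.

```python
import numpy as np

def r_squared(X, y):
    """Share of variance in y explained by an OLS model with intercept."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    total = y - y.mean()
    return 1.0 - float(resid @ resid) / float(total @ total)

def incremental_r2(base, extra, y):
    """R^2 of the base modules, of all modules, and the gain from the extra ones."""
    r2_base = r_squared(base, y)
    r2_full = r_squared(np.column_stack([base, extra]), y)
    return r2_base, r2_full, r2_full - r2_base

# Synthetic example data (an assumption for illustration, not the study data).
rng = np.random.default_rng(1)
n = 600
base = rng.normal(size=(n, 3))    # scores on three predictive modules
extra = rng.normal(size=(n, 2))   # scores on two modules with no real effect
y = base @ np.array([0.5, 0.4, 0.3]) + rng.normal(size=n)

r2_base, r2_full, gain = incremental_r2(base, extra, y)
print(f"R^2 base: {r2_base:.3f}, R^2 full: {r2_full:.3f}, gain: {gain:.3f}")
```

A gain close to zero suggests that the extra modules contribute little beyond the base modules. A formal decision would additionally test the gain for significance, for example with an F test on the nested models.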
The results led to a modification of the selection test: we excluded the modules “Conclusions” and “Headings” and replaced them with the module “Language Styles”. We retained the three best modules: “Cases and Norms”, “Circumstantial Evidence” and “Graphs and Tables”.
In my view, this is the best way to establish fairness: use an empirical analysis of fairness and validity to identify the critical components of a test, then improve them.
