Let us start with an example. Consider a math teacher and a biology teacher working in a school. Every year, they train students and send them to the school's final examinations. Some students pass; some do not. Every year, each teacher gets a 'percentage of passes' score.
The pass percentages for each teacher over two years are given below.
Year        Math teacher       Biology teacher
Year I      70/80 = 87.5%      20/20 = 100%
Year II     10/20 = 50%        50/80 = 62.5%
Total       80/100 = 80%       70/100 = 70%
Here, the denominator is the number of students who appeared for the examination and the numerator is the number of students who passed. In each year, the biology teacher shows a better pass percentage than the math teacher. But when we look at the total over the two years, the math teacher wins. This is Simpson's paradox.
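The arithmetic in the table can be checked with a short Python sketch. The counts are copied from the table; the names `results`, `pass_rate` and `total_rate` are made up for this illustration.

```python
from fractions import Fraction

# (passed, appeared) for each teacher and year, copied from the table.
results = {
    "math":    {"Year I": (70, 80), "Year II": (10, 20)},
    "biology": {"Year I": (20, 20), "Year II": (50, 80)},
}
years = ("Year I", "Year II")


def pass_rate(passed, appeared):
    """Exact pass rate for one teacher in one year."""
    return Fraction(passed, appeared)


def total_rate(by_year):
    """Pass rate over all years: add up the counts first, then divide."""
    passed = sum(p for p, _ in by_year.values())
    appeared = sum(a for _, a in by_year.values())
    return Fraction(passed, appeared)


yearly = {t: {y: pass_rate(*by_year[y]) for y in years} for t, by_year in results.items()}
total = {t: total_rate(by_year) for t, by_year in results.items()}

# Biology is ahead in every single year...
biology_wins_each_year = all(yearly["biology"][y] > yearly["math"][y] for y in years)
# ...yet math is ahead once the two years are pooled.
math_wins_overall = total["math"] > total["biology"]

for y in years:
    print(y, {t: f"{float(yearly[t][y]):.1%}" for t in results})
print("Total", {t: f"{float(total[t]):.1%}" for t in results})
print("Reversal:", biology_wins_each_year and math_wins_overall)
```

Using `Fraction` keeps the comparisons exact, so the reversal is not a rounding effect.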
Simpson's paradox occurs when groups of data show one particular trend, but this trend is reversed when the groups are combined together.
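The paradox can occur when group sizes (the number of students, here) are different. Look at the denominators in the columns of the table: the math teacher sent 80 students in Year I and only 20 in Year II, while the biology teacher sent 20 and then 80. So the math teacher's total is dominated by the better year, and the biology teacher's total by the weaker year.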
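One way to see the role of group sizes is to treat each teacher's total as a weighted average of the yearly pass rates, with the number of students as the weight. The sketch below is only an illustration: `combined_rate` is a hypothetical helper, and the "equal sizes" case (50 students per year) is an assumption made for comparison, not data from the example.

```python
from fractions import Fraction


def combined_rate(rates, sizes):
    """Overall pass rate as the size-weighted average of per-year rates."""
    return sum(r * s for r, s in zip(rates, sizes)) / sum(sizes)


# Yearly pass rates from the table (Year I, Year II).
math_rates = [Fraction(7, 8), Fraction(1, 2)]      # 87.5%, 50%
biology_rates = [Fraction(1), Fraction(5, 8)]      # 100%, 62.5%

# Actual group sizes from the table: the teachers' sizes are mirrored.
math_actual = combined_rate(math_rates, [80, 20])
biology_actual = combined_rate(biology_rates, [20, 80])

# Hypothetical case: same rates, but 50 students per teacher per year.
math_equal = combined_rate(math_rates, [50, 50])
biology_equal = combined_rate(biology_rates, [50, 50])

print("actual sizes:", float(math_actual), float(biology_actual))
print("equal sizes: ", float(math_equal), float(biology_equal))
```

With the actual sizes the math teacher comes out ahead overall. With equal sizes the biology teacher stays ahead, as in each individual year, so the reversal comes from the unequal group sizes and not from the yearly rates themselves.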
Simpson's paradox should always be kept in mind to avoid misinterpreting statistical data.
For example, the success rates of heart surgery and cataract surgery cannot be compared directly. Many people undergo cataract surgery, and its success rate is high, whereas only a few people opt for heart surgery.
We often hear complaints that women are not treated fairly in admissions, appointments, etc. But in some cases, fewer women apply, and most of those who do apply are selected. Hence, the women's group and the men's group differ in size, and comparing their combined figures may lead to Simpson's paradox. In fact, an American university was reportedly sued over a gender gap that arose from Simpson's paradox.