Simpson’s paradox, or the Yule–Simpson effect, is a paradox in probability and statistics in which a trend appears in several different groups of data but disappears or reverses when these groups are combined. It is sometimes given the descriptive name reversal paradox or amalgamation paradox.
This result is often encountered in social-science and medical-science statistics, and is particularly confounding when frequency data are unduly given causal interpretations. The paradoxical elements disappear when causal relations are taken into account. Many statisticians believe that the general public should be informed of counter-intuitive results in statistics such as Simpson’s paradox.
Edward H. Simpson first described this phenomenon in a technical paper in 1951, but the statisticians Karl Pearson et al. (in 1899) and Udny Yule (in 1903) had mentioned similar effects earlier. The name Simpson’s paradox was introduced by Colin R. Blyth in 1972.
Simpson’s paradox for quantitative data: a positive trend appears for two separate groups (blue and red), whereas a negative trend (black, dashed) appears when the groups are combined.
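The quantitative case described in the caption can be reproduced with a small sketch. The data points below are invented purely for illustration (they are not taken from the figure), and the `slope` helper is a hypothetical name for an ordinary least-squares slope. Under these assumptions, each group on its own shows a positive trend, while the pooled data show a negative one.

```python
def slope(points):
    """Ordinary least-squares slope of y on x for a list of (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    return sxy / sxx

# Made-up illustrative data: within each group, y rises with x,
# but the "blue" group sits at low x / high y and the "red" group
# at high x / low y.
blue = [(1, 6), (2, 7), (3, 8), (4, 9)]
red = [(6, 1), (7, 2), (8, 3), (9, 4)]

slope_blue = slope(blue)
slope_red = slope(red)
slope_combined = slope(blue + red)  # pooled data, ignoring group membership

print("blue slope:    ", slope_blue)
print("red slope:     ", slope_red)
print("combined slope:", slope_combined)
```

The reversal arises because group membership is associated with both variables: the offset between the two groups dominates the pooled fit and outweighs the trend inside each group.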