Three groups
If the study includes only three groups, as is often the case, there is a much simpler procedure that does not rely on any assumptions about distribution or group size. First, the global p-value is calculated for the null hypothesis that all three groups are identical. Second, an unadjusted p-value is calculated separately for each of the three pairwise comparisons. Finally, each of these three p-values is adjusted by replacing it with the global p-value if the global p-value is higher. This is illustrated in the example below. This procedure always controls the family-wise error rate (3), but many researchers seem to be unaware of this fact. Even when the data are normally distributed and Tukey's test could be used, this simple method gives a statistical power that is at least as high as that of Tukey's test for three groups (4).
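The adjustment step amounts to taking the maximum of the global p-value and each unadjusted pairwise p-value. A minimal sketch in Python follows; the function name `adjust_three_group_p_values` is a hypothetical helper introduced here for illustration, and the input values are the global and LSD p-values reported in Table 1.

```python
def adjust_three_group_p_values(global_p, unadjusted):
    """Replace each pairwise p-value with the global p-value if the global one is higher."""
    if len(unadjusted) != 3:
        # The procedure described in the text applies to three groups,
        # i.e. exactly three pairwise comparisons.
        raise ValueError("expected exactly three pairwise comparisons")
    return {pair: max(global_p, p) for pair, p in unadjusted.items()}


# Global p-value and unadjusted (LSD) p-values as reported in Table 1.
unadjusted = {"A-B": 0.422, "A-K": 0.038, "B-K": 0.005}
adjusted = adjust_three_group_p_values(0.014, unadjusted)

# Pairs that remain significant at the 5 % level after adjustment.
significant = [pair for pair, p in adjusted.items() if p < 0.05]
print(adjusted)
print(significant)
```

Only the B-K value changes here, because it is the only unadjusted p-value that is lower than the global p-value.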
If the data are normally distributed, we can calculate the global p-value with a one-way analysis of variance and then make the pairwise comparisons with t-tests. If non-parametric methods are used, we can first perform a global Kruskal-Wallis test, followed by pairwise Wilcoxon-Mann-Whitney tests. If the data are categorical, we can first perform Pearson's chi-squared test for all three groups and then Pearson's chi-squared tests for each of the three pairwise comparisons.
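As a rough sketch of the categorical case, the Python example below runs Pearson's chi-squared test on all three groups and then on each pair, using only the standard library. The counts are hypothetical and serve only as an illustration. It assumes a binary outcome, so that the global test has two degrees of freedom and each pairwise test has one; for these two cases the chi-squared tail probabilities have the closed forms used in `p_value_df2` and `p_value_df1`. No continuity correction is applied.

```python
import math
from itertools import combinations


def pearson_chi2(table):
    """Pearson chi-squared statistic for a table given as a list of rows of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat


def p_value_df1(stat):
    """Upper tail of the chi-squared distribution with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2))


def p_value_df2(stat):
    """Upper tail of the chi-squared distribution with 2 degrees of freedom."""
    return math.exp(-stat / 2)


# Hypothetical counts per group: [outcome present, outcome absent].
counts = {"A": [10, 30], "B": [12, 28], "C": [24, 16]}

# Step 1: global test across all three groups (3 x 2 table, 2 degrees of freedom).
global_p = p_value_df2(pearson_chi2(list(counts.values())))

# Steps 2 and 3: pairwise tests (2 x 2 tables), each adjusted by the maximum rule.
adjusted = {}
for g1, g2 in combinations(counts, 2):
    unadjusted_p = p_value_df1(pearson_chi2([counts[g1], counts[g2]]))
    adjusted[(g1, g2)] = max(global_p, unadjusted_p)

print(global_p)
print(adjusted)
```

The same three steps apply to the parametric and non-parametric cases; only the tests used for the global and pairwise p-values change.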
Let us illustrate this with an example. Weider and colleagues compared cognitive function in three groups of people: 41 with anorexia, 40 with bulimia and 40 healthy controls ((5), Table 3). Wechsler's intelligence scale (5) showed an average score (standard deviation) of 10.51 (3.26), 10.00 (2.42) and 11.85 (2.83) in the three groups respectively. The global p-value from the one-way analysis of variance was 0.014. The p-values for the pairwise comparisons obtained with several alternative methods are shown in Table 1. With the method described here, both the anorexia group and the bulimia group differ significantly from the control group at the 5 % significance level. If Tukey's or Dunnett's test had been used, only the difference between the bulimia group and the control group would have been significant.
Table 1
Pairwise comparisons for Wechsler's intelligence scale between persons with anorexia (A), bulimia (B) and healthy control persons (K) (based on data from (5), Table 3). The global p-value from a one-way analysis of variance was 0.014. Unadjusted p-values were estimated by LSD (least significant difference), which is a generalisation of the t-test.
| Pair | Unadjusted p-value (LSD) | Adjusted p-value (Tukey) | Adjusted p-value (Dunnett) | Adjusted p-value (maximum of global and unadjusted) |
|---|---|---|---|---|
| A-B | 0.422 | 0.701 | | 0.422 |
| A-K | 0.038 | 0.094 | 0.069 | 0.038 |
| B-K | 0.005 | 0.013 | 0.009 | 0.014 |
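The global p-value can be approximately recomputed from the summary statistics quoted in the text. The sketch below is a check under stated assumptions, not the original authors' analysis: it uses only the rounded group sizes, means and standard deviations. It also relies on the fact that with three groups the F statistic has two numerator degrees of freedom, in which case its upper tail probability has the closed form (1 + 2F/ν)^(−ν/2), where ν is the within-group degrees of freedom.

```python
# Group size, mean and standard deviation as quoted in the text.
groups = {
    "anorexia": (41, 10.51, 3.26),
    "bulimia": (40, 10.00, 2.42),
    "control": (40, 11.85, 2.83),
}

n_total = sum(n for n, _, _ in groups.values())
grand_mean = sum(n * mean for n, mean, _ in groups.values()) / n_total

# Between-group and within-group sums of squares for a one-way analysis of variance.
ss_between = sum(n * (mean - grand_mean) ** 2 for n, mean, _ in groups.values())
ss_within = sum((n - 1) * sd ** 2 for n, _, sd in groups.values())

df_between = len(groups) - 1       # 2 for three groups
df_within = n_total - len(groups)  # 118 for these group sizes

f_stat = (ss_between / df_between) / (ss_within / df_within)

# Closed-form upper tail of the F distribution; valid only when df_between == 2.
global_p = (1 + 2 * f_stat / df_within) ** (-df_within / 2)

print(round(f_stat, 2), round(global_p, 4))
```

The result should lie close to the reported 0.014; small discrepancies are expected because the inputs are rounded.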