Statistics lessons from a salmon


    Craig Bennett could not believe his eyes. In his hand, he held an image that showed the brain activity of a salmon, where three statistically significant dots shone back at him – a clear sign that they had made a pioneering discovery in the relationship between salmon and humans. Or else the statistics were wrong.

    In the lab where Bennett was working, they were investigating human decision-making with the aid of functional magnetic-resonance imaging (fMRI), a method that scans the brain to measure changes in blood oxygenation levels (1). First, however, the equipment had to be calibrated, and the young academics were challenging each other to come up with amusing things to scan (2). A pumpkin. A plucked chicken. A salmon.

    With the salmon placed in the MR scanner, they did not content themselves with using it only to calibrate the equipment – they included it in the entire experiment. The salmon was shown a series of photographs of people in varying social situations and was then asked to identify the emotions experienced by the people in the photos.

    It was one of the images of the salmon's brain activity during this experiment that Bennett was now looking at. The image showed three red, statistically significant dots (Figure 1). Bennett was amazed – for obvious reasons; the salmon they had used in the experiment was dead.

    'Either we have stumbled onto a rather amazing discovery in terms of post-mortem ichthyological cognition, or there is something a bit off with regard to our uncorrected statistical approach', they concluded (3).

    Thousands

    Doubtful statistically significant results are not unknown in fMRI analysis. Such studies produce images that consist of tens of thousands of small individual dots, called voxels, and the analytical task is to see whether any of these voxels are activated when a test subject performs a given mental task.

    But the process is not without errors. Measurement processes rarely are. A voxel can show activity even when there is none. And even though the probability of error is low in an individual voxel, there are so many voxels that it is not unlikely that some of them will show activity erroneously. This gives rise to problems. Statistical problems.

    With tens of thousands of voxels, and tens of thousands of accompanying statistical tests, in every single image, we run into the problem of multiple comparisons. If we perform enough statistical tests, some are guaranteed to come out positive – even when there is in reality nothing to find. And such false-positive test results are something we would like to avoid.
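    A minimal simulation makes the point concrete. The voxel count and threshold below are illustrative assumptions, not figures from Bennett's study: with a conventional 5 % significance threshold and 10 000 independent tests of pure noise, we should expect on the order of 500 'significant' voxels even though nothing is there.

```python
import random

random.seed(1)

N_VOXELS = 10_000   # illustrative voxel count for one image
ALPHA = 0.05        # conventional per-test significance threshold

# Under the null hypothesis (no real activity anywhere), each test
# still comes out "significant" with probability ALPHA by pure chance.
false_positives = sum(1 for _ in range(N_VOXELS) if random.random() < ALPHA)

print(false_positives)   # roughly ALPHA * N_VOXELS, i.e. about 500
```

    The expected number of false positives grows linearly with the number of tests, which is why an image-wide analysis cannot simply reuse the per-voxel threshold.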

    Multiple testing

    There are various methods to correct for multiple comparisons (4). And when Bennett and his colleagues corrected for multiple comparisons, the salmon images showed nothing.
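    One standard correction – though the source does not say which method Bennett's team applied – is the Bonferroni procedure: divide the significance threshold by the number of tests, so that the chance of even one false positive across the whole image stays low. A sketch, continuing the illustrative simulation above:

```python
import random

random.seed(1)

N_VOXELS = 10_000
ALPHA = 0.05
bonferroni_alpha = ALPHA / N_VOXELS   # far stricter per-voxel threshold

# Simulated p-values for a "dead salmon": every voxel is pure noise,
# so each p-value is uniform on [0, 1].
p_values = [random.random() for _ in range(N_VOXELS)]

uncorrected = sum(p < ALPHA for p in p_values)           # hundreds of false hits
corrected = sum(p < bonferroni_alpha for p in p_values)  # almost certainly none

print(uncorrected, corrected)
```

    The corrected threshold wipes out the spurious hits – which is exactly what happened to the salmon's three red dots.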

    However, such corrections come at a price: loss of statistical power. You may avoid false-positive results, but you will also run the risk of not finding things that are actually there, so-called false negatives. In the fMRI community, there is an ongoing discussion about what is worse: false-positive or false-negative findings.

    Bennett is clear in his views: false-positive findings are more likely to be hyped up by research institutions and the media, and can lead to problems in the longer term. False positives are worse than false negatives.

    Not only salmon

    Multiple comparisons are a challenge not only in fMRI and image analysis. The problem crops up everywhere where many statistical tests are performed simultaneously. In genetics, where many genes are studied in parallel, often in relatively few individuals, correction for multiple comparisons is vigorously debated.

    In clinical trials, the problem is often more hidden. Trials that involve the testing of multiple hypotheses, with many outcome variables in the same individuals, face the same challenge. When enough statistical tests are performed, the chances of finding something increase, and this increased probability of false-positive findings must be dealt with.

    Nobel

    In 2012, Bennett and his colleagues' salmon project won an Ig Nobel Prize – the award given to research projects that first make you laugh, and then make you think.

    The project aimed to highlight the importance of correcting for multiple comparisons, and the project has gradually become famous in scientific circles. Perhaps it has also achieved its desired goal. When the salmon project was first presented at a conference in 2009 (5), 25–40 % of the articles in the field of fMRI reported no correction for multiple comparisons. When Bennett and his colleagues received the Ig Nobel Prize three years later, this number had dropped to 10 % (1).

    But of course, that could be just a coincidence.
