
fMRI images of a brain during a memory task (not from the current study). Credit: Walter Reed National Military Medical Center
It is no exaggeration to say that functional MRI has revolutionized the field of neuroscience. Neuroscientists use MRI machines to pick up on changes in blood flow that occur when different parts of the brain become more or less active. This allows them to non-invasively figure out which parts of the brain are used in performing various tasks, from playing economic games to reading words.
But the approach and its users have had their fair share of critics, including some who are concerned about overhyped claims about our ability to read minds. Others point out that improper analysis of fMRI data can produce misleading results, such as finding areas of brain activity in a dead salmon. While that was the result of poor statistical techniques, a new study in PNAS suggests the problem runs considerably deeper, with some of the basic algorithms involved in fMRI analysis producing false positive “signals” at an alarming rate.
The principle behind fMRI is quite simple: neural activity costs energy, which must then be replenished. This means increased blood flow to areas that have been active recently. That blood flow can be picked up using a high-resolution MRI machine, allowing researchers to identify structures in the brain that become active when certain tasks are performed.
But the actual implementation is rather complex. The imaging divides the brain into small volume units called voxels and records activity in each of them individually. Since the voxels are so small, genuine activity will span groups of them, so the analysis software looks for clusters (groups of adjacent voxels that behave similarly). The dead salmon results arose because this software wasn't configured by default to correct for the sheer number of voxels imaged by today's MRI machines. At 95 percent confidence, one voxel in 20 will cross the statistical threshold by pure chance; across a scan with tens of thousands of voxels, that makes thousands of false positives inevitable.
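To see the scale of the problem, here is a minimal simulation of that multiple-comparisons issue. The voxel count and threshold below are illustrative assumptions, not figures from the study:

```python
import numpy as np

rng = np.random.default_rng(0)

# A pure-noise "scan": no real signal in any voxel.
n_voxels = 100_000          # assumed voxel count, roughly whole-brain scale
noise = rng.standard_normal(n_voxels)

# Test each voxel independently at 95% confidence (p < 0.05, one-sided).
threshold = 1.645           # z value for p = 0.05, one-sided
false_positives = np.sum(noise > threshold)

print(f"'Active' voxels in pure noise: {false_positives}")
# Roughly 5% of 100,000 voxels (~5,000) exceed the threshold by chance,
# which is why uncorrected voxelwise tests can find "activity" anywhere,
# even in a dead salmon.
```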
The new work, conducted by a group of Swedish researchers, suggests the software has other problems as well. The researchers took advantage of a recent trend of opening up data to anyone to use or analyze. They were able to download hundreds of fMRI scans used in other studies to perform their analysis.
Their focus was on resting brain scans, which are often used as controls in studies of specific activity. While any individual's resting brain may be doing something specific (such as moving a leg or thinking about food), there shouldn't be a consistent, systematic signal across a population of people being scanned.
The authors started with a large collection of what were essentially controls, randomly selected a few of them to serve as controls again, and randomly selected others to form an "experimental" population. They repeated this process thousands of times, feeding the data into each of three widely used analysis packages (SPM, FSL, and AFNI). The whole procedure was then rerun with slightly different parameters to see how they affected the outcome.
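In outline, that resampling procedure looks something like the sketch below. This is a minimal stand-in, not the paper's pipeline: simulated noise maps replace the real scans, a plain voxelwise t-test replaces the packages' full group analyses, and the subject counts, group sizes, and thresholds are assumptions chosen for illustration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Stand-in for the open resting-state data: one noise map per subject.
# (The real study used hundreds of downloaded resting-state scans.)
n_subjects, n_voxels = 198, 2_000
subject_maps = rng.standard_normal((n_subjects, n_voxels))

n_iterations = 1_000        # the paper repeated its splits thousands of times
group_size = 20
analyses_with_findings = 0

for _ in range(n_iterations):
    # Randomly relabel subjects as "control" vs. "experimental".
    idx = rng.permutation(n_subjects)
    control = subject_maps[idx[:group_size]]
    experimental = subject_maps[idx[group_size:2 * group_size]]

    # Voxelwise two-sample t-test between the two (identical-by-design) groups.
    _, p = stats.ttest_ind(control, experimental, axis=0)

    # Declare a "finding" if any voxel survives an uncorrected 5% threshold.
    if (p < 0.05).any():
        analyses_with_findings += 1

print(f"Analyses reporting at least one finding: "
      f"{100 * analyses_with_findings / n_iterations:.0f}%")
# With no true group difference, nearly every analysis "finds" something
# unless the threshold is corrected for the number of voxels tested.
```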
The results weren’t good news for fMRI users. “In summary,” the authors conclude, “we find that all three packages have conservative voxelwise inference and invalid clusterwise inference.” In other words, while the packages are, if anything, overly cautious when determining whether an individual voxel shows activity, their cluster identification algorithms often assign activity to a region when none is likely to be present. How often? Up to 70 percent of the time, depending on the algorithm and parameters used.
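To make the clusterwise step concrete, here is a rough sketch of cluster-extent thresholding applied to smoothed noise. The smoothing level, cluster-defining threshold, and extent cutoff below are hypothetical values for illustration; the packages derive the cutoff from parametric assumptions about the data’s smoothness, and that calibration is what the paper found to be off.

```python
import numpy as np
from scipy import ndimage

rng = np.random.default_rng(7)

# A smoothed 3D noise volume: spatial smoothing makes neighboring voxels
# correlated, so chance suprathreshold voxels come in clumps, not singletons.
volume = ndimage.gaussian_filter(rng.standard_normal((40, 40, 40)), sigma=2)
volume /= volume.std()           # renormalize to unit variance after smoothing

# Step 1: voxelwise cluster-defining threshold (roughly p < 0.01, one-sided).
suprathreshold = volume > 2.3

# Step 2: group adjacent suprathreshold voxels into clusters.
labels, n_clusters = ndimage.label(suprathreshold)
sizes = ndimage.sum(suprathreshold, labels, index=range(1, n_clusters + 1))

# Step 3: keep only clusters larger than an extent cutoff. If the cutoff is
# calibrated from the wrong noise model, too many chance clusters survive.
extent_cutoff = 30               # hypothetical value for illustration
significant = [int(s) for s in sizes if s >= extent_cutoff]

print(f"{n_clusters} chance clusters, "
      f"{len(significant)} passing the extent cutoff: {significant}")
```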
For the record, a bug turned up during these tests that had been lurking in one package’s code (AFNI’s) for 15 years. The fix reduced false positives by more than 10 percent. While it’s good that the bug has been caught, it’s a shame that so many studies were published using the erroneous version.
The authors also found that some brain regions were more likely to have problems with false positives, possibly because of assumptions the algorithms make about underlying brain morphology.
Is this really as bad as it sounds? The authors think so. “This calls into question the validity of numerous published fMRI studies based on parametric cluster-wise inferences.” It’s not clear exactly how many studies that covers, but it’s likely to be a significant fraction of the total number of fMRI studies published, which the authors estimate at 40,000.
The authors note that with current open data practices, it would be easy for anyone to go back and reanalyze the original work with the new caveat in mind. But most of the data behind the already published literature is not available, so there really isn’t much to do here except to be extra careful in the future.
PNAS, 2016. DOI: 10.1073/pnas.1602413113 (About DOIs).