Fmriprep#
Today, many excellent general-purpose, open-source neuroimaging software packages exist: SPM (Matlab-based), FSL, AFNI, and Freesurfer (with a shell interface). We argue that there is not one single package that is always the best choice for every step in your preprocessing pipeline. Fortunately, people from the Poldrack lab created fmriprep, a software package that offers a preprocessing pipeline which “glues together” functionality from different neuroimaging software packages (such as Freesurfer and FSL), such that each step in the pipeline is executed by the software package that (arguably) does it best.
We have been using Fmriprep to preprocess our own data and we strongly recommend it. It is relatively simple to use, requires minimal user intervention, and creates extensive visual reports that allow you to check whether each step in the pipeline worked as expected. The only requirement to use Fmriprep is that your data is formatted as specified in the Brain Imaging Data Structure (BIDS).
The BIDS-format#
BIDS is a specification on how to format, name, and organize your MRI dataset. It specifies the file format of MRI files (i.e., compressed Nifti: .nii.gz files), lays out rules for how you should name your files (i.e., with "key-value" pairs, such as: sub-01_ses-1_task-1back_run-1_bold.nii.gz), and outlines the file/folder structure of your dataset (where each subject has its own directory with separate subdirectories for different MRI modalities, including fieldmaps, functional, diffusion, and anatomical MRI). Additionally, it specifies a way to include "metadata" about the (MRI) files in your dataset with JSON files: plain-text files with key-value pairs (in the form "parameter: value"). Given that your dataset is BIDS-formatted and contains the necessary metadata, you can use fmriprep on your dataset. (You can use the awesome bids-validator to see whether your dataset is completely valid according to BIDS.)
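To give you an idea of what this looks like in practice, below is a minimal sketch of a BIDS layout for a single subject and of how you could read one of the JSON "sidecar" files in Python (the subject, task, and file names here are just hypothetical examples):

import json

# Hypothetical BIDS layout (names are just examples):
# bids/
#     dataset_description.json
#     sub-01/
#         anat/sub-01_T1w.nii.gz
#         func/sub-01_task-1back_run-1_bold.nii.gz
#         func/sub-01_task-1back_run-1_bold.json

# Read the JSON "sidecar" with metadata for the functional run
with open('bids/sub-01/func/sub-01_task-1back_run-1_bold.json') as f:
    metadata = json.load(f)

print(metadata['RepetitionTime'])  # e.g., the scan's TR in seconds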
There are different tools to convert your “raw” scanner data (e.g., in DICOM or PAR/REC format) to BIDS, including heudiconv, bidscoin, and bidsify (created by Lukas). We’ll skip over this step and assume that you’ll be able to convert your data to BIDS.
Installing Fmriprep#
Now, having your data in BIDS is an important step in getting started with Fmriprep. The next step is installing the package. Technically, Fmriprep is a Python package, so it can be installed as such (using pip install fmriprep), but we do not recommend this "bare metal" installation, because it depends on a host of neuroimaging software packages (including FSL, Freesurfer, AFNI, and ANTs). So if you'd want to directly install Fmriprep, you'd need to install those extra neuroimaging software packages as well (which is not worth your time, trust us).
Fortunately, Fmriprep also offers a "Docker container" in which Fmriprep and all its dependencies are already installed. Docker is software that allows you to create "containers", which are like lightweight "virtual machines" (VM): essentially a separate (Linux-based) operating system with a specific software configuration. You can download the Fmriprep-specific Docker "image", which is like a "recipe", build the Fmriprep-specific "container" according to this "recipe" on your computer, and finally use this container to run Fmriprep on your computer as if all dependencies were actually installed on your computer! Docker is available on Linux, Mac, and Windows. To install Docker, search for something like "install docker for Windows/Mac/Linux" to find an appropriate walkthrough.
Note that you need administrator ("root") privileges on your computer (which is likely the case for your own computer, but not on shared analysis servers) to run Docker. If you don't have root access on your computer/server, ask your administrator/sysadmin to install singularity, which allows you to convert Docker images to Singularity images, which you can run without administrator privileges.
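For example, on a server with Singularity available, converting the Fmriprep Docker image to a Singularity image usually boils down to something like the following (the image name and version are up to you; see the Fmriprep and Singularity documentation for details):

singularity build fmriprep-<version>.simg docker://nipreps/fmriprep:<version>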
Assuming you have installed Docker, you can run the "containerized" Fmriprep from your command line directly, which involves a fairly long and complicated command (i.e., docker run -it --rm -v bids_dir /data ... etc), or by using the fmriprep-docker Python package. This fmriprep-docker package is just a simple wrapper around the appropriate Docker command to run the complicated "containerized" Fmriprep command. We strongly recommend this method.
To install fmriprep-docker, you can use pip (from your command line):
pip install fmriprep-docker
Now, you should have access to the fmriprep-docker command on your command line and you're ready to start preprocessing your dataset. For more detailed information about installing Fmriprep, check out their website.
Running Fmriprep#
Assuming you have Docker and fmriprep-docker installed, you're ready to run Fmriprep. The basic format of the fmriprep-docker command is as follows:
fmriprep-docker <your bids-folder> <your output-folder>
This means that fmriprep-docker has two mandatory positional arguments: the first one being your BIDS-folder (i.e., the path to your folder with BIDS-formatted data), and the second one being the output-folder (i.e., where you want Fmriprep to output the preprocessed data). We recommend setting your output-folder to a subfolder of your BIDS-folder named "derivatives": <your bids-folder>/derivatives.
Then, you can add a bunch of extra “flags” (parameters) to the command to specify the preprocessing pipeline as you like it. We highlight a couple of important ones here, but for the full list of parameters, check out the Fmriprep website.
Freesurfer#
When running Fmriprep from Docker, you don't need to have Freesurfer installed, but you do need a Freesurfer license. You can download this here: https://surfer.nmr.mgh.harvard.edu/fswiki/License. Then, you need to supply the --fs-license-file <path to license file> parameter to your fmriprep-docker command:
fmriprep-docker <your bids-folder> <your output-folder> --fs-license-file /home/lukas/license.txt
Configuring what is preprocessed#
If you just run Fmriprep with the mandatory BIDS-folder and output-folder arguments, it will preprocess everything it finds in the BIDS-folder. Sometimes, however, you may just want to run one (or several) specific participants, or one (or more) specific tasks (e.g., only the MRI files associated with the localizer runs, but not the working memory runs). You can do this by adding the --participant-label and --task-id flags to the command:
fmriprep-docker <your bids-folder> <your output-folder> --participant-label sub-01 --task-id localizer
You can also specify some things to be ignored during preprocessing using the --ignore parameter (like fieldmaps):
fmriprep-docker <your bids-folder> <your output-folder> --ignore fieldmaps
Handling performance#
It's very easy to parallelize the preprocessing pipeline by setting the --nthreads and --omp-nthreads parameters, which refer to the number of threads that Fmriprep is allowed to use. Note that laptops usually have 4 threads available (but analysis servers usually have more!). You can also specify the maximum amount of RAM that Fmriprep is allowed to use with the --mem_mb parameter. So, if you for example want to run Fmriprep with 3 threads and a maximum of 3GB of RAM, you can run:
fmriprep-docker <your bids-folder> <your output-folder> --nthreads 3 --omp-nthreads 3 --mem_mb 3000
In our experience, however, specifying the --mem_mb parameter is rarely necessary if you don't parallelize too much.
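If you're not sure how many threads your machine has, you can quickly check this in Python (just a convenience check, not part of Fmriprep itself):

import os
print(os.cpu_count())  # number of logical CPU threads available on this machine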
Output spaces#
Specifying your "output spaces" (with the --output-spaces flag) tells Fmriprep to which "space(s)" you want your preprocessed data registered. For example, you can specify T1w to have your functional data registered to the participant's T1 scan. You can, instead or in addition, also specify some standard template, like the MNI template (MNI152NLin2009cAsym or MNI152NLin6Asym). You can even specify surface templates if you want (like fsaverage), which will sample your volumetric functional data onto the surface (as computed by Freesurfer). In addition to the specific output space(s), you can add a resolution "modifier" to the parameter to specify in what spatial resolution you want your resampled data to be. Without any resolution modifier, the native resolution of your functional files (e.g., \(3\times3\times3\) mm) will be kept intact. But if you want to upsample your resampled files to 2 mm, you can add YourTemplate:2mm. For example, if you want to use the FSL-style MNI template (MNI152NLin6Asym) resampled at 2 mm, you'd use:
fmriprep-docker <your bids-folder> <your output-folder> --output-spaces MNI152NLin6Asym:2mm
You can of course specify multiple output-spaces:
fmriprep-docker <your bids-folder> <your output-folder> --output-spaces MNI152NLin6Asym:2mm T1w fsaverage
Other parameters#
There are many options that you can set when running Fmriprep. Check out the Fmriprep website (under “Usage”) for a list of all options!
Issues, errors, and troubleshooting#
While Fmriprep often works out-of-the-box (assuming your data are properly BIDS-formatted), it may happen that it crashes or otherwise gives unexpected results. A great place to start looking for help is neurostars.org. This website is dedicated to helping neuroscientists with neuroimaging/neuroscience-related questions. Make sure to check whether your question has been asked here already and, if not, pose it here!
If you encounter Fmriprep-specific bugs, you can also submit an issue at the Github repository of Fmriprep.
Fmriprep output/reports#
After Fmriprep has run, it outputs, for each participant separately, a directory with results (i.e., preprocessed files) and an HTML-file with a summary and figures of the different steps in the preprocessing pipeline.
We ran Fmriprep on a single run/task (flocBLOCKED) from a single subject (sub-03) with the following command:
fmriprep-docker /home/lsnoek1/ni-edu/bids /home/lsnoek1/ni-edu/bids/derivatives --participant-label sub-03 --output-spaces T1w MNI152NLin2009cAsym
We've copied the Fmriprep output for this subject (sub-03) into the fmriprep subdirectory of the week_4 directory. Let's check its contents:
import os
print(os.listdir('bids/derivatives/fmriprep'))
['dataset_description.json', 'desc-aparcaseg_dseg.tsv', 'logs', 'desc-aseg_dseg.tsv', 'sub-03.html', 'sub-03']
As said, Fmriprep outputs a directory with results (sub-03) and an associated HTML-file with a summary of the (intermediate and final) results. Let's check the directory with results first:
from pprint import pprint # pprint stands for "pretty print",
sub_path = os.path.join('bids/derivatives/fmriprep', 'sub-03')
pprint(sorted(os.listdir(sub_path)))
['anat', 'figures', 'func', 'log']
The figures directory contains several figures with the result of different preprocessing stages (like functional → high-res anatomical registration), but these figures are also included in the HTML-file, so we'll leave them for now. The other two directories, anat and func, contain the preprocessed anatomical and functional files, respectively. Let's inspect the anat directory:
anat_path = os.path.join(sub_path, 'anat')
pprint(os.listdir(anat_path))
['sub-03_label-WM_probseg.nii.gz',
'sub-03_space-MNI152NLin2009cAsym_desc-brain_mask.json',
'sub-03_space-MNI152NLin2009cAsym_label-GM_probseg.nii.gz',
'sub-03_desc-preproc_T1w.nii.gz',
'sub-03_space-MNI152NLin2009cAsym_desc-preproc_T1w.json',
'sub-03_from-MNI152NLin2009cAsym_to-T1w_mode-image_xfm.h5',
'sub-03_desc-brain_mask.nii.gz',
'sub-03_desc-brain_mask.json',
'sub-03_label-CSF_probseg.nii.gz',
'sub-03_space-MNI152NLin2009cAsym_desc-brain_mask.nii.gz',
'sub-03_desc-aparcaseg_dseg.nii.gz',
'sub-03_desc-aseg_dseg.nii.gz',
'sub-03_space-MNI152NLin2009cAsym_desc-preproc_T1w.nii.gz',
'sub-03_space-MNI152NLin2009cAsym_label-CSF_probseg.nii.gz',
'sub-03_space-MNI152NLin2009cAsym_label-WM_probseg.nii.gz',
'sub-03_dseg.nii.gz',
'sub-03_from-T1w_to-MNI152NLin2009cAsym_mode-image_xfm.h5',
'sub-03_space-MNI152NLin2009cAsym_dseg.nii.gz',
'sub-03_label-GM_probseg.nii.gz',
'sub-03_desc-preproc_T1w.json']
Here, we see a couple of different files. There are both (preprocessed) nifti images (*.nii.gz) and associated meta-data (plain-text files in JSON format: *.json).
Importantly, the nifti outputs are in two different spaces: one set of files is in the original "T1 space", so without any resampling to another space (these files have the same resolution and orientation as the original T1 anatomical scan). For example, the sub-03_desc-preproc_T1w.nii.gz scan is the preprocessed (i.e., bias-corrected) T1 scan. In addition, most files are also available in MNI152NLin2009cAsym space, a standard template. For example, the sub-03_space-MNI152NLin2009cAsym_desc-preproc_T1w.nii.gz file is the same as sub-03_desc-preproc_T1w.nii.gz, but resampled to the MNI152NLin2009cAsym template. In addition, there are subject-specific brain parcellations (the *aparcaseg_dseg.nii.gz and *aseg_dseg.nii.gz files), files with registration parameters (*from- ... -to ... files), probabilistic tissue segmentation files (*label-{CSF,GM,WM}_probseg.nii.gz), and brain masks (outlining what is brain and what is not, e.g., skull/dura/etc.; *brain_mask.nii.gz).
Again, on the Fmriprep website, you can find more information about the specific outputs.
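If you want to inspect these files programmatically, you can load them with, for example, the nibabel package (assuming you have it installed); a minimal sketch:

import nibabel as nib

# The same (bias-corrected) T1, once in native space and once resampled to the MNI template;
# their shapes (and affines) differ because they live in different spaces
t1_native = nib.load(os.path.join(anat_path, 'sub-03_desc-preproc_T1w.nii.gz'))
t1_mni = nib.load(os.path.join(anat_path, 'sub-03_space-MNI152NLin2009cAsym_desc-preproc_T1w.nii.gz'))
print(t1_native.shape, t1_mni.shape)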
Now, let's check out the func directory:
func_path = os.path.join(sub_path, 'func')
pprint(os.listdir(func_path))
['sub-03_task-flocBLOCKED_space-MNI152NLin2009cAsym_desc-brain_mask.nii.gz',
'sub-03_task-flocBLOCKED_space-T1w_desc-preproc_bold.json',
'sub-03_task-flocBLOCKED_space-T1w_desc-preproc_bold.nii.gz',
'sub-03_task-flocBLOCKED_space-MNI152NLin2009cAsym_desc-preproc_bold.nii.gz',
'sub-03_task-flocBLOCKED_space-MNI152NLin2009cAsym_desc-brain_mask.json',
'sub-03_task-flocBLOCKED_space-MNI152NLin2009cAsym_desc-preproc_bold.json',
'sub-03_task-flocBLOCKED_space-MNI152NLin2009cAsym_boldref.nii.gz',
'sub-03_task-flocBLOCKED_space-T1w_desc-aseg_dseg.nii.gz',
'sub-03_task-flocBLOCKED_space-T1w_desc-aparcaseg_dseg.nii.gz',
'sub-03_task-flocBLOCKED_space-T1w_desc-brain_mask.nii.gz',
'sub-03_task-flocBLOCKED_space-MNI152NLin2009cAsym_desc-aseg_dseg.nii.gz',
'sub-03_task-flocBLOCKED_space-T1w_desc-brain_mask.json',
'sub-03_task-flocBLOCKED_space-T1w_boldref.nii.gz',
'sub-03_task-flocBLOCKED_desc-confounds_regressors.json',
'sub-03_task-flocBLOCKED_desc-confounds_regressors.tsv',
'sub-03_task-flocBLOCKED_space-MNI152NLin2009cAsym_desc-aparcaseg_dseg.nii.gz']
Again, like the files in the anat folder, the functional outputs are available in two spaces: T1w and MNI152NLin2009cAsym. In terms of actual images, there are preprocessed BOLD files (ending in preproc_bold.nii.gz), the functional volume used for "functional → anatomical" registration (ending in boldref.nii.gz), brain parcellations in functional space (ending in dseg.nii.gz), and brain masks (ending in brain_mask.nii.gz). In addition, there are files with "confounds" (ending in confounds_regressors.tsv) which contain variables that you might want to include as nuisance regressors in your first-level analysis. These confound files are spreadsheet-like files (like csv files, but instead of being comma-delimited, they are tab-delimited) and can be easily loaded in Python using the pandas package:
import pandas as pd
conf_path = os.path.join(func_path, 'sub-03_task-flocBLOCKED_desc-confounds_regressors.tsv')
conf = pd.read_csv(conf_path, sep='\t')
conf.head()
csf | csf_derivative1 | csf_derivative1_power2 | csf_power2 | white_matter | white_matter_derivative1 | white_matter_derivative1_power2 | white_matter_power2 | global_signal | global_signal_derivative1 | ... | rot_x_derivative1_power2 | rot_x_power2 | rot_y | rot_y_derivative1 | rot_y_power2 | rot_y_derivative1_power2 | rot_z | rot_z_derivative1 | rot_z_power2 | rot_z_derivative1_power2 | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 47509.215863 | NaN | NaN | 2.257126e+09 | 33060.738398 | NaN | NaN | 1.093012e+09 | 30699.602599 | NaN | ... | NaN | 0.000000e+00 | -0.000149 | NaN | 2.208969e-08 | NaN | 0.000000 | NaN | 0.000000e+00 | NaN |
1 | 47183.542928 | -325.672935 | 106062.860766 | 2.226287e+09 | 33021.566030 | -39.172367 | 1534.474371 | 1.090424e+09 | 30659.901226 | -39.701373 | ... | 3.384482e-09 | 3.384482e-09 | -0.000000 | 0.000149 | 0.000000e+00 | 2.208969e-08 | 0.000000 | 0.000000 | 0.000000e+00 | 0.000000e+00 |
2 | 47116.085111 | -67.457818 | 4550.557152 | 2.219925e+09 | 33110.393546 | 88.827516 | 7890.327596 | 1.096298e+09 | 30679.787487 | 19.886260 | ... | 1.956213e-08 | 3.922024e-08 | -0.000158 | -0.000158 | 2.510704e-08 | 2.510704e-08 | 0.000000 | 0.000000 | 0.000000e+00 | 0.000000e+00 |
3 | 46920.666512 | -195.418599 | 38188.428865 | 2.201549e+09 | 33027.273113 | -83.120433 | 6909.006414 | 1.090801e+09 | 30670.787218 | -9.000269 | ... | 7.606564e-12 | 4.032024e-08 | -0.000202 | -0.000043 | 4.074342e-08 | 1.883386e-09 | 0.000000 | 0.000000 | 0.000000e+00 | 0.000000e+00 |
4 | 46876.911781 | -43.754731 | 1914.476452 | 2.197445e+09 | 33013.589153 | -13.683960 | 187.250764 | 1.089897e+09 | 30636.632976 | -34.154242 | ... | 4.032024e-08 | 4.379051e-47 | -0.000187 | 0.000015 | 3.485764e-08 | 2.294619e-10 | -0.000219 | -0.000219 | 4.783144e-08 | 4.783144e-08 |
5 rows × 263 columns
Confound files from Fmriprep contain a large set of confounds, ranging from motion parameters (rot_x, rot_y, rot_z, trans_x, trans_y, and trans_z) and their derivatives (*derivative1) and squares (*_power2) to the average signal from the brain's white matter and cerebrospinal fluid (CSF), which should contain sources of noise such as respiratory, cardiac, or motion-related signals (but not signal from neural sources, which should be largely constrained to gray matter). For a full list and explanation of Fmriprep's estimated confounds, check their website. Also, check this thread on Neurostars for a discussion on which confounds to include in your analyses.
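For example, if you would only want to use the six realignment (motion) parameters as nuisance regressors, you could select them from the conf dataframe we loaded above (just a sketch; which confounds you should include depends on your analysis):

# Select the six motion parameters from the confounds dataframe
motion_cols = ['trans_x', 'trans_y', 'trans_z', 'rot_x', 'rot_y', 'rot_z']
motion = conf.loc[:, motion_cols]
print(motion.shape)  # number of volumes x 6 motion parameters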
In addition to the actual preprocessed outputs, Fmriprep also provides you with a nice (visual) summary of the different (major) preprocessing steps in an HTML-file, which you'd normally open in any standard browser to view. Here, we load this file for our example participant (sub-03) inside the notebook below. Scroll through it to see which preprocessing steps are highlighted. Note that the images from the HTML-file are not properly rendered in Jupyter notebooks, but you can right-click the image links (e.g., sub-03/figures/sub-03_dseg.svg) and click "Open link in new tab" to view the image.
from IPython.display import IFrame
IFrame(src='./bids/derivatives/fmriprep/sub-03.html', width=700, height=600)