Fmriprep#

Today, many excellent general-purpose, open-source neuroimaging software packages exist: SPM (Matlab-based), FSL, AFNI, and Freesurfer (with a shell interface). We argue that there is not one single package that is always the best choice for every step in your preprocessing pipeline. Fortunately, people from the Poldrack lab created fmriprep, a software package that offers a preprocessing pipeline which “glues together” functionality from different neuroimaging software packages (such as Freesurfer and FSL), such that each step in the pipeline is executed by the software package that (arguably) does it best.

We have been using Fmriprep for preprocessing of our own data and we strongly recommend it. It is relatively simple to use, requires minimal user intervention, and creates extensive visual reports for users to do visual quality control (to check whether each step in the pipeline worked as expected). The only requirement to use Fmriprep is that your data is formatted as specified in the Brain Imaging Data Structure (BIDS).

The BIDS-format#

BIDS is a specification on how to format, name, and organize your MRI dataset. It specifies the file format of MRI files (i.e., compressed Nifti: .nii.gz files), lays out rules for how you should name your files (i.e., with “key-value” pairs, such as: sub-01_ses-1_task-1back_run-1_bold.nii.gz), and outlines the file/folder structure of your dataset (where each subject has its own directory with separate subdirectories for different MRI modalities, including fieldmaps, functional, diffusion, and anatomical MRI). Additionally, it specifies a way to include “metadata” about the (MRI) files in your dataset with JSON files: plain-text files with key-value pairs (in the form “parameter: value”). Given that your dataset is BIDS-formatted and contains the necessary metadata, you can use fmriprep on your dataset. (You can use the awesome bids-validator to see whether your dataset is completely valid according to BIDS.)
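To make the “key-value” naming rule concrete, here is a minimal sketch that splits a BIDS-style filename into its parts. The function name parse_bids_filename is our own (it is not part of any BIDS tool), and the sketch assumes a simple filename in which every part except the last is a key-value pair.

```python
def parse_bids_filename(filename):
    """Split a BIDS-style filename into key-value pairs, suffix, and extension."""
    # Everything after the first dot is treated as the extension (e.g., "nii.gz")
    stem, _, extension = filename.partition(".")
    # Parts are separated by underscores; the last part is the suffix (e.g., "bold")
    *pairs, suffix = stem.split("_")
    # Each remaining part has the form "key-value"
    entities = dict(pair.split("-", 1) for pair in pairs)
    return entities, suffix, extension


entities, suffix, extension = parse_bids_filename(
    "sub-01_ses-1_task-1back_run-1_bold.nii.gz"
)
print(entities)
print(suffix, extension)
```

For real datasets, a dedicated library is a better choice than this hand-written parser, which is only meant to illustrate the naming scheme.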

There are different tools to convert your “raw” scanner data (e.g., in DICOM or PAR/REC format) to BIDS, including heudiconv, bidscoin, and bidsify (created by Lukas). We’ll skip over this step and assume that you’ll be able to convert your data to BIDS.

Installing Fmriprep#

Now, having your data in BIDS is an important step in getting started with Fmriprep. The next step is installing the package. Technically, Fmriprep is a Python package, so it can be installed as such (using pip install fmriprep), but we do not recommend this “bare metal” installation, because it depends on a host of neuroimaging software packages (including FSL, Freesurfer, AFNI, and ANTs). So if you wanted to install Fmriprep directly, you’d need to install those extra neuroimaging software packages as well (which is not worth your time, trust us).

Fortunately, Fmriprep also offers a “Docker container” in which Fmriprep and all the associated dependencies are already installed. Docker is software that allows you to create “containers”, which resemble lightweight “virtual machines” (VMs): each one acts like a separate (Linux-based) operating system with a specific software configuration. You can download the Fmriprep-specific Docker “image”, which is like a “recipe”, build the Fmriprep-specific “container” according to this “recipe” on your computer, and finally use this container to run Fmriprep on your computer as if all dependencies were actually installed on your computer! Docker is available on Linux, Mac, and Windows. To install Docker, google something like “install docker for {Windows,Mac,Linux}” to find a good walkthrough.

Note that you need administrator (“root”) privilege on your computer (which is likely the case for your own computer, but not on shared analysis servers) to run Docker. If you don’t have root access on your computer/server, ask your administrator/sysadmin to install Singularity, which allows you to convert Docker images to Singularity images, which you can run without administrator privileges.

Assuming you have installed Docker, you can run the “containerized” Fmriprep from your command line directly, which involves a fairly long and complicated command (i.e., docker run -it --rm -v bids_dir /data ... etc), or using the fmriprep-docker Python package. This fmriprep-docker package is just a simple wrapper around the appropriate Docker command to run the complicated “containerized” Fmriprep command. We strongly recommend this method.

To install fmriprep-docker, you can use pip (from your command line):

pip install fmriprep-docker

Now, you should have access to the fmriprep-docker command on your command line and you’re ready to start preprocessing your dataset. For more detailed information about installing Fmriprep, check out their website.

Running Fmriprep#

Assuming you have Docker and fmriprep-docker installed, you’re ready to run Fmriprep. The basic format of the fmriprep-docker command is as follows:

fmriprep-docker <your bids-folder> <your output-folder> 

This means that fmriprep-docker has two mandatory positional arguments: the first one being your BIDS-folder (i.e., the path to your folder with BIDS-formatted data), and the second one being the output-folder (i.e., where you want Fmriprep to output the preprocessed data). We recommend setting your output-folder to a subfolder of your BIDS-folder named “derivatives”: <your bids-folder>/derivatives.
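If you prefer to launch Fmriprep from a Python script, the two positional arguments can be assembled as a list. The sketch below only builds and prints the command; the helper name basic_fmriprep_command and the example path are hypothetical.

```python
import shlex
from pathlib import Path


def basic_fmriprep_command(bids_dir):
    """Build the minimal fmriprep-docker command as a list of arguments."""
    bids_dir = Path(bids_dir)
    # Recommended output location: a "derivatives" subfolder of the BIDS folder
    output_dir = bids_dir / "derivatives"
    return ["fmriprep-docker", str(bids_dir), str(output_dir)]


# Hypothetical path, replace with the location of your own dataset
command = basic_fmriprep_command("/data/my_study/bids")
print(shlex.join(command))
```

To actually start the run, you could pass the list to subprocess.run(command, check=True), which assumes that fmriprep-docker is available on your command line.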

Then, you can add a bunch of extra “flags” (parameters) to the command to specify the preprocessing pipeline as you like it. We highlight a couple of important ones here, but for the full list of parameters, check out the Fmriprep website.

Freesurfer#

When running Fmriprep from Docker, you don’t need to have Freesurfer installed, but you do need a Freesurfer license. You can download this here: https://surfer.nmr.mgh.harvard.edu/fswiki/License. Then, you need to supply the --fs-license-file <path to license file> parameter to your fmriprep-docker command:

fmriprep-docker <your bids-folder> <your output-folder> --fs-license-file /home/lukas/license.txt

Configuring what is preprocessed#

If you just run Fmriprep with the mandatory BIDS-folder and output-folder arguments, it will preprocess everything it finds in the BIDS-folder. Sometimes, however, you may just want to run one (or several) specific participants, or one (or more) specific tasks (e.g., only the MRI files associated with the localizer runs, but not the working memory runs). You can do this by adding the --participant-label and --task-id flags to the command:

fmriprep-docker <your bids-folder> <your output-folder> --participant-label sub-01 --task-id localizer

You can also specify some things to be ignored during preprocessing using the --ignore parameter (like fieldmaps):

fmriprep-docker <your bids-folder> <your output-folder> --ignore fieldmaps
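If you want to process participants one at a time (for example, to restart a single participant after a crash), you can generate one command per participant. This is a rough sketch: the helper names are our own, the demo folder is a throwaway directory, and we assume the flags are spelled --participant-label and --task-id, as listed in Fmriprep’s usage documentation.

```python
import tempfile
from pathlib import Path


def list_participants(bids_dir):
    """Return the sorted participant folder names (e.g., 'sub-01') in a BIDS folder."""
    return sorted(p.name for p in Path(bids_dir).glob("sub-*") if p.is_dir())


def commands_per_participant(bids_dir, output_dir, task=None):
    """Build one fmriprep-docker command (as a list) for each participant."""
    commands = []
    for participant in list_participants(bids_dir):
        cmd = ["fmriprep-docker", str(bids_dir), str(output_dir),
               "--participant-label", participant]
        if task is not None:
            # Restrict preprocessing to a single task
            cmd += ["--task-id", task]
        commands.append(cmd)
    return commands


# Demo on a temporary, empty "dataset" with two participant folders
with tempfile.TemporaryDirectory() as tmp:
    for name in ("sub-01", "sub-02"):
        (Path(tmp) / name).mkdir()
    demo_commands = commands_per_participant(
        tmp, Path(tmp) / "derivatives", task="localizer"
    )

for cmd in demo_commands:
    print(" ".join(cmd))
```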

Handling performance#

It’s very easy to parallelize the preprocessing pipeline by setting the --nthreads and --omp-nthreads parameters, which control the number of threads Fmriprep may use (--nthreads caps the total number of threads across all processes, whereas --omp-nthreads caps the number of threads per process). Note that laptops usually have 4 threads available (but analysis servers usually have more!). You can also specify the maximum amount of RAM that Fmriprep is allowed to use with the --mem_mb parameter. So, if you want to run Fmriprep with, for example, 3 threads and a maximum of 3GB of RAM, you can run:

fmriprep-docker <your bids-folder> <your output-folder> --nthreads 3 --omp-nthreads 3 --mem_mb 3000

In our experience, however, specifying the --mem_mb parameter is rarely necessary if you don’t parallelize too much.
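Rather than hard-coding the number of threads, you could derive it from the machine you are running on. The sketch below is one possible approach, not an official recommendation: the helper name performance_flags and the choice to leave one thread free are our own assumptions, and Docker may be configured to expose fewer threads than the host reports.

```python
import os


def performance_flags(reserve_threads=1, mem_mb=None):
    """Build the thread (and optional memory) flags for fmriprep-docker."""
    # os.cpu_count() may return None if the count cannot be determined
    available = os.cpu_count() or 1
    # Keep some threads free for other work, but always use at least one
    n_threads = max(1, available - reserve_threads)
    flags = ["--nthreads", str(n_threads), "--omp-nthreads", str(n_threads)]
    if mem_mb is not None:
        flags += ["--mem_mb", str(mem_mb)]
    return flags


flags = performance_flags(mem_mb=3000)
print(" ".join(flags))
```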

Output spaces#

Specifying your “output spaces” (with the --output-spaces flag) tells Fmriprep which “space(s)” you want your preprocessed data registered to. For example, you can specify T1w to have your functional data registered to the participant’s T1 scan. Instead of, or in addition to, T1w, you can specify a standard template, like an MNI template (MNI152NLin2009cAsym or MNI152NLin6Asym). You can even specify surface templates (like fsaverage), which will sample your volumetric functional data onto the surface (as computed by Freesurfer). In addition to the specific output space(s), you can add a resolution “modifier” to the parameter to specify the spatial resolution of your resampled data. Without any resolution modifier, the native resolution of your functional files (e.g., \(3\times3\times3\) mm) will be kept intact. But if you want to upsample your resampled files to 2 mm, you can add YourTemplate:2mm. (Note that recent Fmriprep documentation writes this modifier as YourTemplate:res-2, so check the usage page for the version you have installed.) For example, if you want to use the FSL-style MNI template (MNI152NLin6Asym) resampled at 2 mm, you’d use:

fmriprep-docker <your bids-folder> <your output-folder> --output-spaces MNI152NLin6Asym:2mm

You can of course specify multiple output-spaces:

fmriprep-docker <your bids-folder> <your output-folder> --output-spaces MNI152NLin6Asym:2mm T1w fsaverage

Other parameters#

There are many options that you can set when running Fmriprep. Check out the Fmriprep website (under “Usage”) for a list of all options!

Issues, errors, and troubleshooting#

While Fmriprep often works out-of-the-box (assuming your data are properly BIDS-formatted), it may happen that it crashes or otherwise gives unexpected results. A great place to start looking for help is neurostars.org. This website is dedicated to helping neuroscientists with neuroimaging/neuroscience-related questions. Make sure to check whether your question has been asked here already and, if not, pose it here!

If you encounter Fmriprep-specific bugs, you can also submit an issue at the Github repository of Fmriprep.

Fmriprep output/reports#

After Fmriprep has run, it outputs, for each participant separately, a directory with results (i.e., preprocessed files) and an HTML-file with a summary and figures of the different steps in the preprocessing pipeline.

We ran Fmriprep on a single run/task (flocBLOCKED) from a single subject (sub-03) with the following command:

fmriprep-docker /home/lsnoek1/ni-edu/bids /home/lsnoek1/ni-edu/bids/derivatives --participant-label sub-03 --output-spaces T1w MNI152NLin2009cAsym

We’ve copied the Fmriprep output for this subject (sub-03) in the fmriprep subdirectory of the week_4 directory. Let’s check its contents:

import os
print(os.listdir('bids/derivatives/fmriprep'))
['dataset_description.json', 'desc-aparcaseg_dseg.tsv', 'logs', 'desc-aseg_dseg.tsv', 'sub-03.html', 'sub-03']

As said, Fmriprep outputs a directory with results (sub-03) and an associated HTML-file with a summary of the (intermediate and final) results. Let’s check the directory with results first:

from pprint import pprint  # pprint stands for "pretty print"

sub_path = os.path.join('bids/derivatives/fmriprep', 'sub-03')
pprint(sorted(os.listdir(sub_path)))
['anat', 'figures', 'func', 'log']

The figures directory contains several figures with the result of different preprocessing stages (like functional → high-res anatomical registration), but these figures are also included in the HTML-file, so we’ll leave that for now. Two of the remaining directories, anat and func, contain the preprocessed anatomical and functional files, respectively. Let’s inspect the anat directory:

anat_path = os.path.join(sub_path, 'anat')
pprint(os.listdir(anat_path))
['sub-03_label-WM_probseg.nii.gz',
 'sub-03_space-MNI152NLin2009cAsym_desc-brain_mask.json',
 'sub-03_space-MNI152NLin2009cAsym_label-GM_probseg.nii.gz',
 'sub-03_desc-preproc_T1w.nii.gz',
 'sub-03_space-MNI152NLin2009cAsym_desc-preproc_T1w.json',
 'sub-03_from-MNI152NLin2009cAsym_to-T1w_mode-image_xfm.h5',
 'sub-03_desc-brain_mask.nii.gz',
 'sub-03_desc-brain_mask.json',
 'sub-03_label-CSF_probseg.nii.gz',
 'sub-03_space-MNI152NLin2009cAsym_desc-brain_mask.nii.gz',
 'sub-03_desc-aparcaseg_dseg.nii.gz',
 'sub-03_desc-aseg_dseg.nii.gz',
 'sub-03_space-MNI152NLin2009cAsym_desc-preproc_T1w.nii.gz',
 'sub-03_space-MNI152NLin2009cAsym_label-CSF_probseg.nii.gz',
 'sub-03_space-MNI152NLin2009cAsym_label-WM_probseg.nii.gz',
 'sub-03_dseg.nii.gz',
 'sub-03_from-T1w_to-MNI152NLin2009cAsym_mode-image_xfm.h5',
 'sub-03_space-MNI152NLin2009cAsym_dseg.nii.gz',
 'sub-03_label-GM_probseg.nii.gz',
 'sub-03_desc-preproc_T1w.json']

Here, we see a couple of different files. There are both (preprocessed) nifti images (*.nii.gz) and associated meta-data (plain-text files in JSON format: *.json).
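Because the metadata files are plain JSON, they can be read with Python’s built-in json module. The sketch below is a toy demonstration: the helper name load_sidecar is our own, and the file content (including the SkullStripped key) is made up for illustration rather than copied from the actual output.

```python
import json
import tempfile
from pathlib import Path


def load_sidecar(json_path):
    """Read a JSON metadata file and return its content as a dictionary."""
    with open(json_path, "r", encoding="utf-8") as f:
        return json.load(f)


# Toy example: write a small made-up metadata file, then read it back
with tempfile.TemporaryDirectory() as tmp:
    toy_path = Path(tmp) / "sub-XX_desc-preproc_T1w.json"
    toy_path.write_text(json.dumps({"SkullStripped": False}), encoding="utf-8")
    metadata = load_sidecar(toy_path)

for key, value in metadata.items():
    print(f"{key}: {value}")
```

On the real output, you could call the same function with, for example, os.path.join(anat_path, 'sub-03_desc-preproc_T1w.json').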

Importantly, the nifti outputs are in two different spaces: one set of files is in the original “T1 space”, so without any resampling to another space (these files have the same resolution and orientation as the original T1 anatomical scan). For example, the sub-03_desc-preproc_T1w.nii.gz scan is the preprocessed (i.e., bias-corrected) T1 scan. In addition, most files are also available in MNI152NLin2009cAsym space, a standard template. For example, the sub-03_space-MNI152NLin2009cAsym_desc-preproc_T1w.nii.gz is the same file as sub-03_desc-preproc_T1w.nii.gz, but resampled to the MNI152NLin2009cAsym template. In addition, there are subject-specific brain parcellations (the *aparcaseg_dseg.nii.gz and *aseg_dseg.nii.gz files), files with registration parameters (*from- ... -to ... files), probabilistic tissue segmentation files (*label-{CSF,GM,WM}_probseg.nii.gz), and brain masks (to outline what is brain and not skull/dura/etc; *brain_mask.nii.gz).
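One way to see the two spaces at a glance is to group filenames by their space-... part. This is a small sketch with a helper of our own (group_by_space); it assumes that anatomical files without a space-... part are in the native T1 space, and it uses only a handful of the filenames listed above.

```python
from collections import defaultdict


def group_by_space(filenames):
    """Group filenames by the value of their 'space-' part (or 'native' if absent)."""
    groups = defaultdict(list)
    for name in filenames:
        space = "native"  # assumed default: no resampling to another space
        for part in name.split("_"):
            if part.startswith("space-"):
                space = part[len("space-"):]
        groups[space].append(name)
    return dict(groups)


example_files = [
    "sub-03_desc-preproc_T1w.nii.gz",
    "sub-03_space-MNI152NLin2009cAsym_desc-preproc_T1w.nii.gz",
    "sub-03_desc-brain_mask.nii.gz",
    "sub-03_space-MNI152NLin2009cAsym_desc-brain_mask.nii.gz",
]
groups = group_by_space(example_files)
for space, names in groups.items():
    print(space, len(names))
```

Note that the registration files (*from- ... -to ...) describe a mapping between spaces rather than an image in one space, so this simple grouping would place them under “native”.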

Again, on the Fmriprep website, you can find more information about the specific outputs.

Now, let’s check out the func directory:

func_path = os.path.join(sub_path, 'func')
pprint(os.listdir(func_path))
['sub-03_task-flocBLOCKED_space-MNI152NLin2009cAsym_desc-brain_mask.nii.gz',
 'sub-03_task-flocBLOCKED_space-T1w_desc-preproc_bold.json',
 'sub-03_task-flocBLOCKED_space-T1w_desc-preproc_bold.nii.gz',
 'sub-03_task-flocBLOCKED_space-MNI152NLin2009cAsym_desc-preproc_bold.nii.gz',
 'sub-03_task-flocBLOCKED_space-MNI152NLin2009cAsym_desc-brain_mask.json',
 'sub-03_task-flocBLOCKED_space-MNI152NLin2009cAsym_desc-preproc_bold.json',
 'sub-03_task-flocBLOCKED_space-MNI152NLin2009cAsym_boldref.nii.gz',
 'sub-03_task-flocBLOCKED_space-T1w_desc-aseg_dseg.nii.gz',
 'sub-03_task-flocBLOCKED_space-T1w_desc-aparcaseg_dseg.nii.gz',
 'sub-03_task-flocBLOCKED_space-T1w_desc-brain_mask.nii.gz',
 'sub-03_task-flocBLOCKED_space-MNI152NLin2009cAsym_desc-aseg_dseg.nii.gz',
 'sub-03_task-flocBLOCKED_space-T1w_desc-brain_mask.json',
 'sub-03_task-flocBLOCKED_space-T1w_boldref.nii.gz',
 'sub-03_task-flocBLOCKED_desc-confounds_regressors.json',
 'sub-03_task-flocBLOCKED_desc-confounds_regressors.tsv',
 'sub-03_task-flocBLOCKED_space-MNI152NLin2009cAsym_desc-aparcaseg_dseg.nii.gz']

Again, like the files in the anat folder, the functional outputs are available in two spaces: T1w and MNI152NLin2009cAsym. In terms of actual images, there are preprocessed BOLD files (ending in preproc_bold.nii.gz), the functional volume used for “functional → anatomical” registration (ending in boldref.nii.gz), brain parcellations in functional space (ending in dseg.nii.gz), and brain masks (ending in brain_mask.nii.gz). In addition, there are files with “confounds” (ending in confounds_regressors.tsv) which contain variables that you might want to include as nuisance regressors in your first-level analysis. These confound files are spreadsheet-like files (like csv files, but instead of being comma-delimited, they are tab-delimited) and can be easily loaded in Python using the pandas package:

import pandas as pd
conf_path = os.path.join(func_path, 'sub-03_task-flocBLOCKED_desc-confounds_regressors.tsv')
conf = pd.read_csv(conf_path, sep='\t')
conf.head()
csf csf_derivative1 csf_derivative1_power2 csf_power2 white_matter white_matter_derivative1 white_matter_derivative1_power2 white_matter_power2 global_signal global_signal_derivative1 ... rot_x_derivative1_power2 rot_x_power2 rot_y rot_y_derivative1 rot_y_power2 rot_y_derivative1_power2 rot_z rot_z_derivative1 rot_z_power2 rot_z_derivative1_power2
0 47509.215863 NaN NaN 2.257126e+09 33060.738398 NaN NaN 1.093012e+09 30699.602599 NaN ... NaN 0.000000e+00 -0.000149 NaN 2.208969e-08 NaN 0.000000 NaN 0.000000e+00 NaN
1 47183.542928 -325.672935 106062.860766 2.226287e+09 33021.566030 -39.172367 1534.474371 1.090424e+09 30659.901226 -39.701373 ... 3.384482e-09 3.384482e-09 -0.000000 0.000149 0.000000e+00 2.208969e-08 0.000000 0.000000 0.000000e+00 0.000000e+00
2 47116.085111 -67.457818 4550.557152 2.219925e+09 33110.393546 88.827516 7890.327596 1.096298e+09 30679.787487 19.886260 ... 1.956213e-08 3.922024e-08 -0.000158 -0.000158 2.510704e-08 2.510704e-08 0.000000 0.000000 0.000000e+00 0.000000e+00
3 46920.666512 -195.418599 38188.428865 2.201549e+09 33027.273113 -83.120433 6909.006414 1.090801e+09 30670.787218 -9.000269 ... 7.606564e-12 4.032024e-08 -0.000202 -0.000043 4.074342e-08 1.883386e-09 0.000000 0.000000 0.000000e+00 0.000000e+00
4 46876.911781 -43.754731 1914.476452 2.197445e+09 33013.589153 -13.683960 187.250764 1.089897e+09 30636.632976 -34.154242 ... 4.032024e-08 4.379051e-47 -0.000187 0.000015 3.485764e-08 2.294619e-10 -0.000219 -0.000219 4.783144e-08 4.783144e-08

5 rows × 263 columns

Confound files from Fmriprep contain a large set of confounds, ranging from motion parameters (rot_x, rot_y, rot_z, trans_x, trans_y, and trans_z) and their derivatives (*derivative1) and squares (*_power2) to the average signal from the brain’s white matter and cerebrospinal fluid (CSF), which should contain sources of noise such as respiratory, cardiac, or motion related signals (but not signal from neural sources, which should be largely constrained to gray matter). For a full list and explanation of Fmriprep’s estimated confounds, check their website. Also, check this thread on Neurostars for a discussion on which confounds to include in your analyses.
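In practice you will usually pick a subset of these columns as nuisance regressors. The sketch below shows one possible way to do that with pandas, using a small toy table with made-up values in place of the real confounds file. The helper name select_confounds is our own, and replacing the missing first value of derivative columns with 0 is an assumption on our part (one common choice), not something Fmriprep prescribes.

```python
import pandas as pd

MOTION_COLUMNS = ["trans_x", "trans_y", "trans_z", "rot_x", "rot_y", "rot_z"]


def select_confounds(conf, columns):
    """Return a copy of the chosen confound columns with missing values set to 0."""
    selected = conf.loc[:, columns].copy()
    # Derivative columns have no value for the first volume (NaN)
    return selected.fillna(0)


# Toy table with made-up values (three volumes), standing in for the real file
nan = float("nan")
toy_conf = pd.DataFrame({
    "trans_x": [0.00, 0.01, 0.02],
    "trans_y": [0.00, -0.01, 0.00],
    "trans_z": [0.00, 0.02, 0.03],
    "rot_x": [0.000, 0.001, 0.000],
    "rot_y": [0.000, 0.000, -0.001],
    "rot_z": [0.000, 0.000, 0.000],
    "rot_x_derivative1": [nan, 0.001, -0.001],
    "csf": [100.0, 101.0, 99.0],
})

selected = select_confounds(toy_conf, MOTION_COLUMNS + ["rot_x_derivative1"])
print(selected.shape)
```

With the real data, you would pass the conf DataFrame loaded above instead of toy_conf.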

In addition to the actual preprocessed outputs, Fmriprep also provides you with a nice (visual) summary of the different (major) preprocessing steps in an HTML-file, which you’d normally open in any standard browser to view. Here, we load this file for our example participant (sub-03) inside the notebook below. Scroll through it to see which preprocessing steps are highlighted. Note that the images from the HTML-file are not properly rendered in Jupyter notebooks, but you can right-click the image links (e.g., sub-03/figures/sub-03_dseg.svg) and click “Open link in new tab” to view the image.

from IPython.display import IFrame
IFrame(src='./bids/derivatives/fmriprep/sub-03.html', width=700, height=600)