{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Statistics with Nilearn\n", "This notebook is about the GLM/statistics-related functionality of the [nilearn](https://nilearn.github.io/) Python package for statistical analysis of fMRI data. While it is still in development, it promises to become a full-fledged Python-based alternative to existing (f)MRI analysis software packages such as FSL, SPM, and AFNI.\n", "\n", "In this notebook, we'll showcase the most important features of the statistics-related functionality of the package in a step-by-step fashion. Notably, this notebook contains several exercises (which we call \"ToDos\"), which are meant to make this tutorial more interactive! Also, this tutorial is merely an introduction to (parts of) the Nilearn package. We strongly recommend checking out the excellent [user guide](https://nilearn.github.io/user_guide.html) and [example gallery](https://nilearn.github.io/auto_examples/index.html) on the Nilearn website if you want to delve deeper into the package's (more advanced) features.\n", "\n", "While not strictly necessary, you'll get the most out of this tutorial if you are familiar with the Nilearn package. To familiarize yourself, you could go through our [Nilearn tutorial](https://github.com/lukassnoek/nilearn-tutorial). (For students doing either of the *Neuroimaging* courses at the University of Amsterdam: the nilearn.ipynb notebook should be in your home-folder already.) Also, this notebook contains a very short introduction to the [pandas](https://pandas.pydata.org/) package, but this can be skipped by those who are already familiar with it.\n", "\n", "**Contents**\n", "1. What is Nilearn?\n", "2. Data formats\n", "3. Creating design matrices\n", "4. First-level models\n", "5. Second-level models\n", "\n", "**Estimated time needed to complete**: 1-3 hours (depending on your experience with Python)
\n", "**Credits**: if you end up using nilearn in your work, please cite the corresponding [article](https://www.frontiersin.org/articles/10.3389/fninf.2014.00014/full).
" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Install packages if necessary\n", "try:\n", " import nilearn\n", "except ImportError:\n", " !pip install nilearn\n", "\n", "# We need to limit the amount of threads numpy can use, otherwise\n", "# it tends to hog all the CPUs available when using Nilearn\n", "import os\n", "os.environ['MKL_NUM_THREADS'] = '1'\n", "os.environ['OPENBLAS_NUM_THREADS'] = '1'\n", "import numpy as np" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", " Warning: This notebook uses a lot of RAM (up to 8GB at some point), so make sure your computer/server can handle this!\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## What is Nilearn?\n", "Nilearn is one of the packages in the growing \"nipy\" ecosystem of Python packages for neuroimaging analysis (see also MNE, nilearn, nipype, nibabel, and dipy). One of the (recently added) features of the package is the ability to run statistical (mostly univariate) analyses of fMRI data. (This functionality was previously implemented in the \"Nistats\" package, which has been merged into Nilearn recently.) Importantly, Nilearn does not contain functionality to preprocess your (f)MRI data and assumes that your data has been preprocessed by another software package. Personally, we think it works very well in combination with preprocessed data using the [Fmriprep](https://fmriprep.readthedocs.io/) package." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", " Note: Nilearn is a relatively new package and its API might change! If code in this notebook gives errors because it uses an old version of Nilearn, let us know. \n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Most statistical functionality in Nilearn is stored in the glm module (i.e., nilearn.glm), which itself contains submodules for first-level analyses (in nilearn.glm.first_level) and higher-level analyses (in nilearn.glm.second_level)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Data formats\n", "The two most important data formats that are used in the Nilearn package are nifti images and comma- or tab-separated values (CSV, TSV) text files.\n", "\n", "### Nifti images\n", "Like most packages in the *nipy* ecosystem, Nilearn assumes that your MRI data is stored in nifti images, and as such, many functions in Nilearn involving nifti images accept either strings pointing towards the path of a nifti file (or a list with multiple paths) or a Nifti1Image object from the nibabel package. Together, these two types of inputs (filenames pointing to nifti files and Nifti1Images) are often referred to a \"niimgs\" (or \"niimg-like\") by Nilearn — a term you'll see a lot in the documentation.\n", "\n", "Let's actually download some data! In this tutorial, we use data from the [NARPS](https://www.narps.info/) project ([Botvinik-Nezer et al., 2019a](https://www.nature.com/articles/s41597-019-0113-7); [Botvinik-Nezer et al., 2019b](https://www.biorxiv.org/content/10.1101/843193v1)). This public dataset was analyzed by 70 different research groups to showcase the variety in analysis approaches and the way this affects the subsequent results. In the study, a \"mixed gambles\" experiment was used to investigate how potential monetary gains and losses are related to brain activity (which was based on the experiment by [Tom et al., 2007](https://science.sciencemag.org/content/315/5811/515)). \n", "\n", "Each trial, participants were presented simulatenously with a potential monetary \"gain\" (e.g., +12) and a potential monetary \"loss\" (e.g., -6. 
For each trial, participants had to choose whether to accept this gamble (either \"strongly accept\", \"weakly accept\", \"weakly reject\", or \"strongly reject\"), knowing that at the end of the experiment, one trial (\"gamble\") would be picked randomly and (if chosen) would be \"run\" with a 50/50 chance of losing the original \"loss\" amount and winning the original \"gain\" amount. As such, the following information was recorded per trial: potential gain amount, potential loss amount, reaction time (of participant's response), and the participant's choice/decision. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", " ToDo: Go to the NARPS website and read through the \"data and analysis\" section to get an idea of the experiment and scanning procedure.\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The dataset is publicly available from [Openneuro.org](http://openneuro.org/), which also includes preprocessed data (using [Fmriprep](https://fmriprep.readthedocs.io/)) — perfect for our purposes! The cell below will download the data, which may take a while because it's 2.4 GB in size." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# First, we need the awscli package to download the data; install if necessary\n", "try:\n", " import awscli\n", "except ImportError:\n", " print(\"Installing awscli package ...\")\n", " !pip install awscli\n", "\n", "# We'll save the data in your home folder\n", "# Note: the os.path.join function concatenates paths using the delimiter specific\n", "# to your operating system (\\ for windows, / for Mac/Linux)\n", "import os \n", "save_dir = os.path.join(os.path.expanduser(\"~\"), 'NARPS')\n", "if not os.path.isdir(save_dir):\n", " print(\"Data will be saved in %s ...\" % save_dir)\n", " print(\"Go get some coffee. 
This may take a while.\\n\")\n", "\n", " # Using the CLI interface of the awscli Python package, we'll download the data\n", " # Note that we only download the preprocessed data from a single run from a single subject\n", " # This may take about 10-60 minutes or so (depending on your download speed)\n", " !aws s3 sync --no-sign-request s3://openneuro.org/ds001734 {save_dir} --exclude \"*\" --include \"*fmriprep/sub-00[1,3]/func/*run-01*space-MNI*.nii.gz\"\n", " !aws s3 sync --no-sign-request s3://openneuro.org/ds001734 {save_dir} --exclude \"*\" --include \"*fmriprep/sub-00[1,3]/func/*run-01*.tsv\"\n", " !aws s3 sync --no-sign-request s3://openneuro.org/ds001734 {save_dir} --exclude \"*\" --include \"*sub-00[1,3]/func/*run-01*events.tsv\"\n", " print(\"\\nDone!\")\n", "else:\n", " print(\"Data is already downloaded!\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Alright, let's check out the directory in which we saved the downloaded data:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def list_files(startpath):\n", " \"\"\" Simple function to show directory tree. \n", " From: https://stackoverflow.com/questions/9727673/list-directory-tree-structure-in-python. \"\"\"\n", " for root, dirs, files in os.walk(startpath):\n", " level = root.replace(startpath, '').count(os.sep)\n", " indent = ' ' * 4 * (level)\n", " print('{}{}/'.format(indent, os.path.basename(root)))\n", " subindent = ' ' * 4 * (level + 1)\n", " for f in sorted(files):\n", " print('{}{}'.format(subindent, f))\n", " \n", "list_files(save_dir)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As you can see, this directory contains both subdirectories with \"unprocessed\" data (in sub-001/func) and preprocessed data (in derivatives/fmriprep/sub-001/func). Note that we excluded the raw (unprocessed) MRI data to save disk space (and time).\n", "\n", "Now, let's check out an fMRI file. 
We'll load the preprocessed sub-001_task-MGT_run-01_bold_space-MNI152NLin2009cAsym_preproc.nii.gz (which has been registered and resampled to standard MNI152 space already) using nibabel:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import nibabel as nib\n", "fmri_path = os.path.join(save_dir, 'derivatives', 'fmriprep', 'sub-001', 'func', 'sub-001_task-MGT_run-01_bold_space-MNI152NLin2009cAsym_preproc.nii.gz')\n", "if not os.path.isfile(fmri_path):\n", " raise ValueError(\"Data does not seem to be downloaded (correctly).\\n\"\n", " \"Try removing the data and re-downloading it!\")\n", "fmri = nib.load(fmri_path)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", " ToDo: Do you remember how to inspect Nifti1Images from nibabel? Try to figure out this scan's TR and how many volumes (timepoints) this file has. Store the TR in a variable named tr_todo and the number of volumes in nvol_todo. \n", "
" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "nbgrader": { "cell_type": "code", "checksum": "ec7a4cf44a1cd1831d238e580bec06b6", "grade": false, "grade_id": "cell-16955c8dbdb9792d", "locked": false, "schema_version": 3, "solution": true, "task": false }, "tags": [ "raises-exception", "remove-output" ] }, "outputs": [], "source": [ "''' Implement your ToDo here. '''\n", "# YOUR CODE HERE\n", "raise NotImplementedError()" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "editable": false, "nbgrader": { "cell_type": "code", "checksum": "60778f16d0e091c9e897bee5873d0b09", "grade": true, "grade_id": "cell-00236678c968f01c", "locked": true, "points": 0, "schema_version": 3, "solution": false, "task": false }, "tags": [ "raises-exception", "remove-output" ] }, "outputs": [], "source": [ "''' Tests the ToDo above. '''\n", "assert(tr_todo == 1)\n", "assert(nvol_todo == 453)\n", "print(\"Well done!\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### TSV files (or: a short introduction to Pandas)\n", "*(If you're familiar with the Pandas package, you may skip this section.)*\n", "\n", "The other type of file you'll probably encounter a lot when working with Nilearn is a TSV (tab-separated values) file. This plain-text file is like a spreadsheet with different \"observations\" in rows and different \"attributes\" in columns. For example, according to the Brain Imaging Data Structure ([BIDS](https://bids.neuroimaging.io/)), information about your experiment (like trial onsets, durations, conditions, etc.) should be stored in a TSV file ending in _events.tsv. \n", "\n", "As expected, the data we downloaded also contains such an event-file, sub-001_task-MGT_run-01_events.tsv. 
(Note that these event-files are stored in the \"unprocessed\" data directory, not the \"derivatives\" directory).\n", "\n", "These TSV files can be loaded into Python using the [pandas](https://pandas.pydata.org/) package. pandas is usually imported as follows:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import pandas as pd" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Then, TSV files (or any x-delimited files, like CSV files) can be loaded in using the pd.read_csv function. Let's do that for our events-file:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "path = os.path.join(save_dir, 'sub-001', 'func', 'sub-001_task-MGT_run-01_events.tsv')\n", "events_df = pd.read_csv(path, sep='\\t')\n", "events_df # putting events_df at the end of a cell (instead of printing it) will show a nicely formatted table" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", " ToDo: What do you think the argument sep='\\t' does? Try removing the argument to see what happens.\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As you can see, the pd.read_csv function returns a table-like object with 64 rows (corresponding to the experiment's events/trials) and 6 columns (corresponding to different attributes of the trials, like onset, duration, etc.). The events_df is a custom Pandas object called a DataFrame:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(type(events_df))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Pandas DataFrames are similar to dataframes in the R programming language. A full introduction to Pandas is beyond the scope of this tutorial (see [10 minutes to pandas](https://pandas.pydata.org/pandas-docs/stable/getting_started/10min.html) for a nice introduction), but we'll highlight some useful Pandas-related functionality below.\n", "\n", "One thing that is useful to know is that DataFrames have both row and column names. The row names are usually referred to as the DataFrame's *index* (these are the elements left of the \"onset\" column, i.e., 0, 1, 2, ... 63). 
Row and column names can be of any type (strings, like the \"onset\" column; integers, like the row names in this DataFrame; etc.).\n", "\n", "To select particular rows and/or columns, you can use the .loc and .iloc methods, where the .loc method selects based on the *name* of the row/column (like a dictionary) and the .iloc method selects based on the *position* (i.e., the \"number\") of the row/column (like a list).\n", "\n", "The syntax of a loc- and iloc-based selection is as follows:\n", "\n", "\n", "df.loc[row_name, col_name]\n", "df.iloc[row_index, col_index]\n", "\n", "\n", "Note that the square brackets ([]) are different from what you might expect from a *method*, but the technicalities are beyond the scope of this tutorial.\n", "\n", "For example, to select the onset column (and all rows), we can do the following:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# the : tells loc to select *all* the rows\n", "events_df.loc[:, 'onset']" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "And to select the first ten rows (and all columns), we can do the following:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Just like regular Python lists, iloc accepts slice syntax (like :10, or 5:12:2)\n", "events_df.iloc[:10, :]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", " ToThink In the example above, we used iloc to select the first ten rows of the DataFrame, but we could have also used loc for this purpose (try it out yourself by substituting loc for iloc). Why do you think this is the case? Is this always possible?\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that you can also select multiple rows and/or columns at the same time by passing a list or tuple to the loc and iloc indexers:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "events_df.loc[:, ['onset', 'duration']]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "events_df.iloc[[0, 5], :]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", " ToDo: Let's practice a little. Select from events_df all odd rows in a single statement and store it in a variable named odd_rows. For a refresher on slicing in Python, see here).\n", "
" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "nbgrader": { "cell_type": "code", "checksum": "3e33a99ad00a33622cb18922a89a1761", "grade": false, "grade_id": "cell-4d06bc9fbb0f608b", "locked": false, "schema_version": 3, "solution": true, "task": false }, "tags": [ "raises-exception", "remove-output" ] }, "outputs": [], "source": [ "''' Implement the ToDo here. '''\n", "# YOUR CODE HERE\n", "raise NotImplementedError()" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "editable": false, "nbgrader": { "cell_type": "code", "checksum": "a99a3dca034675aa771e72ec506a0ec0", "grade": true, "grade_id": "cell-105f8956410bae96", "locked": true, "points": 0, "schema_version": 3, "solution": false, "task": false }, "tags": [ "raises-exception", "remove-output" ] }, "outputs": [], "source": [ "''' Tests the ToDo above. '''\n", "import numpy as np\n", "assert(odd_rows.shape == (32, 6))\n", "np.testing.assert_array_equal(odd_rows.index, range(1, 64, 2))\n", "print(\"Well done!\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", " ToDo: Select the last 10 rows and the onset and duration columns in a single statement and store it in a variable named last10_onset_duration.\n", "
" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "nbgrader": { "cell_type": "code", "checksum": "3bb06cdf88f1cab6bf4cd100f3ecfc06", "grade": false, "grade_id": "cell-82977cc39029dd20", "locked": false, "schema_version": 3, "solution": true, "task": false }, "tags": [ "raises-exception", "remove-output" ] }, "outputs": [], "source": [ "''' Implement the ToDo here. '''\n", "# YOUR CODE HERE\n", "raise NotImplementedError()" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "editable": false, "nbgrader": { "cell_type": "code", "checksum": "f5eabc7a645020ac318e139bfbb3b892", "grade": true, "grade_id": "cell-85da9616a043e61f", "locked": true, "points": 0, "schema_version": 3, "solution": false, "task": false }, "tags": [ "raises-exception", "remove-output" ] }, "outputs": [], "source": [ "''' Tests the above ToDo. '''\n", "assert(last10_onset_duration.shape == (10, 2))\n", "np.testing.assert_array_equal(last10_onset_duration.columns, ['onset', 'duration'])\n", "np.testing.assert_array_equal(last10_onset_duration.index, range(54, 64))\n", "print(\"Well done!\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Another useful approach for selecting subsets of observations (i.e., rows) in DataFrames is *boolean indexing*. 
For example, to create an index that selects all trials (rows) in which the participant indicated \"strongly_accept\" (i.e., accepting a trial's proposed bet), we can do:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "bool_idx = events_df.loc[:, 'participant_response'] == 'strongly_accept'\n", "print(bool_idx)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "and we can subsequently pass this index to loc (note that iloc doesn't work with boolean indices):" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "strongly_accept_trials = events_df.loc[bool_idx, :]\n", "strongly_accept_trials" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", " ToDo: Using boolean indexing, select all trials with a reaction time smaller than 1.5 seconds and store the result in a variable named rt_smaller_than_1p5.\n", "
" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "nbgrader": { "cell_type": "code", "checksum": "51c7736955c7269bcd1ccb0b3a65d040", "grade": false, "grade_id": "cell-45ee7cfb355442a2", "locked": false, "schema_version": 3, "solution": true, "task": false }, "tags": [ "raises-exception", "remove-output" ] }, "outputs": [], "source": [ "''' Implement the ToDo here. '''\n", "# YOUR CODE HERE\n", "raise NotImplementedError()" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "editable": false, "nbgrader": { "cell_type": "code", "checksum": "327a868492a0b5801f5ba007952f6a46", "grade": true, "grade_id": "cell-4dbade100917273f", "locked": true, "points": 0, "schema_version": 3, "solution": false, "task": false }, "tags": [ "raises-exception", "remove-output" ] }, "outputs": [], "source": [ "''' Tests the above ToDo. '''\n", "np.testing.assert_almost_equal(rt_smaller_than_1p5.loc[:, 'RT'].mean(), 1.2196)\n", "print('Well done!')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can also use loc and iloc to create new columns. For example, let's add a new column, \"random_numbers\", with some random numbers:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Note that you can access the shape (nr of rows, nr of cols) of a DataFrame using\n", "# the .shape attribute (just like numpy arrays)!\n", "rnd = np.random.randn(events_df.shape)\n", "events_df.loc[:, 'random_numbers'] = rnd\n", "events_df" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To delete a row or column, you can use the drop method. Note that you should pass a particular axis to drop, either axis=0 (should I drop rows?) 
or axis=1 (should I drop columns?):" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "events_df = events_df.drop('random_numbers', axis=1)\n", "events_df" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "There are many, many more things you can do with Pandas, but for now, this should suffice for our purposes." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Creating design matrices\n", "Before we fit any model to our fMRI data ($y$), we are going to focus on what we're going to put into the model: the design matrix ($\\mathbf{X}$)! Note that Nilearn offers different ways to create design matrices (and to fit univariate models in general): some offer a relatively \"high-level\" interface (such as first_level_model_from_bids), while others only offer the building blocks (such as different HRF models and various types of regression models) that you can use to build the models yourself. \n", "\n", "The existence of multiple interfaces for the same functionality is especially apparent when constructing design matrices using Nilearn. In this section, we'll start at the lowest level and work our way up to the more convenient (but less flexible) higher-level interfaces.\n", "\n", "That said, let's go back to design matrices. What do you need to construct them? Roughly, you need the following:\n", "* A particular HRF model;\n", "* Information about event onset, duration, and (optionally) amplitude (weight);\n", "* Information about fMRI scan timing.\n", "\n", "Using these three components, we can construct HRF-informed regressors for our events on the time scale of our fMRI signal. \n", "\n", "### Defining an HRF model\n", "First, let's try to explicitly construct an HRF model using Nilearn. The nilearn.glm.first_level.hemodynamic_models module contains various HRF models. 
Let's start with the good old canonical \"Glover\" HRF:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from nilearn.glm.first_level.hemodynamic_models import glover_hrf" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This glover_hrf function takes a couple of arguments, most importantly: tr (temporal resolution of your fMRI scan in seconds) and oversampling (how much to temporally upsample the HRF relative to your tr; you can usually safely ignore the other two arguments, time_length and onset).\n", "\n", "Suppose that we want to define our HRF on the scale of 0.01 seconds, and knowing that our fMRI scan has a TR of 1, we can do the following:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "canon_hrf = glover_hrf(tr=1, oversampling=100)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "And let's plot it:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import matplotlib.pyplot as plt\n", "%matplotlib inline\n", "\n", "# Define timepoints corresponding to HRF\n", "t_hrf = np.linspace(0, 32, 32 * 100, endpoint=False)\n", "\n", "plt.figure(figsize=(11, 4))\n", "plt.plot(t_hrf, canon_hrf)\n", "plt.grid()\n", "plt.ylabel('Amplitude (A.U.)', fontsize=15)\n", "plt.xlabel('Time (seconds)', fontsize=15)\n", "plt.title('Canonical (\"Glover\") HRF', fontsize=20)\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", " ToDo: Sometimes, people like to use a basis set of functions as a model for the HRF, such as the temporal and dispersion derivatives of the canonical HRF (in addition to the canonical HRF itself). The nilearn package also includes these temporal and dispersion derivatives: glover_time_derivative and glover_dispersion_derivative. Import these functions, construct these HRF derivatives, and plot all three together in the same figure (like the one above).\n", "
" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "nbgrader": { "cell_type": "code", "checksum": "f2fb688074cd99284c892ebf9a87cf69", "grade": false, "grade_id": "cell-e2ed2ec94f72a41d", "locked": false, "schema_version": 3, "solution": true, "task": false }, "tags": [ "raises-exception", "remove-output" ] }, "outputs": [], "source": [ "''' Implement your ToDo here. (No test cell.) '''\n", "# YOUR CODE HERE\n", "raise NotImplementedError()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Creating HRF-convolved regressors\n", "Now, to create actual regressors, we need our event onsets and durations. At the lowest level interface, we only need the onsets, durations, and amplitudes (weights) of our events of a particular condition. For now, let's focus on the \"strongly_accept\" condition (note that you can define a \"condition\" using whatever feature from the experiment, such as the gains or losses, or even reaction time). There is no reason to focus only on these type of trials other than for the sake of the example:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "idx = events_df.loc[:, 'participant_response'] == 'strongly_accept'\n", "sa_events = events_df.loc[idx, ['onset', 'duration']]\n", "sa_events['amplitude'] = 1 # set the same amplitude for each event\n", "sa_events" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The next step is to create an event regressor (basically arrays with zeros, containing ones at the onset of events) and convolving this array with our HRF model. 
We can do this easily using the compute_regressor function from the nilearn.glm.first_level.hemodynamic_models module:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from nilearn.glm.first_level.hemodynamic_models import compute_regressor" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The most important arguments of this function are the following:\n", "* exp_condition: an array of shape $3$ (onset, duration, amplitude) $\\times N$ (events for a particular condition)\n", "* hrf_model: a **string** indicating the HRF model ('spm', 'spm + derivative', 'spm + derivative + dispersion',\n", "'glover', 'glover + derivative', 'glover + derivative + dispersion', 'fir')\n", "* frame_times: an array of shape $T$ (timepoints of fMRI series) containing the acquisition time of each fMRI volume\n", "\n", "First, let's define exp_condition by pulling out the array from sa_events (using the .values attribute of the DataFrame) and transposing it (otherwise it would be of the wrong shape, i.e., $N \\times 3$)." 
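
To make the expected shape concrete, here is a toy numpy sketch (the onsets, durations, and amplitudes below are made up for illustration; this is not the real events file) of how an $N \times 3$ events table becomes the $3 \times N$ array that exp_condition should be:

```python
import numpy as np

# Toy events: one row per event, with (onset, duration, amplitude);
# these four events are made up for illustration only
toy_events = np.array([
    [10.0, 4.0, 1.0],
    [25.0, 4.0, 1.0],
    [40.0, 4.0, 1.0],
    [55.0, 4.0, 1.0],
])
print(toy_events.shape)  # N x 3 (one row per event)

# compute_regressor expects the transpose: 3 x N
# (row 0 = onsets, row 1 = durations, row 2 = amplitudes)
exp_condition_toy = toy_events.T
print(exp_condition_toy.shape)
```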
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "exp_condition = sa_events.values.T\n", "print(exp_condition.shape)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's for simplicity just use a canonical (Glover) HRF:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "hrf_model = 'glover'" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "And we'll define the frame times relative to the middle slice (i.e., assuming for simplicity that each slice was recorded at TR / 2):" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "n_vols = 453\n", "tr = 1\n", "frame_times = np.linspace(tr / 2, n_vols * tr + tr / 2, n_vols, endpoint=False)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Alright, now we have everything to compute the regressor using compute_regressor:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# The function compute_regressor returns two things:\n", "# the convolved regressor(s) and a (default) name(s)\n", "reg, name = compute_regressor(\n", " exp_condition=exp_condition,\n", " hrf_model=hrf_model,\n", " frame_times=frame_times\n", ")\n", "\n", "plt.figure(figsize=(11, 4))\n", "plt.plot(frame_times, reg)\n", "plt.grid()\n", "plt.ylabel('Amplitude (A.U.)', fontsize=15)\n", "plt.xlabel('Time (seconds)', fontsize=15)\n", "plt.title('HRF-convolved regressor', fontsize=20)\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "One of the analyses of NARPS project was to investigate the parametric effect of *gains* (i.e., the amount that could be gained for this trial), effectively asking: which voxels show activity that linearly relates to gains? 
For this analysis, you'd need both a simple \"gamble-vs-baseline\" regressor (like the one we computed before, but for all trials instead of just the \"strongly_accept\" ones) and a \"parametric modulation\" regressor. This parametric modulation regressor uses the same onsets and durations as the \"trial-vs-baseline\" regressor, but *also* includes varying modulation values that determine how strongly each event weighs on the regressor. Importantly, these modulation values should be mean-centered (or \"demeaned\"), i.e., their mean should be zero. You can mean-center any variable $x$ by subtracting the mean ($\\bar{x}$) from all values:\n", "\n", "\\begin{align}\n", "x_{\\mathrm{demeaned}} = x - \\bar{x} \n", "\\end{align}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", " ToDo (optional): Create a parametric regressor that is modulated by the \"gain\" values. You can do this by making sure that the third row in the array that is passed as the exp_condition argument of compute_regressor reflects the mean-centered gain values. Store the resulting regressor in a variable named parametric_gain_reg.\n", "
" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "nbgrader": { "cell_type": "code", "checksum": "80baf5734f6424514e03733fc0b2690a", "grade": false, "grade_id": "cell-9ef920b001211fc1", "locked": false, "schema_version": 3, "solution": true, "task": false }, "tags": [ "raises-exception", "remove-output" ] }, "outputs": [], "source": [ "''' Implement your ToDo here. '''\n", "# YOUR CODE HERE\n", "raise NotImplementedError()" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "editable": false, "nbgrader": { "cell_type": "code", "checksum": "57aee5a122369a26832140ef0638e4ef", "grade": true, "grade_id": "cell-04f57e212837609a", "locked": true, "points": 0, "schema_version": 3, "solution": false, "task": false }, "tags": [ "raises-exception", "remove-output" ] }, "outputs": [], "source": [ "''' Tests the above ToDo. '''\n", "np.testing.assert_almost_equal(parametric_gain_reg.mean(), 0.0095463)\n", "print(\"Well done!\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Using the high-level make_first_level_design_matrix function\n", "While the HRF-related functions and compute_regressor already take care of a lot of stuff for us, you might still think that constructing a complete design matrix this way is quite cumbersome. Fortunately, Nilearn has a more high-level function for constructing complete design matrices easily called make_first_level_design_matrix:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from nilearn.glm.first_level.design_matrix import make_first_level_design_matrix" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", " Tip: in Jupyter notebooks, you can view any (Python) function's documentation, or \"docstring\", by running the function (without brackets) followed by a question mark, e.g., make_first_level_design_matrix?. Try this below. You can remove the pop-up by clicking the cross-symbol in the upper right-corner.\n", "
" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "make_first_level_design_matrix?" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As you can see, this function needs (amongst other things) a DataFrame (events) containing the following columns: \"onset\", \"duration\", \"trial_type\", and (optionally) \"modulation\". Then, the function will basically apply the compute_regressor function across the different trial types as defined in the \"trial_type\" column. \n", "\n", "For now, let's assume that we'd want to create a design matrix with a different (un-modulated) regressor for each of the participant responses (\"weakly_accept\", \"strongly_accept\", \"weakly_reject\", and \"strongly_reject\"). Below, we'll create the appropriate DataFrame for this purpose:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# First, extract relevant columns\n", "pr_events = events_df.loc[:, ['onset', 'duration', 'participant_response']]\n", "\n", "# Then, rename the participant_response column to trial_type\n", "# bonus: why do you think we have to specify axis=1 here?\n", "pr_events = pr_events.rename({'participant_response': 'trial_type'}, axis=1)\n", "\n", "# Lastly, create a \"modulation\" column with only ones (indicating no modulation)\n", "pr_events.loc[:, 'modulation'] = 1 # this automatically fills all rows with 1\n", "pr_events" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now, the different conditions in our design are defined by the unique values in our trial_type column:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(pr_events.loc[:, 'trial_type'].unique())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Ah darn, we also have 'NoResp' trials!" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", " ToDo Remove all 'NoResp' trials (i.e., rows) and store the new (filtered) DataFrame in a variable named pr_events_filt.\n", "
" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "nbgrader": { "cell_type": "code", "checksum": "2d60c1343ddfc479af2c662ff083b9e7", "grade": false, "grade_id": "cell-74e38deb511b7169", "locked": false, "schema_version": 3, "solution": true, "task": false }, "tags": [ "raises-exception", "remove-output" ] }, "outputs": [], "source": [ "''' Implement your ToDo here. '''\n", "# YOUR CODE HERE\n", "raise NotImplementedError()" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "editable": false, "nbgrader": { "cell_type": "code", "checksum": "0bed47b0683b7de48a6dde0838085103", "grade": true, "grade_id": "cell-883e26bd8d809699", "locked": true, "points": 0, "schema_version": 3, "solution": false, "task": false }, "tags": [ "raises-exception", "remove-output" ] }, "outputs": [], "source": [ "''' Tests the above ToDo. '''\n", "assert('NoResp' not in pr_events_filt.loc[:, 'trial_type'].unique())\n", "print(\"Well done!\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", " ToDo Now you're ready to use make_first_level_design_matrix! Use the appropriate arguments (at least frame_times, events, and hrf_model) and make sure to use drift_model=None.\n", " This will prevent Nilearn from automatically adding a set of regressors to the design matrix which function as a high-pass filter (we'll take a look at this later). Store the output of the function in a new variable called dm, which should be a new pandas DataFrame of shape T (number of timepoints) x P (number of regressors, i.e., number of conditions + intercept). You can ignore the UserWarning.\n", "
" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "nbgrader": { "cell_type": "code", "checksum": "89ec25d878d380b5881817253d115691", "grade": false, "grade_id": "cell-fe393db3e8d7b58b", "locked": false, "schema_version": 3, "solution": true, "task": false }, "tags": [ "raises-exception", "remove-output" ] }, "outputs": [], "source": [ "''' Implement your ToDo here. '''\n", "# YOUR CODE HERE\n", "raise NotImplementedError()" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "editable": false, "nbgrader": { "cell_type": "code", "checksum": "d4e305ff8942396c4514cae24b4120b0", "grade": true, "grade_id": "cell-aa99d9248baeaa9f", "locked": true, "points": 0, "schema_version": 3, "solution": false, "task": false }, "tags": [ "raises-exception", "remove-output" ] }, "outputs": [], "source": [ "''' Tests the ToDo above. '''\n", "assert(dm.shape == (453, 5))\n", "np.testing.assert_array_equal(\n", " dm.columns,\n", " ['strongly_accept', 'strongly_reject', 'weakly_accept', 'weakly_reject', 'constant']\n", ")\n", "print(\"Well done!\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Alright, almost done with this section. One last thing we want to show you is the plot_design_matrix function, which you can use to, well, plot the design matrix:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from nilearn.plotting import plot_design_matrix\n", "dm = pd.read_csv('dm_todo.tsv', sep='\\t')\n", "plot_design_matrix(dm);" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This plot shows you a \"color-coded\" version of your design matrix, with time on the y-axis and the different predictors on the x-axis. Note that the brighter the color (i.e., more yellow), the higher that regressor's value at that timepoint." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", " ToDo Run make_first_level_design_matrix again, but add the following arguments: drift_model='cosine' and high_pass=0.01 (i.e., in Hertz, corresponding to a cutoff of 100 seconds), which will add a set of cosine regressors to the design matrix, which function as a high-pass filter for our fMRI data. Store the design matrix in a new variable named dm_with_hp. Then, plot the design matrix again to see how it looks like!\n", "
" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "nbgrader": { "cell_type": "code", "checksum": "658237be83a52a24d613154623543fe7", "grade": false, "grade_id": "cell-5e5baed4c109d590", "locked": false, "schema_version": 3, "solution": true, "task": false }, "tags": [ "raises-exception", "remove-output" ] }, "outputs": [], "source": [ "''' Implement your ToDo here. '''\n", "# YOUR CODE HERE\n", "raise NotImplementedError()" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "editable": false, "nbgrader": { "cell_type": "code", "checksum": "403b355d260b49c9352d6e3864b7e60b", "grade": true, "grade_id": "cell-6639ede25537fe5b", "locked": true, "points": 0, "schema_version": 3, "solution": false, "task": false }, "tags": [ "raises-exception", "remove-output" ] }, "outputs": [], "source": [ "''' Tests the above ToDo. '''\n", "assert(dm_with_hp.shape == (453, 14))\n", "assert(all(f'drift_{i}' in dm_with_hp.columns for i in range(1, 10)))\n", "print(\"Well done!\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## First-level models\n", "In this section, we will talk about fitting first-level models and computing contrasts from their estimated parameters.\n", "\n", "### Constructing and fitting first-level models\n", "At last, we can start fitting first-level GLM models! 
As with constructing design matrices, there are different ways to go about defining and fitting first-level models in Nilearn:\n", "* Using the low-level run_glm function;\n", "* Using the FirstLevelModel class;\n", "* Using the first_level_models_from_bids function\n", "\n", "For most purposes, we recommend using the FirstLevelModel interface, which strikes a nice balance between flexibility (unlike the very high-level first_level_models_from_bids function) and ease-of-use (unlike the very low-level run_glm function).\n", "\n", "Let's start by importing the FirstLevelModel class (you may ignore the FutureWarning):" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# You may ignore the DeprecationWarning\n", "from nilearn.glm.first_level import FirstLevelModel" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that, unlike the previous functionality from Nilearn that we discussed, FirstLevelModel is not a *function*, but a custom *class* (often called a *type* in Python):" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(type(FirstLevelModel))\n", "\n", "# Contrast this with, for example, 'make_first_level_design_matrix'\n", "print(type(make_first_level_design_matrix))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Using custom classes, you can instantiate (\"create\") objects that behave as specified in the class. Before we go on, let's digress a little and briefly discuss how custom classes and objects work.\n", "\n", "One way to think about classes and objects is to see classes as *building blueprints* and the objects constructed from them as the *actual buildings*. For example, neuroimaging software in Python often works with MRI data stored in objects from the Nifti1Image class (from the nibabel package). 
An object is usually *constructed* from a class as follows:\n", "\n", "\n", "some_object = SomeClass(optional_arg1, optional_arg2, optional_arg3, etc.)\n", "\n", "\n", "For example, to construct a Nifti1Image object, we need to pass it both data (as a numpy array) and an affine (also as a numpy array):" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "import nibabel as nib\n", "\n", "# Very small hypothetical brain image of 10x10x10\n", "data = np.random.normal(0, 1, size=(10, 10, 10))\n", "affine = np.eye(4)\n", "\n", "# Here, we construct an object (nifti_obj) from a class (nib.Nifti1Image)\n", "nifti_obj = nib.Nifti1Image(data, affine)\n", "print(type(nifti_obj))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The class (\"blueprint\") contains instructions about the \"properties\" of the object (\"building\") and the \"functionality\" of the object. The \"properties\" of an object are, technically, called *attributes* and can be accessed as: some_object.attribute_name. For example, Nifti1Image objects have an attribute named shape:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(nifti_obj.shape)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The \"functionality\" of an object is, technically, provided by its *methods*. You can see methods as functions that are bound to a particular object. For example, Nifti1Image objects have a method called get_fdata, which extracts the actual brain data:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "data = nifti_obj.get_fdata()\n", "# 'data' is the underlying numpy array!\n", "print(type(data))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that you should call methods in the same way as functions (i.e., using round brackets, (), which may or may not contain arguments). 
Like functions, method calls may (or may not) return one or more things." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
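To make the blueprint/building metaphor concrete, here is a tiny made-up class (not part of Nilearn or nibabel) with one attribute and one method:

```python
# A minimal custom class: attributes store state, methods provide behavior
class BrainImage:
    def __init__(self, data):
        self.data = data            # attribute, set at construction
        self.shape = (len(data),)   # another attribute

    def mean_signal(self):          # method, bound to each object
        return sum(self.data) / len(self.data)

# Construct an object ("building") from the class ("blueprint")
img = BrainImage([1.0, 2.0, 3.0])
print(img.shape)          # access an attribute: no brackets
print(img.mean_signal())  # call a method: with brackets
```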
\n", " Tip: in Jupyter notebooks (and many other code editors), you can inspect an object's attributes and methods by typing the object's variable name followed by a dot and pressing tab, i.e., your_object.+TAB. Try it out below with the nifti_obj variable to inspect the different attributes/methods from the Nifti1Image class.\n", "
" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Try out the tip here\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This topic of (custom) classes and objects is actually very important in Python. Technically, Python is an \"object-oriented\" programming language in which *everything is an object (from a particular class)*! But an in-depth explanation of object-oriented programming is beyond the scope of this tutorial. For now, knowing the basics about (custom) classes and objects suffices for our purposes.\n", "\n", "Alright, back to the FirstLevelModel class. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", " ToDo: Check out the arguments need for the initialization of a FirstLevelModel object below.\n", "
" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "FirstLevelModel?" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As you can see, initializing a FirstLevelModel object takes a lot of arguments (t_r, slice_time_ref, hrf_model, drift_model, etc.). Fortunately, many arguments have sensible defaults. In what follows, we'll go through the most important arguments step by step, after which we'll (finally!) construct a FirstLevelModel object.\n", "\n", "The t_r argument refers to the TR (temporal resolution, in seconds) of your fMRI scan." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "t_r = 1" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The slice_time_ref argument refers to which slice you want resample your design matrix (this is used internally to define the frame_times), which should be between 0 (first slice) and 1 (last slice). For example, if you slice-time corrected your fMRI data to the first slice, you should set slice_time_ref to 0 (middle slice → 0.5, last slice → 1, etc.). If you did not apply any slice-time correction (as is the case for the NARPS data), we recommend setting it to 0.5. (Can you think of a reason why this is a sensible choice for non-slice-time-corrected data?)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "slice_time_ref = 0.5" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "While you can construct your design matrix using Nilearn functionality directly, you can actually also leave it up to the FirstLevelModel to do this for you! That's why you might have recognized some parameters that you've seen in functions from the previous section (such as hrf_model, drift_model, and high_pass). 
For now, we'll let the FirstLevelModel take care of constructing a design for us, so we'll define these parameters here:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "hrf_model = 'glover'\n", "drift_model = 'cosine'\n", "high_pass = 0.01 # cutoff: 0.01 Hz (i.e., 100 seconds)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The next important parameter is mask_img (note that we're skipping some parameters, as they represent some more advanced functionality/analyses). This (optional!) parameter refers to a nifti image (a path to a nifti file or a Nifti1Image object) with binary (0, 1) data indicating which voxels should be included (1) and which ones should be \"masked\" (0). Fortunately, Fmriprep also returns a brain-mask:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "mask_img = os.path.join(\n", " save_dir, 'derivatives', 'fmriprep', 'sub-001', 'func',\n", " 'sub-001_task-MGT_run-01_bold_space-MNI152NLin2009cAsym_brainmask.nii.gz'\n", ")\n", "print(mask_img)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that including a brain-mask may substantially speed up your analysis (as it reduces the number of voxels that will be analyzed).\n", "\n", "We can also spatially smooth our fMRI data by setting the smoothing_fwhm parameter (in millimeters):" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "smoothing_fwhm = 3.5" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Then, the standardize and signal_scaling parameters refer to the method that will be used to mean-center the data. In this example, we'll leave them at their default values (standardize=False and signal_scaling=0), which will only mean-center the data in the time domain (i.e., subtract the mean across time from each voxel).\n", "\n", "The noise_model parameter refers to the particular noise model of the GLM. 
Do we assume independent and identically distributed (IID) noise (noise_model='ols') or do we assume that our noise is temporally autocorrelated (noise_model='ar1')? For almost all fMRI data, we should assume that the noise is at least somewhat autocorrelated, so we'll set noise_model to 'ar1'. (Note that 'ar1' is a bit slower than 'ols', so you may consider using 'ols' when, for example, testing your code.)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "noise_model = 'ar1'" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Then, the n_jobs parameter determines how many CPU cores the GLM analysis will use. For now, one core should suffice:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "n_jobs = 1" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Lastly, you may want to set the minimize_memory argument to False. While doing this will increase the memory (RAM) necessary for the GLM analysis, it allows us to retrieve more outputs from the GLM after fitting (such as model fit, $R^2$)." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "minimize_memory = False" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we're *finally* ready to construct our FirstLevelModel object!" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "flm = FirstLevelModel(\n", " t_r=t_r,\n", " slice_time_ref=slice_time_ref,\n", " hrf_model=hrf_model,\n", " drift_model=drift_model,\n", " high_pass=high_pass,\n", " mask_img=mask_img,\n", " smoothing_fwhm=smoothing_fwhm,\n", " noise_model=noise_model,\n", " n_jobs=n_jobs,\n", " minimize_memory=minimize_memory,\n", " verbose=True # this will print out some useful info later\n", ")\n", "\n", "\"\"\" Note that, here, we've defined all arguments beforehand,\n", "but this is not necessary! 
We could have done something like:\n", "flm = FirstLevelModel(\n", " t_r=1,\n", " slice_time_ref=0.5,\n", " hrf_model='glover',\n", " drift_model='cosine',\n", " high_pass=0.01,\n", " mask_img=os.path.join(etc., etc.),\n", " smoothing_fwhm=3.5,\n", " noise_model='ar1',\n", " n_jobs=1,\n", " minimize_memory=False,\n", " verbose=True\n", ")\n", "\"\"\";" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Wait, what? Did it really just take a fraction of a second to fit the first-level model? Actually, no. We only *constructed* the first-level model, but we haven't fitted it yet! To do so, we can use the, guess what, fit method!" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", " ToDo: check out the arguments needed by the fit method. \n", "
" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "flm.fit?" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "There are actually two ways of using the fit method, depending on which parameters you use:\n", "1. define events (+ optionally confounds)\n", "2. define design_matrices\n", "\n", "Approach 1 will construct a design matrix for you (by running make_first_level_design_matrix on your events and, optionally, concatenating this with your confounds) while approach 2 assumes that you have constructed a design matrix yourself.\n", "\n", "Note, by the way, that the parameter names are all plural (run_img**s**, event**s**, confound**s**, design_matrice**s**). This is because you can either fit a model on a *single* fMRI run or fit a model on *multiple* fMRI runs with a single fit call. To fit multiple runs, just pass a list instead of a single object for each parameter (e.g., design_matrices=[dm_run1, dm_run2] instead of design_matrices=dm). \n", "\n", "Regardless of the approach for constructing your design matrix, the fit method always needs (one or more) fMRI files (for the run_img parameter), which can either be a Nifti1Image object or string representing the path to a 4D nifti file. Furthermore, both the events and the confounds should be a Pandas DataFrame. For our events, let's use the pr_events_filt events, in which the participant responses represent our conditions:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# This is also one possible solution to the ToDo earlier\n", "events2use = pr_events.loc[pr_events.loc[:, 'trial_type'] != 'NoResp', :]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Almost always, you'd want to add some confounds to your design matrix, as confounds represent independent variables that are (often) not of interest to the researcher, but might be related to the signal (the depenent variable, $y$) anyway. 
Including these confounds in your design matrix will then explain some variance that might otherwise not be accounted for (which will \"end up\" in the model's noise term, $\\hat{\\sigma}^2$), increasing the model's power to detect effects related to the variables you're interested in. Note that including the \"right\" confounds in your model is in no way trivial (and opinions differ to what confounds are the \"right\" ones to include)!\n", "\n", "That said, fortunately, Fmriprep also computes an extensive set of timepoint-by-timepoint confounds for each fMRI run, which are conveniently stored in TSV files. (You can really see that Nilearn was designed with Fmriprep's outputs in mind.) We'll load these confounds below:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "conf_path = os.path.join(save_dir, 'derivatives', 'fmriprep', 'sub-001', 'func', 'sub-001_task-MGT_run-01_bold_confounds.tsv')\n", "confounds = pd.read_csv(conf_path, sep='\\t')\n", "confounds" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", " ToDo: Read the Fmriprep documentation about the different confounds (scroll down to the \"Confounds\" section). Note that the confound names in the documentation are slightly different that in our DataFrame because this dataset was preprocessed using an older version of Fmriprep.\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As stated in Fmriprep's documentation, it's probably not a good idea to include *all* confounds in our model. Selecting a subset of confound parameter is, however, also not a trivial matter; which confounds are appropriate and explain to most variance in your data likely depends on the characteristics of your particular dataset (field strength, scan technique, the amount of motion in your data, etc.) and your intended analysis (univariate group-level GLM, functional connectivity, MVPA).\n", "\n", "Here, we'll be relatively conservative and include only the 6 motion parameters (X, Y, Z, RotX, RotY, RotZ) and the non-steady-state outlier (NonSteadyStateOutlier00; i.e., a binary-coded regressor removing the influence of volumes at the beginning of a run that contain too much T1-contrast)." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "confounds2use = confounds.loc[:, ['X', 'Y', 'Z', 'RotX', 'RotY', 'RotZ', 'NonSteadyStateOutlier00']]\n", "confounds2use" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In fact, we can reuse the plot_design_matrix function to visualize this \"confound design matrix\":" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "ax = plot_design_matrix(confounds2use)\n", "# Rescale the colors a little\n", "ax.get_images().set_clim(0, 0.1)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Alright, now we can *finally* fit our model (this may take 2-5 minutes or so, depending on how fast your computer/server is; go get some coffee)!" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "flm.fit(run_imgs=fmri, events=events2use, confounds=confounds2use)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The fit function doesn't return the results of the GLM analysis, but stores it as attributes. 
This includes the constructed design matrix (or matrices, when there is more than one run):" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# the design_matrices_ attribute is a list with, in our case, only a single\n", "# element, so we take the first one\n", "flm_dm = flm.design_matrices_[0]\n", "\n", "# Let's plot it again\n", "ax = plot_design_matrix(flm_dm)\n", "ax.get_images()[0].set_clim(0, 0.2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Another result from the GLM that we can inspect is model fit, expressed as $R^2$ (only when minimize_memory=False):" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Again, it returns a list, and we'll take the first element,\n", "# because we only have one run\n", "r2_img = flm.r_square[0]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The r2_img variable is a 3D Nifti1Image object with voxelwise $R^2$ values. We can visualize this using the plotting module from Nilearn (e.g., using plot_stat_map with an arbitrary threshold of 0.2):" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from nilearn import plotting\n", "plotting.plot_stat_map(r2_img, threshold=0.2);" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Another thing we can inspect is the residual time series from the model fit, which we can retrieve using the residuals attribute (you can ignore the FutureWarning):" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# residuals is also a list; take the first (and only) run's 4D image\n", "resids = flm.residuals[0]\n", "print(type(resids))\n", "print(resids.shape)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", " ToDo (optional): If you want to brush up your Nilearn skills, try computing the standard deviation across time for each voxel in the resids variable. Store the result (a Nifti1Image) in a new variable called r_std. Then, plot the result using plot_stat_map using a threshold of 3. You should clearly see the brain's veins and contours (effects of movement?) in the plot! Hint: perhaps the function nilearn.image.math_img is of use here ... (but this ToDo can be implemented in various ways)\n", "
" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "nbgrader": { "cell_type": "code", "checksum": "054a955d7a34448c1f05c1d52acac0a9", "grade": false, "grade_id": "cell-db2f5d52cf9ec3f2", "locked": false, "schema_version": 3, "solution": true, "task": false }, "tags": [ "raises-exception", "remove-output" ] }, "outputs": [], "source": [ "''' Implement the (optional) ToDo here. '''\n", "# YOUR CODE HERE\n", "raise NotImplementedError()" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "editable": false, "nbgrader": { "cell_type": "code", "checksum": "8d8796be73f500c9b7285586cf78297f", "grade": true, "grade_id": "cell-0cc161bc63e4b5a6", "locked": true, "points": 0, "schema_version": 3, "solution": false, "task": false }, "tags": [ "raises-exception", "remove-output" ] }, "outputs": [], "source": [ "''' Tests the above ToDo. '''\n", "np.testing.assert_almost_equal(r_std.get_fdata().mean(), 0.34867, decimal=5)\n", "print(\"Well done!\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Let's remove the residuals variable, which uses a lot of memory\n", "del flm.residuals, resids" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Computing contrasts from first-level models\n", "The last thing we'll discuss in this section is computing contrasts and thresholding the resulting statistical images. Computing contrasts using FirstLevelModel objects is ridiculously easy. This is done using the compute_contrast method." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", " ToDo: Check out the arguments needed for the compute_contrast method. \n", "
" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "flm.compute_contrast?" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The contrast_def parameter represents the contrast you want compute and can be used in different ways. One way is to specify a contrast vector (a list or a numpy array). For example, if we'd like to evaluate the contrast \"strongly_accept > baseline\", we could specify the contrast vector as follows:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "con_vec = np.zeros(flm.design_matrices_.shape)\n", "con_vec = 1\n", "print(con_vec)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that values of the contrast vector are assumed to relate to the regressors froms the design matrix in the same order (e.g., the first value from the contrast vector corresponds to the first regressor in your design matrix, i.e., the first column).\n", "\n", "Then, we can compute the contrast as following (assuming a single contrast, i.e., stat_type='t', and wanting $z$-values as output, i.e., output_type='z_score'):" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "con_img = flm.compute_contrast(con_vec, stat_type='t', output_type='z_score')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "How does that look like?" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "plotting.plot_stat_map(con_img)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The same contrast can be evaluated by using a string describing a \"formula\". 
In the \"strongly_accept > baseline\" contrast, this is trivial:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [ "raises-exception", "remove-output" ] }, "outputs": [], "source": [ "con_img2 = flm.compute_contrast('strongly_accept', stat_type='t', output_type='z_score')\n", "# check that it is literally the same; if it's not, it will give an error\n", "np.testing.assert_array_equal(con_img.get_fdata(), con_img2.get_fdata())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "But we can also define more complicated contrasts, such as \"(strongly_accept + weakly_accept) > strongly_reject\" (note that this is theoretically a nonsense contrast and just used for the sake of the example):" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Note: the same contrast can be defined using the contrast vector:\n", "# [1, -2, 1, 0, 0, 0, 0, ..., 0]\n", "contrast_def = 'strongly_accept + weakly_accept - 2*strongly_reject'\n", "con_img = flm.compute_contrast(contrast_def, stat_type='t', output_type='z_score')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that $z$-values are not the only output type possible. The other options are:\n", "* 'stat': either the $t$-value or $F$-value, depending on stat_type;\n", "* 'p_value': the $p$-value associated with the $t$ of $F$-value;\n", "* 'effect_size': the dot product of the contrast vector and the parameters ($c\\hat{\\beta}$);\n", "* 'effect_variance': the variance of the dot product of the contrast vector and the parameters ($\\mathrm{var}[c\\hat{\\beta}]$);\n", "* 'all': all of the above" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", " ToDo: Using either the \"formula\" format or the contrast vector format, compute the $t$-values associated with the contrast \"reject > accept\" (regardless of strongly/weakly). Store the resulting image in a variable named reject_gt_accept (gt = greater than) and plot the image using plot_stat_map.\n", "
" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "nbgrader": { "cell_type": "code", "checksum": "f77849adb318b8c7e9275594603497d4", "grade": false, "grade_id": "cell-e0746855f21dab59", "locked": false, "schema_version": 3, "solution": true, "task": false }, "tags": [ "raises-exception", "remove-output" ] }, "outputs": [], "source": [ "''' Implement your ToDo here. '''\n", "# YOUR CODE HERE\n", "raise NotImplementedError()" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "editable": false, "nbgrader": { "cell_type": "code", "checksum": "f0bb6bf87cfd61b62e1edc8ad269881b", "grade": true, "grade_id": "cell-6247afe9a7e31631", "locked": true, "points": 0, "schema_version": 3, "solution": false, "task": false }, "tags": [ "raises-exception", "remove-output" ] }, "outputs": [], "source": [ "''' Tests the above ToDo. '''\n", "np.testing.assert_almost_equal(reject_gt_accept.get_fdata().mean(), 0.04879, decimal=5)\n", "print(\"Well done!\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "So far, we plotted the resulting statistic maps without a threshold (or an arbitrary one), but in a proper analysis, you'd want to statistically threshold your image to reduce the risk of false positives. In Nilearn, this can be done using the function threshold_stats_img." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from nilearn.glm.thresholding import threshold_stats_img" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", " ToDo: Checkout the arguments needed for the threshold_stats_img function. \n", "
" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "threshold_stats_img?" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As you can see, the primary input to the threshold_stats_img function (i.e., stat_img) is assumed to consist of $z$-values. \n", "\n", "Using threshold_stats_img, there are various ways of thresholding your statistical map. Your desired method can be indicated using the height_control parameter (either 'fpr', 'fdr', or 'bonferroni', where 'fpr' refers to simple p-value based thresholding without multiple comparison control). The alpha parameter represents the significance level that should be used. There is also an option to only include significant voxels that belong to a cluster larger than a certain amount of voxels using cluster_threshold (which can make plots a little more visually appealing), but realize that this is by no means a method that guarantees proper false positive control.\n", "\n", "Importantly, the threshold_stats_img function optionally takes a mask (mask_img), which improves sensitivity a lot when using, for example, Bonferroni correction which depends on the number of tests. Also, note that the threshold_stats_img function returns two things: the thresholded map and the threshold that was actually used.\n", "\n", "Anyway, let's focus on the \"strongly_reject > baseline\" contrast and let's threshold it using the 'bonferroni' method with an alpha of 0.01 while included the previously defined mask:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "con_img = flm.compute_contrast('strongly_reject', stat_type='t', output_type='z_score')\n", "con_img_thr, used_threshold = threshold_stats_img(con_img, mask_img=mask_img, height_control='bonferroni', alpha=0.05)\n", "plotting.plot_stat_map(con_img_thr)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", " ToDo: threshold the same image (con_img), but this time using 'fdr' correction, an alpha of 0.01 and a cluster threshold of 50. Store the result in a variable named con_img_thr_fdr.\n", "
" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "nbgrader": { "cell_type": "code", "checksum": "bc35dcf9bfad4196759900e77ae7a139", "grade": false, "grade_id": "cell-3f892147de5d5942", "locked": false, "schema_version": 3, "solution": true, "task": false }, "tags": [ "raises-exception", "remove-output" ] }, "outputs": [], "source": [ "''' Implement your ToDo here. '''\n", "# YOUR CODE HERE\n", "raise NotImplementedError()" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "editable": false, "nbgrader": { "cell_type": "code", "checksum": "603e61346aa6d72c861c28e5d3de01e6", "grade": true, "grade_id": "cell-18b96be3ae9f3560", "locked": true, "points": 0, "schema_version": 3, "solution": false, "task": false }, "tags": [ "raises-exception", "remove-output" ] }, "outputs": [], "source": [ "''' Tests the above ToDo. '''\n", "np.testing.assert_almost_equal(con_img_thr_fdr.get_fdata().max(), 7.47113, decimal=5)\n", "print(\"Well done!\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Before we go on to the next section, we want to show you one last nugget: the make_glm_report function (see [documentation](https://nistats.github.io/modules/generated/nistats.reporting.make_glm_report.html#nistats.reporting.make_glm_report)). This function takes an *already fitted* FirstLevelModel object, a contrast definition, and optionally several parameters related to computing and thresholding the statistic image(s) and returns a summary of the results (as an HTML file, which can be nicely rendered in Jupyter notebooks)." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from nilearn.reporting import make_glm_report\n", "# This may take a couple of seconds\n", "# You can use your trackpad or keyboard keys to navigate the cell below\n", "# (make sure you are in \"edit\" mode, i.e., the cell border should be blue;\n", "# click the cell to enter \"edit\" mode)\n", "make_glm_report(flm, 'strongly_accept - weakly_accept')" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Let's clear up some memory for the last section\n", "%reset -f" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Second-level models\n", "The only major Nilearn functionality that we haven't discussed is *second-level models*. Unlike first-level models, there is basically only one way to fit second-level models: using the SecondLevelModel class." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from nilearn.glm.second_level import SecondLevelModel" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", " ToDo: Check out the arguments needed for initialization of a SecondLevelModel object.\n", "
" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "SecondLevelModel?" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The second-level model class allows you to run random-effects (but not mixed-effects!) models on multiple participants. Note that it assumes that all data have been registered/resampled to a common space (e.g., MNI, such as our data). It's interface is very similar to the FirstLevelModel class (i.e., it also has a fit and compute_contrast method). When constructing a SecondLevelModel object, the most important parameters are mask_img, an optional mask, and smoothing_fwhm, an optional smoothing kernel.\n", "\n", "For our example, let's do a (horribly underpowered) group-level analysis of the two subjects whose data we downloaded. First of all, we need to fit both first-level models. We'll define a function below that does this for a single subject:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import os\n", "import numpy as np\n", "import pandas as pd\n", "import nibabel as nib\n", "from glob import glob\n", "from tqdm.notebook import tqdm\n", "from nilearn import masking\n", "from nilearn.glm.first_level import FirstLevelModel\n", "\n", "def fit_firstlevel(sub, bids_dir, task='MGT', run='01', space='MNI152NLin2009cAsym', \n", " conf_cols=None, **flm_kwargs):\n", " \"\"\" Example function of how you could implement a complete\n", " first-level analysis for a single subject. 
Note that this is\n", " just one way of implementing this; there may be (much more efficient)\n", " ways to do this.\n", " \n", " Parameters\n", " ----------\n", " sub : str\n", " Subject-identifier (e.g., 'sub-01')\n", " bids_dir : str\n", " Path to BIDS directory (root directory)\n", " task : str\n", " Name of task to analyse\n", " run : str\n", " Name of run to analyze\n", " space : str\n", " Name of space of the data\n", " conf_cols : list (or None)\n", " List of confound names to include; if None, only 6 motion params\n", " are included\n", " **flm_kwargs : kwargs\n", " Keyword arguments for the FirstLevelModel constructor\n", " \n", " Returns\n", " -------\n", " flm : FirstLevelModel\n", " Fitted FirstLevelModel object\n", " \"\"\"\n", " \n", " # If conf_cols is not set, let's use a \"standard\" set of\n", " # motion parameters (translation and rotation in 3 dimensions)\n", " if conf_cols is None:\n", " # Note: in new versions of Fmriprep, these variables are named differently,\n", " # i.e., trans_x, trans_y, trans_z, rot_x, rot_y, rot_z\n", " conf_cols = ['X', 'Y', 'Z', 'RotX', 'RotY', 'RotZ']\n", "\n", " # We assume it's a BIDS formatted dataset with the Fmriprep outputs in\n", " # bids_dir/derivatives/fmriprep\n", " bids_func_dir = os.path.join(bids_dir, sub, 'func')\n", " fprep_func_dir = os.path.join(bids_dir, 'derivatives', 'fmriprep', sub, 'func')\n", " \n", " # Let's find the fMRI files, given a particular space (e.g., T1w)\n", " funcs = sorted(glob(os.path.join(fprep_func_dir, f'*space-{space}*_preproc*.nii.gz')))\n", "\n", " # In this loop, we'll find the events/confounds/masks associated with the funcs\n", " confs, events, masks = [], [], []\n", " for func in funcs:\n", " # First, find the associated mask\n", " # Note, this doesn't work for newer versions of Fmriprep, which uses\n", " # a slightly different naming convention for brainmasks (desc-brain_mask)\n", " mask_path = func.replace('preproc', 'brainmask')\n", " masks.append(mask_path)\n", 
"\n", " # Find the associated confounds file\n", " conf_path = func.replace(f'space-{space}_preproc.nii.gz', 'confounds.tsv')\n", " conf_df = pd.read_csv(conf_path, sep='\\t').loc[:, conf_cols]\n", " confs.append(conf_df)\n", " \n", " # Find the associated events file\n", " event_path = os.path.join(bids_dir, sub, 'func', f'{sub}_task-{task}_run-{run}_events.tsv')\n", " event_df = pd.read_csv(event_path, sep='\\t')\n", " \n", " # Exclude 'NoResp' trials (not strictly necessary)\n", " event_df = event_df.query(\"participant_response != 'NoResp'\")\n", " \n", " # Set participant_response as the trial_type\n", " event_df = event_df.rename({'participant_response': 'trial_type'}, axis=1)\n", " events.append(event_df)\n", "\n", " # In case there are multiple masks, create an intersection;\n", " # if not, this function does nothing\n", " mask_img = masking.intersect_masks(masks, threshold=0.8)\n", "\n", " # Construct the first-level model!\n", " # We set the t_r to the first func we have, assuming\n", " # that the TR is the same for each run (if there are multiple runs)\n", " flm = FirstLevelModel(\n", " t_r=nib.load(func).header['pixdim'],\n", " slice_time_ref=0.5,\n", " mask_img=mask_img,\n", " **flm_kwargs\n", " )\n", " \n", " # Finally, fit the model and return the fitted model\n", " flm.fit(run_imgs=funcs, events=events, confounds=confs)\n", " return flm" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we can create a loop across our two subjects, estimating a first-level model for each:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "bids_dir = os.path.join(os.path.expanduser('~'), 'NARPS')\n", "flms = []\n", "\n", "# This may take about 2-10 minutes, go get some coffee!\n", "for sub in tqdm(('sub-001', 'sub-003')):\n", " flm = fit_firstlevel(sub, bids_dir, drift_model='cosine', high_pass=0.1)\n", " flms.append(flm)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Alright, so now we have 
two fitted FirstLevelModel objects (stored in flms). We will use these for our second-level model. Like FirstLevelModel objects, the SecondLevelModel object has a fit method, but it is a bit more complicated, because there are different ways to specify the input for second-level models (i.e., the second_level_input parameter):\n", "1. a list of fitted FirstLevelModel objects (easy, but limited to within-subject/run analyses only);\n", "2. a list of first-level contrast images (i.e., with $c\\hat{\\beta}$ values);\n", "3. a pandas DataFrame with information about the lower-level inputs (useful when using first-level output from other packages, such as FSL, but for Nilearn-only analyses, this is not very useful)\n", "\n", "The first method is quite efficient (in terms of code), because it assumes that you are only interested in an intercept-only group-level model (i.e., the average of a particular first-level contrast across participants), and you can leave out the second-level design matrix. We'll show this below:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# We don't supply any mask for simplicity\n", "slm_method1 = SecondLevelModel(smoothing_fwhm=5)\n", "slm_method1.fit(flms) # we don't have give it a design matrix!" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The second option is to give the SecondLevelModel a list of first-level contrast images in combination with a second-level design matrix. First, let's compute a first-level contrast (say \"strongly_accept\") for both participants:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fl_cons = []\n", "for flm in flms:\n", " con = flm.compute_contrast('strongly_accept', stat_type='t', output_type='effect_size')\n", " fl_cons.append(con)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we only need to construct a second-level design matrix, which should be a pandas DataFrame. 
The columns in this DataFrame represent the different predictors for our second-level model. For simplicity, let's just define an intercept-only design matrix (exactly the same as we did using method 1):" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Specifying the index is not stringly necessary, but it looks nice\n", "slm_dm = pd.DataFrame([1, 1], columns=['intercept'], index=('sub-001', 'sub-003'))\n", "slm_dm" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Using the first-level contrast images (fl_cons) and the second-level design matrix (slm_dm), we can fit our second-level model:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "slm_method2 = SecondLevelModel(smoothing_fwhm=5)\n", "slm_method2.fit(fl_cons, design_matrix=slm_dm)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Using this fitted second-level model for either method, we can also compute second-level contrasts. The way this is done is slightly different for each method. For method 1 (using fitted FirstLevelModel objects as inputs), the only option is to use the first_level_contrast parameter in the compute_contrast method. You can only use contrast definitions that were already present in the first-level model (because there is no second-level design matrix). 
\n", "\n", "For example, if we'd want to compute the average (across participants) group-level contrast of \"strongly_accept > baseline\", we'd do:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "sr_average_method1 = slm_method1.compute_contrast(first_level_contrast='strongly_accept', output_type='z_score')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For method 2 (using first-level contrast images + a second-level design matrix), we cannot use the first_level_contrast parameter; instead, we should use the second_level_contrast parameter, which applies to contrasts based on the second-level design matrix. Because we also defined an intercept-only model for method 2, the only contrast we *can* evaluate is the intercept contrast:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "sr_average_method2 = slm_method2.compute_contrast(second_level_contrast='intercept', output_type='z_score')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that the the results from both methods are virtually the same; the only difference is the way we used the SecondLevelModel interface. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", " ToDo Alright then, one last ToDo! Suppose that we want to construct a group-level model that investigates the difference between the first-level \"strongly_accept\" contrast images from sub-001 and sub-003 (a nonsensical analysis for just two subjects, but just ignore that for now). Can you construct an appropriate second-level design matrix for this analysis? Store this design matrix (a DataFrame) in a variable called slm_dm_todo.\n", "
" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "nbgrader": { "cell_type": "code", "checksum": "c2c901078799bebe40fe506f988a01e6", "grade": false, "grade_id": "cell-c7bb387bde078ef4", "locked": false, "schema_version": 3, "solution": true, "task": false }, "tags": [ "raises-exception", "remove-output" ] }, "outputs": [], "source": [ "''' Implement your ToDo here. '''\n", "# YOUR CODE HERE\n", "raise NotImplementedError()" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "editable": false, "nbgrader": { "cell_type": "code", "checksum": "8b9d35e8174ff5050192962c251cf5c1", "grade": true, "grade_id": "cell-71c713db05860c41", "locked": true, "points": 0, "schema_version": 3, "solution": false, "task": false }, "tags": [ "raises-exception", "remove-output" ] }, "outputs": [], "source": [ "''' Tests the above ToDo. '''\n", "np.testing.assert_array_equal(slm_dm_todo.sum(axis=0).values, [1, 1])\n", "np.testing.assert_array_equal(slm_dm_todo.sum(axis=1).values, [1, 1])\n", "print(\"Well done!\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Concluding remarks\n", "Hopefully this tutorial gave you an idea how to use Nilearn to run statistical models! We recommend you check out the ample excellent tutorials and examples on the Nilearn [website](https://nilearn.github.io/) or to check out the codebase on [Github](https://github.com/nilearn/nilearn). As said before, Nilearn is a relatively novel project and its codebase may still change, so keep an eye out for new releases and extended functionality!\n", "\n", "That said, we hope that this tutorial helps you to get started with your analyses using Nilearn.
\n", "Happy hacking!" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.5" }, "toc": { "base_numbering": 1, "nav_menu": { "height": "170px", "width": "259px" }, "number_sections": true, "sideBar": true, "skip_h1_title": true, "title_cell": "Table of Contents", "title_sidebar": "Contents", "toc_cell": true, "toc_position": {}, "toc_section_display": true, "toc_window_display": false } }, "nbformat": 4, "nbformat_minor": 2 }