Detrending FreeSurfer Data
Over the course of a run, there can be a linear drift in the signal in different regions of the brain. There are many possible causes for this that have nothing to do with any interesting aspect of your data -- in other words, this linear drift is a nuisance artifact. The second step is to remove this signal drift from the data because it can introduce spurious correlations between two unrelated time series. You can see this for yourself in a quick experiment you could whip up in Excel: take two vectors of 100 randomly generated numbers (e.g., randbetween(1,99)). They should be uncorrelated. Now add 1, 2, 3, ... , 99, 100 to the values in each vector. This simulates a linear trend in the data. You shouldn't be surprised to find that the two vectors are now highly and positively correlated!
A script has been written called detrend.sh that removes the linear trend in your BOLD data. The latest version of this script can be found in /usr/local/sbin (which should be in your $PATH):
Update: The script has been modified to also regress out motion parameters from the mcprextreg files. This modification is not (yet) reflected in the code below.
The Script
The most recent version of the BASH script (detrend.sh) is below:
detrend.sh
#!/bin/bash USAGE="Usage: detrend.sh filepattern surf sub1 ... subN" if [ "$#" == "0" ]; then echo "$USAGE" exit 1 fi #first parameter is the filepattern for the .nii.gz time series to be detrended, up to the surface indicator #e.g., fmcpr.siemens.sm6.fsaverage.?h.nii.gz would use fmcpr.siemens.sm6 as the filepattern, fsaverage as the surf filepat="$1" shift #second parameter should be specified either as self or fsaverage surf="$1" shift #after the shift commands, all the arguments are shifted down two places and the first #2 arguments (the filepattern and surface) fall off the list. #The remaining arguments should be subject_ids subs=( "$@" ); hemis=( "lh" "rh" ); for sub in "${subs[@]}"; do source_dir=${SUBJECTS_DIR}/${sub}/bold echo ${source_dir} if [ ! -d ${source_dir} ]; then #The subject_id does not exist echo "${source_dir} does not exist!" else cd ${source_dir} readarray -t runs < runs for r in "${runs[@]}"; do if [ -n "${r}" ]; then #the -n test makes sure that the run number is not an empty string #caused by a trailing newline in the runs file for hemi in "${hemis[@]}"; do cd ${source_dir}/${r} pwd #subject_id does exist. Detrend if [ "${surf}" = "self" ]; then SURFTOUSE=${sub} else SURFTOUSE=fsaverage fi mri_glmfit --y ${source_dir}/${r}/${filepat}.${surf}.${hemi}.nii.gz \ --glmdir ${source_dir}/${r}/${hemi}.detrend \ --qa --save-yhat --eres-save \ --surf ${SURFTOUSE} ${hemi} mv ${source_dir}/${r}/${hemi}.detrend/eres.mgh ${source_dir}/${r}/${filepat}.${surf}.${hemi}.mgh done #now detrend the mni305 file mri_glmfit --y ${source_dir}/${r}/${filepat}.mni305.2mm.nii.gz \ --glmdir ${source_dir}/${r}/mni305.detrend \ --qa --save-yhat --eres-save mv ${source_dir}/${r}/mni305.detrend/eres.mgh ${source_dir}/${r}/${filepat}.${surf}.mni305.2mm.mgh fi done fi done
Running the Script
Before you run these scripts
Before running either of the scripts on this page, you will need to create a text file called 'runs' in the bold/ directory for each subject's dataset, e.g.,
- FS_T1_501/
- bold/
- runs
- 005/
- 006/
- bold/
The runs
file simply lists each run folder on its own line:
005 006
The detrend.sh script uses this file to determine the folders containing the data to be detrended. If this file doesn't already exist, you can manually generate it in any text editor (e.g., nano runs
or gedit runs
), but the quickest method takes advantage of the fact that the run folders all start with 0, and uses common command-line utilities:
SUBJECT=FS_501 cd ${SUBJECTS_DIR}/${SUBJECT}/bold #if there are fewer than 10 runs of BOLD data, then the run directories will probably have 2 leading zeros ls -1 | grep "^00*" > runs
Example script
Assuming all your subject folders have the same run folders to detrend, you would detrend multiple subjects using detrend.sh, specifying a file pattern for the source data (i.e., the name of the preprocessed files generated by FS-FAST, omitting anything after the '?h' hemisphere identifier), followed by a list of subject IDs:
#A SPECIFIC EXAMPLE: (note these parameters may differ substantially from what you would be typing in) detrend.sh fmcpr.sm6 self FS_T1_501 FS_T2_501 FS_T1_505 FS_T2_505 #For a more generalizable example of how you should call this function, see the section below using variables
The gist is that it calls the mri_glmfit function and saves the residuals after the linear trend has been removed from the data. Multiple files are generated in ?h.detrend/ directories in each run directory. The detrended data is subsequently copied back to the run directory as a new file called ${filepat}.?h.mgh, where ${filepat} is whatever file pattern you provided to the script (note that the source data are .nii.gz files, whereas the detrended data are .mgh files).
White Matter and CSF Signal
It is among best practices in functional connectivity studies to regress out WM and CSF signals from your time series before computing time series correlations, at least for resting state studies (Muschelli et al (2014)). We can add WM and CSF signal to the set of nuisance regressors when detrending our data (note that the procedure below removes the mean WM + CSF as a single component; it can be modified to remove WM and CSF independently):
Create a configuration file
An answer to a question I posted to the FreeSurfer mailing list pointed me to a walkthrough for functional connectivity analyses in FreeSurfer is the lead that I'm following. We will be using fcseed-sess to pull out our mean signal from seed regions. In our case, our seed regions will be from the white matter and csf masks. It's a 2-step process. The first step is to create a configuration file:
fcseed-config -cfg wmcsf.cfg -wm -vcsf -fsd bold -mean
This assumes that the BOLD data are in ${SUBJECTS_DIR}/${SUBID}/bold (-fsd), which is likely going to be the case if you've been following the FreeSurfer pipeline on this wiki. I ran the above step in ${SUBJECTS_DIR}.
Computing WM and CSF Regressor Values
Now you will have a configuration file, wmcsf.cfg that you will pass to the next step that performs the math on the fmri data:
fcseed-sess -s ${SUBID} -cfg wmcsf.cfg
or
fcseed-sess -sf subjects -cfg wmcsf.cfg
This will use the config file just created to extract a seed regressor for ${SUBID} (or for all subjects listed in your subjects file if you use the -sf switch to run in batch mode). You'll find a file called wmcsf in each of your BOLD/run directories. This file is a plaintext file with one column of values that you can use as a nuisance regressor. The detrend script could be modified to first run fcseed-sess to generate this value to include in the nuisance regressor matrix.
If you wanted to remove WM and CSF as two separate components, you would need to generate two separate .cfg files, and then run fcseed-sess twice.
Scripting
These steps have been combined with the detrend.sh script above to remove linear trends, motion parameters and wm+csf signal (as a single component). You will find detrend_wmcsf.sh in the ubfs/Scripts/Shell folder. However the script is reproduced below:
Note: I think there's a bug that prevents the script from working correctly when passed more than 1 subject at a time.
detrend_wmcsf.sh
#!/bin/bash USAGE="Usage: detrend_wmcsf.sh filepattern surf SUBJECT" if [ "$#" == "0" ]; then echo "$USAGE" exit 1 fi #first parameter is the filepattern for the .nii.gz time series to be detrended, up to the surface indicator #e.g., fmcpr.siemens.sm6.fsaverage.?h.nii.gz would use fmcpr.siemens.sm6 as the filepattern, fsaverage as the surf filepat="$1" shift #second parameter should be specified either as self or fsaverage surf="$1" shift #after the shift command, all the arguments are shifted down one place and the first argument (the filepattern) #falls off the list. The remaining arguments should be subject_ids subs=( "$@" ); hemis=( "lh" "rh" ); #Make sure we're in SUBJECTS_DIR to start so that we can make some assumptions about relative file locations cd ${SUBJECTS_DIR} #first, create configuration file for the wm & csf regression echo "Making fcseed config file..." fcseed-config -cfg wmcsf.cfg -wm -vcsf -fsd bold -mean echo "done" echo "Processing subjects..." for sub in "${subs[@]}" do source_dir=${SUBJECTS_DIR}/${sub}/bold echo ${source_dir} if [ ! -d ${source_dir} ]; then #The subject_id does not exist echo "${source_dir} does not exist!" else #The subject bold directory exists. Proceed. fcseed-sess -s ${sub} -cfg ${SUBJECTS_DIR}/wmcsf.cfg echo "Completed wm & csf nuisance regressor calculation for ${sub}" cd ${source_dir} ########readarray -t runlist < runs ### This is fine and all on Linux but MacOS does not have readarray #This is a MacOS compatible alternative to readarray that also works on Linux while IFS= read r; do runlist+=($r) done < runs for r in "${runlist[@]}"; do if [ -n "${r}" ]; then #the -n test makes sure that the run number is not an empty string #caused by a trailing newline in the runs file cd ${source_dir}/${r} pwd #subject_id does exist. Detrend if [ "${surf}" = "self" ]; then SURFTOUSE=${sub} else SURFTOUSE=fsaverage fi #this will generate the QA matrix in lh.detrend if needed: if [ ! -f ${source_dir}/${r}/lh.detrend/Xg.dat ]; then mri_glmfit --y ${source_dir}/${r}/${filepat}.${surf}.lh.nii.gz \ --glmdir ${source_dir}/${r}/lh.detrend \ --qa \ --surf ${SURFTOUSE} lh fi #copy and merge the QA, WMCSF and MC matrices echo "Generating nuisance regressor matrix for ${hemi} ${r}" paste ${source_dir}/${r}/lh.detrend/Xg.dat wmcsf mcprextreg > nuisance for hemi in "${hemis[@]}"; do #regress out nuisance mri_glmfit --y ${source_dir}/${r}/${filepat}.${surf}.${hemi}.nii.gz \ --glmdir ${source_dir}/${r}/${hemi}.detrend \ --X nuisance \ --no-contrasts-ok --save-yhat --eres-save \ --surf ${SURFTOUSE} ${hemi} #move and rename detrended files mv ${source_dir}/${r}/${hemi}.detrend/eres.mgh ${source_dir}/${r}/${filepat}.${surf}.${hemi}.mgh done #now detrend the mni305 file mri_glmfit --y ${source_dir}/${r}/${filepat}.mni305.2mm.nii.gz \ --glmdir ${source_dir}/${r}/mni305.detrend \ --X nuisance --no-contrasts-ok --save-yhat --eres-save mv ${source_dir}/${r}/mni305.detrend/eres.mgh ${source_dir}/${r}/${filepat}.${surf}.mni305.2mm.mgh fi done fi done
Using variables
As I mentioned in the comments in the sample code snippet above, what you type depends entirely on the filenames, which in turn depend entirely on how the data were preprocessed. You can use environment variables to help walk you through figuring out the correct file patterns. Also, a handy shortcut exists if you happen to have a subjects
file in $SUBJECTS_DIR
. Putting these two techniques together:
FILEPATTERN=fmcpr #the preprocessed files will almost always be called fmcpr SMOOTHING=".sm4" #how much smoothing did you use when you ran preproc-sess? SLICETIME=".up" #[".up" | ".down" | ".siemens" | OMIT ] SURFACE=fsaverage #[self | fsaverage] detrend.sh ${FILEPATTERN}${SLICETIME}${SMOOTHING} $SURFACE `cat subjects`
This will execute the detrend.sh script on all the subjects listed in the subjects text file, using the fsaverage surface. The assemblage of variables will look for files named fmcpr.up.sm4.*