Time Series Data in Surface Space

From CCN Wiki
Jump to navigation Jump to search

We can calculate mean time course vectors calculated across all voxels within regions defined in a FreeSurfer annotation (.annot) file (we can also extract the time courses on a vertex-by-vertex basis -- see below). Though any annotation file can be used for this purpose, we have been working at the scale of the Lausanne parcellation. A script exists, gettimecourses.sh, that will handle this:

gettimecourses.sh

Update Jan 11, 2018: I added a feature that uses the filepattern string to determine whether to use the self or fsaverage annotation.

#!/bin/bash
USAGE="Usage: gettimecourses.sh annot filepattern sub1 ... subN"

if [ "$#" == "0" ]; then
        echo "$USAGE"
        exit 1
fi

#first two parameters are is the annot files and filepatterns for the \
#.nii.gz time series to be detrended, up to the hemisphere indicator
#e.g., detrend.fmcpr.sm6.self.?h.nii.gz would use detrend.fmcpr.sm6.self as the filepattern
annot="$1"
shift
filepat="$1"
shift

#subjects
subs=( "$@" );
#hemispheres
hemis=( lh rh )

for sub in "${subs[@]}"; do
source_dir=${SUBJECTS_DIR}/${sub}/bold

#Which annotation? Look for "fsaverage" or "self" in the filepat
  if [[ "${filepat}" =~ "fsaverage" ]]
  then
     SURF="fsaverage"
  elif [[ "${filepat}" =~ "self" ]]
  then
     SURF=${sub}
  fi

   if [ ! -d ${source_dir} ]; then
       #The subject_id does not exist
       echo "${source_dir} does not exist!"
   else
       cd ${source_dir}
       readarray -t runs < runs
       for hemi in "${hemis[@]}"; do
               for r in "${runs[@]}"; do
                       mri_segstats \
                       --annot ${SURF} ${hemi} ${annot} \
                       --i ${source_dir}/${r}/${filepat}.${hemi}.mgh \
                       --sum ${sub}_${hemi}_${annot}_${r}.${filepat}.sum.txt \
                       --avgwf ${sub}_${hemi}_${annot}_${r}.${filepat}.wav.txt
               done
       done
    fi
done

Running the Script

This script has two preconditions (conditions that must be satisfied prior to running the script)

  1. The first is that the data have been detrended following the procedure for detrending FreeSurfer data.
  2. The second precondition is that there should be a file called 'runs' in the bold/ directory.

If you have followed the instructions for detrending your data, both conditions should already be met.

The script is run in a terminal by specifying the name of an annot file that can be found in the $SUBJECT_ID/label/ directory, a file pattern, followed by a list of subject IDs:

gettimecourses.sh lausanne fmcpr.siemens.sm6.self FS_T1_501

The above command would look in the /label directory for subject FS_T1_501 for lh.lausanne.annot and rh.lausanne.annot, and extract the mean time series for each defined region within each .annot file for the bold data found in the file fmcpr.sm6.self.?h.mgz in the run folders indicated in the runs file. A series of output files is produced in the subject's bold/ directory. The time series files are named ${sub}_?h_${annot}_${run}.${filepattern}.wav.txt

A Trick

Similar to the shortcut described for detrending the data, you can take advantage of a subjects file in your SUBJECTS_DIR:

FILEPATTERN=fmcpr.siemens.sm4.self
ANNOT=lausanne
gettimecourses.sh $ANNOT $FILEPATTERN `cat subjects`

Decoding the Regions

The time series output files are plain text files with rows=time points and columns=regions. In the case of the Lausanne 2008 parcellation, there are approximately 500 columns per file. Interpreting these data will require some sort of reference for the identity of each column.

As you might expect, there is a relationship between the column order and the segmentation ID, as confirmed by this archived email exchange involving yours truly: Extracting resting state time series from surface space using Lausanne 2008 parcellation

When you ran the gettimecourses.sh script, two files were created for each run folder: one containing the time series data (*.wav.txt) and one with some summary data (*.sum.txt). You can use the information from the 5th column of the .sum files to determine the segment labels corresponding to the columns of your time series data. A MATLAB function, ParseFSSegments.m, has been written (ubfs /Scripts/Matlab folder) that will extract the data from these summary files.

%if called without providing filename will launch a dialog box to select the .sum.txt to parse 
lh_summary=parseFSSegments(); 
%If you already know the filename, you can provide it as a parameter
rh_summary=parseFSSegments('FS_T1_501_rh_myaparc_60_005.fmcpr.sm6.sum.txt'); 

lh_region_names=lh_summary.segname;
rh_region_names=rh_summary.segname; 

If there are regions that you have dropped from your time series data (e.g., Corpus Callosum), you will likely want to also drop those labels from your list of region names to maintain a correspondence between the region labels and your time series columns:

%In this example, my lausanne annotations have Corpus Callosum as the first label in both lh and rh files, but I have dropped the first column when I imported my time series
%I will consequently be dropping the first label from each of the lists of region names:
lh_region_names=lh_region_names(2:end);
rh_region_names=rh_region_names(2:end);

Save Region Names

To make a single cell array containing the names of all your regions, you'll need to concatenate your lh and rh region names, and prefix them to distinguish between regions with corresponding names in each hemisphere:

%Concatenate the 499 + 501 element cell arrays into a single 1000-element cell array
%Note that I am concatenating the lh label names followed by the rh label names because 
%that is the order of the columns in the time series data (reading across columns left to right)
labels=vertcat(lh_region_names, rh_region_names);

%Now make a list of 'lh.' and 'rh.' prefixes that I will prepend to each of the labels
%Note that I am making sure that my list of prefixes has 499 'lh.' entries followed by 501 'rh.' entries, corresponding to how I concatenated my labels
ells=repmat({'lh.'},length(lh_region_names),1);
arrs=repmat({'rh.'},length(rh_region_names),1);
prefixes=vertcat(ells,arrs);

%Now use cellfun to prepend my prefixes to my labels
allregions=cellfun(@(x,y) [x y], prefixes, labels, 'UniformOutput', false);
save('region_labels.mat', 'allregions'); %save so we don't have to do this again!

Using Region Names to Select Relevant Columns

Now we can find the columns in our data corresponding to the region labels we care about. For example, suppose I wanted to restrict a time series analysis to the left STG:

idx=strfind(labels, 'lh.superiortemporal'); %this will find any label that contains the substring 'lh.superiortemporal'
idxvals=find(not(cellfun('isempty', idx))); %Find the indices of the non-empty cells from the previous step. Voila!

Note that the segmentation names will be consistent across participants for a given segmentation scheme. In practical terms, this means you should only have to do this once for a given .annot file, though it's probably worth double checking a few .sum.txt files to make sure the regions are listed in the same order because Murphy's law dictates that sometimes things get screwed up.

Other Information

The FreeSurfer command mris_anatomical_stats uses an .annot file to query properties of a surface. The output of this program can be used to determine anatomical characteristics of each region. These values might be useful for modeling purposes.

SUBJECT_ID=FS_T1_501
ANNOT=lausanne
HEMIS=( "lh" "rh" )
for hemi in "${HEMIS[@]}"; do
   mris_anatomical_stats -a ${SUBJECT_ID}/label/${hemi}.${ANNOT}.annot -f ${SUBJECTS_DIR}/${SUBJECT_ID}/${hemi}.${ANNOT}.stats.txt ${SUBJECT_ID} ${hemi}
done

The above BASH code snippet will report characteristics for subject FS_T1_501, using the ?h.lausanne.annot files in his/her labels/ directory. The for-loop is a concise way of ensuring that the lh and rh hemispheres are measured, with the results being written to lh.lausanne.stats.txt and rh.lausanne.stats.txt in his/her subject directory (i.e., in $SUBJECTS_DIR/FS_T1_501/)

Vertex-wise Time Series

The above methods extract the mean time series across all the voxels contained in the regions indexed by the vertices in a given region or set of regions. Some techniques, such as MVPA, however, work on the single-voxel level. I asked the Freesurfer mailing list about this and got the following suggestion for using MATLAB to obtain 1 time series per vertex:

waveforms = fast_vol2mat(MRIread('waveform.nii.gz'));

Now What?

Your time series data can be used to carry out a functional connectivity analysis (the correlational or neural network approach) or machine learning classification.