Time Series Data in Surface Space

From CCN Wiki

After linear trends have been removed from the data, we can compute mean time course vectors across all voxels within regions defined in a FreeSurfer annotation (.annot) file (we can also extract the time courses on a vertex-by-vertex basis -- see below). Though any annotation file can be used for this purpose, we have been working at the scale of the Lausanne parcellation. A script exists, surfacetimecourses.sh, that will handle this:

surfacetimecourses.sh

This script should be found in /opt/scripts and should be in your path. If not, let Chris know about the oversight. In addition to the lh and rh surfaces, the script also extracts timecourses from each segment in MNI305 subcortical space.
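Before running anything, it is worth confirming that the script is actually visible on your PATH. A quick check (assuming the /opt/scripts install location mentioned above):

```shell
# Check whether surfacetimecourses.sh is reachable on PATH and report where it lives.
if command -v surfacetimecourses.sh >/dev/null 2>&1; then
    command -v surfacetimecourses.sh
else
    echo "surfacetimecourses.sh not found; check that /opt/scripts is in PATH" >&2
fi
```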

#!/bin/bash
USAGE="Usage: surfacetimecourses.sh annot filepattern surface sub1 ... subN"

if [ "$#" -eq 0 ]; then
    echo "$USAGE"
    exit 1
fi

# The first three parameters are the annot file, the filepattern for the
# .nii.gz time series to be detrended (up to the hemisphere indicator),
# and the surface. E.g., for fmcpr.up.sm4.fsaverage.?h.mgh, the filepattern
# is fmcpr.up.sm4 and the surface is fsaverage.
annot="$1"
shift
filepat="$1"
shift
surf="$1"
shift

# subjects
subs=( "$@" )
# hemispheres
hemis=( lh rh )

for sub in "${subs[@]}"; do
    source_dir=${SUBJECTS_DIR}/${sub}/bold

    SURF="fsaverage" # by default

    # Which annotation? Look for "fsaverage" or "self" in the surf
    if [[ "${surf}" =~ "fsaverage" ]]; then
        SURF="fsaverage"
    elif [[ "${surf}" =~ "self" ]]; then
        SURF=${sub}
    fi

    if [ ! -d "${source_dir}" ]; then
        # The subject_id does not exist
        echo "${source_dir} does not exist!"
    else
        cd "${source_dir}"
        readarray -t runs < runs
        for r in "${runs[@]}"; do
            for hemi in "${hemis[@]}"; do
                mri_segstats \
                    --annot ${SURF} ${hemi} ${annot} \
                    --i ${source_dir}/${r}/${filepat}.${surf}.${hemi}.mgh \
                    --sum ${sub}_${hemi}_${annot}_${r}.${filepat}.sum.txt \
                    --avgwf ${sub}_${hemi}_${annot}_${r}.${filepat}.wav.txt
            done
            # get time course from each segment in mni305 space
            mri_segstats \
                --seg ${FREESURFER_HOME}/subjects/fsaverage/mri.2mm/aseg.mgz \
                --i ${source_dir}/${r}/${filepat}.mni305.2mm.mgh \
                --ctab-default \
                --sum ${sub}_subcort_${r}.${filepat}.sum.txt \
                --avgwf ${sub}_subcort_${r}.${filepat}.wav.txt
        done
    fi
done

Running the Script

This script has two preconditions (conditions that must be satisfied prior to running it):

  1. The data have been detrended following the procedure for detrending FreeSurfer data.
  2. There is a file called 'runs' in the bold/ directory, listing the run folder names one per line.

If you have followed the instructions for detrending your data, both conditions should already be met.
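As a concrete illustration, the runs file is just a plain-text list of run folder names. A minimal sketch of what the script's readarray call sees (the run names 001 and 002 here are hypothetical):

```shell
# Create a toy 'runs' file and read it exactly the way the script does.
tmp=$(mktemp -d)
printf '001\n002\n' > "$tmp/runs"      # hypothetical run folder names, one per line
readarray -t runs < "$tmp/runs"        # one array element per line
echo "found ${#runs[@]} runs: ${runs[*]}"
# → found 2 runs: 001 002
```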

The script is run in a terminal by specifying the name of an annot file that can be found in the $SUBJECT_ID/label/ directory, a file pattern, a surface (fsaverage or self), followed by a list of subject IDs:

surfacetimecourses.sh lausanne fmcpr.siemens.sm6 fsaverage FS_T1_501

The above command would look in the fsaverage/label directory for lh.lausanne.annot and rh.lausanne.annot, and extract the mean time series for each region defined in each .annot file from the bold data found in the files fmcpr.siemens.sm6.fsaverage.?h.mgh and fmcpr.siemens.sm6.mni305.2mm.mgh in the run folders listed in the runs file. A series of output files is produced in the subject's bold/ directory. The time series files are named ${sub}_?h_${annot}_${run}.${filepattern}.wav.txt
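You can preview the output naming by stitching the pieces together yourself; a small sketch (the subject, run, and annot values are just examples):

```shell
# Assemble the expected .wav.txt output filename the same way the script does.
sub=FS_T1_501; hemi=lh; annot=lausanne; r=001; filepat=fmcpr.siemens.sm6
echo "${sub}_${hemi}_${annot}_${r}.${filepat}.wav.txt"
# → FS_T1_501_lh_lausanne_001.fmcpr.siemens.sm6.wav.txt
```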

Supplemental .annot Files

The stock set of .annot files in the default fsaverage/label folder may not include the parcellation that you're interested in. Look in the UBFS openfmri/label folder for supplemental .annot files that have been generated for other projects. The lausanne parcellation is found in there.

A Trick

Similar to the shortcut described for detrending the data, you can take advantage of a subjects file in your SUBJECTS_DIR:

FILEPATTERN=fmcpr.up.sm4
ANNOT=lausanne
SURF=fsaverage
surfacetimecourses.sh $ANNOT $FILEPATTERN $SURF `cat subjects`

Decoding the Regions

The time series output files are plain text files with rows=time points and columns=regions. In the case of the Lausanne 2008 parcellation, there are approximately 500 columns per file. Interpreting these data will require some sort of reference for the identity of each column.

As you might expect, there is a relationship between the column order and the segmentation ID, as confirmed by this archived email exchange involving yours truly: Extracting resting state time series from surface space using Lausanne 2008 parcellation

When you ran the surfacetimecourses.sh script, two files were created for each run folder: one containing the time series data (*.wav.txt) and one with some summary data (*.sum.txt). You can use the information from the 5th column of the .sum files to determine the segment labels corresponding to the columns of your time series data. A MATLAB function, parseFSSegments.m, has been written (ubfs /Scripts/Matlab folder) that will extract the data from these summary files.

%if called without providing filename will launch a dialog box to select the .sum.txt to parse 
lh_summary=parseFSSegments(); 
rh_summary=parseFSSegments();
%If you already know the filename, you can provide it as a parameter
mni_summary=parseFSSegments('FS_0183_subcort_001.fmcpr.up.sm4.sum.txt'); 

lh_region_names=lh_summary.segname;
rh_region_names=rh_summary.segname; 
mni_region_names=mni_summary.segname;
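If MATLAB isn't handy, the label column can also be pulled out in the shell. A sketch using awk on a toy .sum.txt (the header and rows below are fabricated stand-ins for real mri_segstats output, which places the segment label in the 5th column):

```shell
# Toy .sum.txt: mri_segstats writes '#' header lines, then one row per segment,
# with the segment label in the 5th column (the values below are made up).
tmp=$(mktemp -d)
cat > "$tmp/toy.sum.txt" <<'EOF'
# Title Segmentation Statistics
# ColHeaders Index SegId NVertices Area_mm2 StructName
  1   1   1000   650.0   corpuscallosum
  2   2   2400  1800.5   superiortemporal
EOF
# Skip the comment lines and print the label column.
awk '!/^#/ {print $5}' "$tmp/toy.sum.txt"
```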

If there are regions that you have dropped from your time series data (e.g., Corpus Callosum), you will likely want to also drop those labels from your list of region names to maintain a correspondence between the region labels and your time series columns:

%In this example, my lausanne annotations have Corpus Callosum as the first label in both lh and rh files, but I have dropped the first column when I imported my time series
%I will consequently be dropping the first label from each of the lists of region names:
lh_region_names=lh_region_names(2:end);
rh_region_names=rh_region_names(2:end);

If you have saved the vectors of dropped columns, this would be a handy place to recycle this information:

lidx=1:length(lh_summary.segname);
lidx(ldrop)=[];
lh_region_names=lh_summary.segname(lidx);
ridx=1:length(rh_summary.segname);
ridx(rdrop)=[];
rh_region_names=rh_summary.segname(ridx);
mnidx=1:length(mni_summary.segname);
mnidx(mnidrop)=[];
mni_region_names=mni_summary.segname(mnidx);

Save Region Names

To make a single cell array containing the names of all your regions, you'll need to concatenate your region names, and prefix them to distinguish between regions with corresponding names in each hemisphere:

%Concatenate the 3 cell arrays into a single cell array
%Note that I am concatenating the lh label names followed by the rh label names and then mni label names because 
%that is the order of the columns in the time series data (reading across columns left to right)
labels=vertcat(lh_region_names, rh_region_names, mni_region_names);

%Now make a list of 'lh.' and 'rh.' prefixes that I will prepend to each of the labels
%Note that I am making sure that my list of prefixes has 499 'lh.' entries, followed by 501 'rh.' entries, followed by the 'mni.' entries, corresponding to how I concatenated my labels
ells=repmat({'lh.'},length(lh_region_names),1);
arrs=repmat({'rh.'},length(rh_region_names),1);
emms=repmat({'mni.'},length(mni_region_names),1);
prefixes=vertcat(ells,arrs, emms);

%Now use cellfun to prepend my prefixes to my labels
allregions=cellfun(@(x,y) [x y], prefixes, labels, 'UniformOutput', false);
save('region_labels.mat', 'allregions'); %save so we don't have to do this again!

Using Region Names to Select Relevant Columns

Now we can find the columns in our data corresponding to the region labels we care about. For example, suppose I wanted to restrict a time series analysis to the left STG:

idx=strfind(labels, 'lh.superiortemporal'); %this will find any label that contains the substring 'lh.superiortemporal'
idxvals=find(not(cellfun('isempty', idx))); %Find the indices of the non-empty cells from the previous step. Voila!

Note that the segmentation names will be consistent across participants for a given segmentation scheme. In practical terms, this means you should only have to do this once for a given .annot file, though it's probably worth double checking a few .sum.txt files to make sure the regions are listed in the same order because Murphy's law dictates that sometimes things get screwed up.
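One way to run that sanity check from the shell: extract the label column from two .sum.txt files and diff them; identical ordering produces no diff output. A sketch (the file names in the commented call are placeholders):

```shell
# Compare segment label order between two mri_segstats summary files.
# diff exits 0 and prints nothing when the region lists match exactly.
check_order() {
    diff <(awk '!/^#/ {print $5}' "$1") <(awk '!/^#/ {print $5}' "$2") \
        && echo "region order matches"
}
# Example (placeholder file names):
# check_order sub1_lh_lausanne_001.fmcpr.up.sm4.sum.txt \
#             sub2_lh_lausanne_001.fmcpr.up.sm4.sum.txt
```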

Other Information

The FreeSurfer command mris_anatomical_stats uses an .annot file to query properties of a surface. The output of this program can be used to determine anatomical characteristics of each region. These values might be useful for modeling purposes.

SUBJECT_ID=FS_T1_501
ANNOT=lausanne
HEMIS=( "lh" "rh" )
for hemi in "${HEMIS[@]}"; do
   mris_anatomical_stats -a ${SUBJECTS_DIR}/${SUBJECT_ID}/label/${hemi}.${ANNOT}.annot -f ${SUBJECTS_DIR}/${SUBJECT_ID}/${hemi}.${ANNOT}.stats.txt ${SUBJECT_ID} ${hemi}
done

The above bash code snippet will report characteristics for subject FS_T1_501, using the ?h.lausanne.annot files in that subject's label/ directory. The for-loop is a concise way of ensuring that both hemispheres are measured, with the results written to lh.lausanne.stats.txt and rh.lausanne.stats.txt in the subject directory (i.e., in $SUBJECTS_DIR/FS_T1_501/).

Vertex-wise Time Series

The above methods extract the mean time series across all the voxels mapped to the vertices in a given region or set of regions. Some techniques, such as MVPA, however, work at the level of individual vertices. I asked the FreeSurfer mailing list about this and got the following suggestion for using MATLAB to obtain one time series per vertex:

waveforms = fast_vol2mat(MRIread('waveform.nii.gz'));

Now What?

Your time series data can be used to carry out a functional connectivity analysis (the correlational or neural network approach) or machine learning classification.
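As a tiny illustration of the correlational approach, the Pearson correlation between any two region columns of a .wav.txt file can even be computed with awk. This sketch uses a fabricated two-column file (a real analysis would of course use MATLAB or similar):

```shell
# Toy two-column time series (made-up values); a real input would be a .wav.txt
# file with rows = time points and columns = regions.
tmp=$(mktemp -d)
printf '1 2\n2 4\n3 6\n' > "$tmp/toy.wav.txt"
# Pearson correlation between column 1 and column 2.
awk '{ n++; sx+=$1; sy+=$2; sxx+=$1*$1; syy+=$2*$2; sxy+=$1*$2 }
     END { print (n*sxy - sx*sy) / (sqrt(n*sxx - sx*sx) * sqrt(n*syy - sy*sy)) }' \
    "$tmp/toy.wav.txt"
# → 1 (the two toy columns are perfectly correlated)
```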