Par-Files


Paradigm files, or "par-files", are tab-delimited plain-text files that tell FreeSurfer's FS-FAST program when each experimental condition occurred within a functional run. They have a very straightforward structure, with one line per event or block. Depending on the source, the files are described as having between two and four columns, and the documentation has changed across FreeSurfer releases. What seems to work for our release is the following column structure:

Cumulative_Onset     Condition_Number_Code     Duration     Weight

The cumulative onset is specified in seconds and assumes that the first volume onset is at t=0 seconds. If initial volumes have been truncated from the 4D time series, adjust the onsets accordingly, depending on your TR. For example, if your TR is 2 seconds, and you drop the first 4 volumes of a 100-volume series, your resulting data will contain only 96 volumes, and the first event will appear 4*2=8 seconds earlier in the series.
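The adjustment described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not part of FS-FAST; the function name shift_onsets and its parameters are made up for this example.

```python
def shift_onsets(onsets, tr, volumes_dropped):
    """Shift onsets (in seconds) earlier by the time span of the dropped volumes."""
    offset = tr * volumes_dropped
    # Round to milliseconds to avoid floating-point noise in the output
    return [round(t - offset, 3) for t in onsets]


# TR = 2 s and 4 dropped volumes: every onset moves 8 s earlier
adjusted = shift_onsets([12.0, 16.813, 21.626], tr=2.0, volumes_dropped=4)
print(adjusted)  # [4.0, 8.813, 13.626]
```

Any event that began during the dropped volumes ends up with a negative onset, so it is worth checking for those after the shift.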

The following is a snippet from a par file for an experiment with a fixed schedule (i.e., all participants had the same experimental timings, rather than a self-paced experiment):

12.000     6     2     1
16.813     3     2     1
21.626     3     2     1
26.038     3     2     1
30.048     3     2     1
34.059     3     2     1
38.470     6     2     1
42.481     4     2     1
46.492     4     2     1
51.305     1     2     1

Here, the experiment starts with the first event happening 12 seconds into the run. It is event condition 6, and the duration of this trial (and all others) is 2 seconds. All trials are given a weight of 1 in the analysis (weights of 0 might be assigned if, for example, you want to discount activity associated with error trials).
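To make the column layout concrete, here is a minimal Python sketch that reads par-file text into named fields. It assumes the four-column layout shown above; the names Event and parse_par are hypothetical and exist only for this example.

```python
from typing import NamedTuple


class Event(NamedTuple):
    onset: float      # cumulative onset in seconds
    condition: int    # condition number code
    duration: float   # duration in seconds
    weight: float     # weight given to the event in the analysis


def parse_par(text):
    """Parse whitespace-delimited par-file text into a list of Event tuples."""
    events = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        onset, condition, duration, weight = fields[:4]
        events.append(Event(float(onset), int(condition), float(duration), float(weight)))
    return events


sample = "12.000\t6\t2\t1\n16.813\t3\t2\t1\n21.626\t3\t2\t1"
events = parse_par(sample)
print(events[0])  # Event(onset=12.0, condition=6, duration=2.0, weight=1.0)
```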

An important thing to note: all .par files associated with a particular analysis must have the same filename. This is possible because each .par file is stored in the same directory as the fMRI data that it corresponds to, and the fMRI data for each run is kept in its own directory. For example, you might have the following directory structure:

  • FS_T1_501
    • bold
      • 001
        • f.nii
        • booth.par
      • 002
        • f.nii
        • booth.par

See the note further down about renaming par-files, but THE NAME OF YOUR PAR FILE WILL NOT ALWAYS BE 'BOOTH.PAR'!
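A quick way to confirm that every run directory holds a par-file with the expected name is sketched below in Python. It assumes that run directories have all-digit names (001, 002, ...), as in the tree above; the function name runs_missing_par is made up for this example.

```python
from pathlib import Path


def runs_missing_par(bold_dir, par_name):
    """Return the names of run directories under bold_dir that lack par_name."""
    missing = []
    for run_dir in sorted(Path(bold_dir).iterdir()):
        # Run directories are assumed to have all-digit names (001, 002, ...)
        if run_dir.is_dir() and run_dir.name.isdigit():
            if not (run_dir / par_name).is_file():
                missing.append(run_dir.name)
    return missing
```

For the tree above, calling runs_missing_par with the path to FS_T1_501/bold and the name 'booth.par' should return an empty list.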

When analyzing a fixed-schedule experiment, you should be able to simply copy the appropriate pre-existing par-file for that run from a source directory. For example, the par-files associated with the Booth data can be found in the par_files directory, in subdirectories corresponding with their Reading_Experiment_IDs.

When analyzing data from a self-paced experiment, each participant will have a unique par-file that will have to be generated from his/her MATLAB experimental data file.

Generating Par Files from our own Self-Paced Experiments

A MATLAB function, FSTSExtractor, has been written to extract the schedule of events from the .mat files generated by our Psychtoolbox experiment scripts. Note that the structure of these .mat files is particular to our scripts, so you shouldn't expect this procedure to work on experiment data files generated by any other means. However, the MATLAB code for this function is provided below and can be modified to accommodate other input data.

Note: If you are working with the SemCat experiment, you should be using FSSEMExtractor.m.

FSTSExtractor.m

function stamps=FSTSExtractor(varargin)
%% function stamps=FSTSExtractor(varargin)
% Extract all the timestamps associated with each condition stored in a .mat file.
% These values are saved in a .par file (Run_RUNNUMBER.par), but also returned as a matrix called stamps (which can be ignored).
% It is assumed that the .mat files are named
% PREFIX_Sub_SUBJECTNUMBER_Run_RUNNUMBER_DD_MMM_YYYY.mat
%
% Mandatory arguments:
%  subject: a string or cell array of strings
%      run: an integer or array of integers
%     date: a serial date number representing a particular date as an
%           integer. This can be created calling the datenum() function. E.g.
%           datenum('19-Nov-2015')
%       tr: scan interval in seconds (e.g., 2.047)
% volumes_dropped: number of initial volumes dropped
%
% Optional arguments:
%     prefix: a string, such as 'LDT_' that begins your .mat files (default:
%             empty string)
%  directory: a string, such as '/home/experimenter/fmri/data' that contains 
%             your .mat files (default: current working directory) 
%
% Sample usage:
% >> d=datenum('19-Nov-2015');
% >> FSTSExtractor('prefix', 'LDT_', 'subject', 1000, 'run', 1, 'TR', 2.047,'volumes_dropped', 5, 'date', d)
% >> FSTSExtractor('prefix', 'LDT_', 'subject', 1001, 'run', 1, 'TR', 2.047,'volumes_dropped', 5, 'date', d, ...
%    'directory', '/data/subjects/1001/')
% define defaults at the beginning of the code so that you do not need to
% scroll way down in case you want to change something or if the help is
% incomplete
options = struct('prefix', '', 'subject', '', 'run', 0, 'tr', 0, 'volumes_dropped', 0, 'date', datenum(date()), 'directory', pwd());
% read the acceptable names
optionNames = fieldnames(options);
% count arguments
nArgs = length(varargin);
if round(nArgs/2)~=nArgs/2
   error('FSTSExtractor needs propertyName/propertyValue pairs')
end
for pair = reshape(varargin,2,[]) % pair is {propName;propValue}
   inpName = lower(pair{1}); % make case insensitive    
   if any(strcmp(inpName,optionNames))
       % overwrite options. If you want you can test for the right class here
       % Also, if you find out that there is an option you keep getting wrong,
       % you can use "if strcmp(inpName,'problemOption'),testMore,end"-statements
       options.(inpName) = pair{2};
   else
       error('%s is not a recognized parameter name',inpName)
   end
end
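% Build the .mat file name. Note that datestr() renders a date-only serial date
% number as dd-mmm-yyyy (e.g., 19-Nov-2015).
filename = ([options.prefix 'Sub_' num2str(options.subject) '_Run_' num2str(options.run) '_' datestr(options.date) '.mat']);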
s = load([options.directory filesep filename]);
writefile=sprintf('Run_%d.par', options.run);
tempts=[s.expinfo.data.timestamp];
tempcn=[s.expinfo.data.conditon];
tempdn=[s.expinfo.data.rt];
tempwt=ones(1,length(tempcn));
% Make sure the condition numbers start at 1. FS-FAST reserves condition 0 for
% the null (baseline/fixation) condition, so if any condition is labeled 0,
% shift every condition code up by one.
if min(tempcn) < 1
    tempcn = tempcn + 1;
end
%Correct onset times for number of volumes dropped
timecorr = options.tr * options.volumes_dropped;
tempts=tempts - timecorr;
%write
parmatrix=[tempts' double(tempcn') tempdn' tempwt'];
stamps=parmatrix; % return value promised in the function signature
dlmwrite(writefile, parmatrix, '\t')
end

Using FSTSExtractor

This function is supposed to be dead-easy to use to create .par files for FreeSurfer. Just follow these three steps:

  1. Copy your .mat files from the laptop
  2. Run FSTSExtractor in MATLAB
  3. Rename your .par files and copy them to the appropriate run directories

Copy Experiment .mat Files

If you have not already done so, copy the relevant .mat experiment files from the experiment laptop. They should be in the /home/experimenter/Documents/MATLAB/Experiments/experimentname/ folder, and follow the naming format convention PREFIX_Sub_SUBJECTNUMBER_Run_RUNNUMBER_DD_MMM_YYYY.mat (e.g., LDT_Sub_1001_Run_1_09_MAY_2016.mat). Place these files in a directory called 'Onsets' within the subject's bold folder, so that the directory tree for that participant looks something like this:

  • $SUBJECTS_DIR/
    • SUBJECT_ID/
      • bold/
        • 001/
        • 002/
        • ...
        • 006/
        • Onsets/
          • TestExperiment_Sub_SUBJECT_ID_Run_1_09_MAY_2016.mat
          • TestExperiment_Sub_SUBJECT_ID_Run_2_09_MAY_2016.mat
          • ...
          • TestExperiment_Sub_SUBJECT_ID_Run_6_09_MAY_2016.mat

There should be one .mat file for each run directory (i.e., if your runs are numbered 011 to 016, then there are 6 run directories, and therefore you should have 6 .mat files). If these numbers don't match up, something is very wrong, so stop what you're doing and solve that problem first.
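If you would rather not count by eye, the check can be sketched in Python as below. It assumes that run directories have all-digit names and that the .mat files sit in bold/Onsets/, as in the tree above; the function name count_runs_and_mats is made up for this example.

```python
from pathlib import Path


def count_runs_and_mats(bold_dir):
    """Return (number of run directories, number of .mat files in Onsets/)."""
    bold = Path(bold_dir)
    # Run directories are assumed to have all-digit names such as 001 or 011
    run_dirs = [p for p in bold.iterdir() if p.is_dir() and p.name.isdigit()]
    mat_files = list((bold / "Onsets").glob("*.mat"))
    return len(run_dirs), len(mat_files)
```

Call it with the path to the subject's bold folder; if the two numbers it returns differ, stop and sort that out first.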

Run FSTSExtractor.m in MATLAB

This function is a MATLAB utility, and so it will need to be in your MATLAB path, and you will have to run it in MATLAB. Navigate to the Onsets/ directory you created and launch MATLAB from your terminal window:

matlab &

The rest of the instructions in this section are carried out in the MATLAB console.

Check to ensure that FSTSExtractor is in your path:

help FSTSExtractor

You should see a bunch of explanatory help text that includes sample usage. If you get an error message saying FSTSExtractor not found, then this function is not in your path. Fix that problem (e.g., use addpath to add the folder containing FSTSExtractor.m) before continuing.

Assuming the call for help displayed the program help for FSTSExtractor, use the sample usage as a model for applying it to your new .mat files. When you have run it, you should find you have created a series of Run_??.par files.

Rename and Copy .par files

When you run your GLM, you will specify a single name for the .par file to be used. Thus, all your .par files will have to be assigned the same name. Move each of the newly created Run_??.par files to the corresponding bold/0??/ run directory (e.g., Run_11.par should be moved to the bold/011/ directory). As each file is moved over, rename it to your chosen name. If .par files have already been created for other participants, use whatever name was chosen for those data.
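The move-and-rename step can also be scripted. The Python sketch below assumes that run number N maps to a three-digit, zero-padded run directory (Run_11.par goes to bold/011/), as in the example above; the function name distribute_par_files is made up for this example.

```python
import re
import shutil
from pathlib import Path


def distribute_par_files(onsets_dir, bold_dir, par_name):
    """Move each Run_N.par in onsets_dir to bold_dir/<NNN>/<par_name>."""
    moved = []
    for par in sorted(Path(onsets_dir).glob("Run_*.par")):
        match = re.fullmatch(r"Run_(\d+)\.par", par.name)
        if match is None:
            continue  # not named like a file written by FSTSExtractor
        # Assumption: run N lives in a zero-padded, three-digit directory
        run_dir = Path(bold_dir) / f"{int(match.group(1)):03d}"
        if not run_dir.is_dir():
            raise FileNotFoundError(f"no run directory {run_dir} for {par.name}")
        target = run_dir / par_name
        shutil.move(str(par), str(target))
        moved.append(target)
    return moved
```

For the LDT data you would pass 'LDT.par' as par_name. The function refuses to guess when a run directory is missing, since that usually means the run numbering is not what you expected.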

IMPORTANT: The examples above use 'booth.par' as a sample file name. Do not just blindly name everything 'booth.par'! The name of the par-file should be meaningfully related to the experiment it is for!

For the reading development study, collected in the James Booth lab, we use booth.par as a file name convention. For our Lexical Decision Task (LDT), we have been using LDT.par as a file name convention. Using a consistent and appropriate file naming convention will help avoid confusion and ambiguity.

Generating PAR Files for Archival Data

We often work with archival data from other experiments that naturally store their experiment meta-data in other formats. The procedures for generating par files for these data can vary wildly, so I will document them by project.

ABCD Study

The ABCD Study includes the run-time data (responses, event timings, etc.) in the same .tgz tarball that contains the raw fMRI DICOM data. Unpacking the tarballs for a participant will produce a complete directory structure for that participant named sub-<SUBJECTID>. Under the subject folder will be a sub-folder for each of the experiment time points (Baseline, 2-year follow-up, etc.). For each time point there is an anat/ subdirectory and a func/ subdirectory. The run-time data for both functional runs can be found in the sub-<SUBJECTID>/<timepoint>/func/*-EventRelatedInformation.txt files. Note there are 2 such files, but these files are identical. Trying to figure out how to parse these files is a nightmare.

First: Download abcd_extract_eprime

Fortunately someone with knowledge of the file data did the hard work for us. You will need to grab some MATLAB scripts from a GitHub repository: [1]. From the repository, you can download a .zip file containing all the relevant MATLAB files. Unzip the archive and put all the files into a folder on your MATLAB path (e.g., ~/Documents/MATLAB/). This is important because these scripts need to be in your path in order for them to be visible when we execute them from the command line using a script I have written. These scripts are in a script archive called ABCD_SCRIPTS.tgz that I will be making available on /ubfs/caset/cpmcnorg/Scripts/. Unpack the archive and run the installer script (see below).

Second: Grab my Python and BASH Scripts

I have written a shell script (abcd_parfiles.sh) and a Python script (abcd_to_par.py) to handle all the logic in the event timing parsing. Different datasets acquired with different scanners require different parameters when using the above MATLAB scripts. The shell script calls the Python script, which in turn calls the MATLAB scripts. I have copied these scripts to /opt/scripts, which should be in your $PATH, meaning that you should be able to use these scripts without doing anything special. To check, type the following in the command-line terminal:

which abcd_to_par.py

You should then see:

/opt/scripts/abcd_to_par.py

If, for whatever reason, this is not the case, read on ...

Running the Installer Script

You can find the script archive in /ubfs/caset/cpmcnorg/Scripts/ABCD_SCRIPTS.tgz. Grab the ABCD_SCRIPTS.tgz file, unpack it, and run the installer script I wrote.

cp /ubfs/caset/cpmcnorg/Scripts/ABCD_SCRIPTS.tgz ./
tar -xzvf ABCD_SCRIPTS.tgz
cd ABCD_SCRIPTS
chmod u+x ./install
chmod u+x *.*
./install

This script will move the MATLAB scripts to ~/Documents/MATLAB (and barf if that directory does not exist), and move the Python and Bash scripts to ~/bin (creating ~/bin if necessary). Note there is a minor error in the installer script: it reloads the .bashrc file, but that won't update your path. Instead, after you run the installer, you need to reload your .profile file:

source ~/.profile

Now your $PATH will be updated to include ~/bin and you will be able to use any scripts located therein.

Using the ABCD scripts to generate .par files

If all scripts are in place, .par file generation is quite easy: Assuming you have a set of unpacked ABCD archive data, your project directory will contain a bunch of subject directories:

  • ProjectDir/
    • sub-NDARINV17K4X0WD/
    • sub-NDARINV4JVM9WML/
    • sub-NDARINVJKBDRF1B/

If you have previously used helper scripts to convert subject DICOM files to NIfTI files, there will also be a NII directory:

  • ProjectDir/
    • NII/
    • sub-NDARINV17K4X0WD/
    • sub-NDARINV4JVM9WML/
    • sub-NDARINVJKBDRF1B/

The process is driven by the abcd_parfiles.sh script, which requires a subject ID as a parameter. The subject ID is the NDARINV* part of the subject folder names. For example:

abcd_parfiles.sh NDARINV17K4X0WD

This launches MATLAB in the terminal window to run the abcd_extract_eprime scripts. Python is used to read in the output files, and then the shell script handles the file organization and folder creation, if necessary. After the script has completed, you will find the .par files in func/<run>/nback.par:

  • ProjectDir/
    • NII/
      • NDARINV17K4X0WD/
        • func/
          • 001/
            • nback.par
          • 002/
            • nback.par
          • 003/
            • nback.par
          • 004/
            • nback.par

If you have previously converted DICOM to NIfTI files, the run directories will already exist, and you will additionally find the f.nii files alongside their respective nback.par files.

Aggregating PAR Files

FreeSurfer requires the .par files for each functional run to be stored in the run directory alongside the functional data with which they are associated. There may be occasions where you will want to collect all the .par files together. You can use the code below to create a shell script that will aggregate copies of all the par files found in each participant's directories. Because all .par files for analysis must have the same name, we can't simply copy the .par files, or they will overwrite each other. For this reason, the code below will rename the copies to 001.par, 002.par, etc.

collectparfiles.sh
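#!/bin/bash
# Usage: collectparfiles.sh SUBJECT_ID [SUBJECT_ID ...]
# Copies every .par file found under each subject directory into the top of
# that subject directory, renamed 001.par, 002.par, ... in sorted path order.

for sub in "$@"; do
  source_dir="${SUBJECTS_DIR}/${sub}"
  CTR=0
  # -mindepth 2 skips .par files already sitting in the subject directory
  # itself (e.g., copies left behind by an earlier run of this script)
  readarray -t runs < <(find "${source_dir}" -mindepth 2 -name "*.par" | sort)
  for r in "${runs[@]}"; do
    ((CTR+=1))
    FPREFIX=$(printf "%03d" "$CTR")
    cp "$r" "${source_dir}/${FPREFIX}.par"
  done
done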

You would run this script from within $SUBJECTS_DIR, so that the FS_* glob expands to the subject directory names:

collectparfiles.sh FS_*