Functional Connectivity (Cross-Correlation Method)

From CCN Wiki
Jump to navigation Jump to search

If you consider two vectors containing t values representing neural activity over t time points, the degree to which the values in these vectors move together, as indexed by the correlation between these vectors, can be taken as a measure of functional connectivity. The underlying assumption is that the activity in two regions should be related if there is some sort of underlying connection between them.

This page assumes you have already obtained time series matrices using the approaches described here. If these data were obtained in FreeSurfer surface space, there will be 2 matrices (lh, rh) times 'r' runs. As I have not yet worked out the process for voxel-space (i.e., using SPM), details are sketchy, but it seems probable that there would only be a single matrix of values per run. In either case, some data cleaning may be required.

These steps will be carried out in MATLAB. The first step then would be to start up MATLAB in your terminal (the ampersand lets the program run in the background so that you can continue to use your terminal window to do other things if you wish):

cd $SUBJECTS_DIR
matlab &

Now that we're in MATLAB, any code snippets can be interpreted as MATLAB code unless otherwise noted.

The next step is to load your data matrices. Assuming the time series are plaintext files containing only structured numerical data (e.g., as produced by the gettimecourses.sh script), you can use the load function to import these data.

cd (['FS_T1_501' filesep 'bold']');
ls('*.wav.txt');

The first of the above two commands will get you into the bold/ directory for subject FS_T1_501, which is where you would expect to find the time series data. The second command will display the names of all files ending in .wav.txt, which is the convention used by the gettimecourses.sh script. Now that you have identified the relevant files, you can import them individually, assigning them meaningful variable names:

run05_left=load('FS_T1_501_lh_lausanne_005.fmcpr.sm6.self.wav.txt');
run05_right=load('FS_T1_501_rh_lausanne_005.fmcpr.sm6.self.wav.txt');
run06_left=load('FS_T1_501_lh_lausanne_006.fmcpr.sm6.self.wav.txt');
run06_right=load('FS_T1_501_rh_lausanne_006.fmcpr.sm6.self.wav.txt');

If you are using the Lausanne 2008 100-region parcellation, you should find that the lh matrices have 500 columns (regions) and the rh matrices have 502.

At this point, it makes sense to merge the left and right hemisphere data into a single matrix, but continue to keep the data for each run separate. If each row represents a time point and each column represents a region, the matrices for a given run will be time x nregions in size. We want to concatenate the matrices together along the column dimension so that the resulting matrix is time x (nregions + nregions) in size (and not (time+time) x nregions!). Horizontal concatenation (i.e., along the columnar dimension, which is what you want to do here) is easily done in MATLAB:

run_05 = [run05_left run05_right];
run_06 = [run06_left run06_right];

Data Cleaning

Steps undertaken during earlier fMRI data conversion and/or preprocessing may influence the sorts of steps you wish to perform in order to clean up your data:

Drop Initial Volumes

When fMRI data are acquired, there is a short period at the beginning of each run, perhaps 6-12 seconds, during which the magnetic field fluctuates as the gradients are turned on. Measurements taken during this time are likely to contain artifacts of this instability. You can see this by plotting the values from a single region over time:

plot(run_05(:,1));

The above command plots the value from all rows (the ':' entry means 'everything') in the first column (the '1' entry for the column), and would show the activation of region 1 across the entire run.

If we know the TR, in seconds, we can figure out the number of volumes to drop from our matrices in order to eliminate these bad measurements. For example, if our TR is 2 seconds and we want to drop the first 12 seconds of data, we are dropping the first 6 rows of each matrix:

run_05 = run_05(7:end,:);
run_06 = run_06(7:end,:);

The above commands are replacing the run_05 (and _06) matrices with what's left over when you retain rows 7-whatever the last row is. Note that the ':' in the columns selector is saying that you want to retain all the columns. After running these commands, you should find your matrices are 6 elements smaller in the rows dimension but contain the same number of columns.

Drop Regions

The same principle can be applied in the columns dimension if there are regions you wish to omit. For example, the first region in the Lausanne annotation is the corpus callosum (a white matter tract) which would be reasonable to exclude from a connectivity analysis. Note that if you drop regions from your time series, you will need to keep that in mind when labeling the regions in your connectivity matrices. If you plan on dropping regions, it is easiest to do so prior to merging hemispheres. The code snippet below drops the first column (by retaining only columns 2-whatever the last column is) from each of the matrices for run 5. The command for run 6 would be very similar:

run05_left=run05_left(:,2:end);
run05_right=run05_right(:,2:end);

Normalize Data

The values in the activity matrices are aribtrary units: the absolute values are not important. As a result, these values are instead interpreted in terms of % of signal change over time. For example, if a region had a value of 50 at t=0 and a value of 75 at t=1, that would represent a 50% change in signal (which, incidentally, is an unrealistic change, and would probably represent some sort of artifact). In contrast, changing from a value of 1000 to 1025 represents a 2.5% signal change for the same difference in absolute signal units. To make it easier to identify outlier values and ensure all data are on the same scale, we normalize our data using the zscore function.

zrun05=zscore(run05);
zrun06=zscore(run06);

The normalized matrix will have z-scores for each time point, calculated using the mean and standard deviation for each column (region) separately. Thus, a normalized value of 1 in Region A will mean that the activation value was 1 standard deviation above the average for that region.

Censor Data

It may be desirable to exclude extreme positive and negative spikes from the time series for each region, as they may reflect some sort of artifact. Additionally, there may be some other source of information that might be used to identify potentially problematic measurements (e.g., motion correction data, or a list of volumes identified in SPM when using the ArtRepair toolbox).

A common practice when looking at behavioral data (e.g., reaction times) is to use the deviation from the mean as a threshold for identifying extreme values. These values can be either replaced with the cutoff value or else scrubbed from the data entirely. The following code will do either, depending on what is designated as the replacement value:

replacement=NaN; %could instead have used replacement==thresh
thresh=2.5; %could use a more-or-less stringent threshold
zrun05(zrun05>thresh)=replacement;
zrun05(zrun05<-thresh)=-replacement;

Save Your Data

Whatever steps you have chosen to clean up your data, it's probably wise to use MATLAB's save() function to save your scrubbed data (and possibly data along different stages of scrubbing) to a .mat file for future use.

%% for more information on how to use the save() function:
help save   
save('T1_501_timeseries.mat', 'zrun05', 'zrun06');

The default format is MATLAB's native .mat file format.

Calculate Correlation Matrix

The MATLAB corrcoef function will calculate an NxN correlation matrix, where N=number of brain regions in the parcellation (1000 in our case). As mentioned previously, the Pearson correlation is equivalent to the zero-lag cross-correlation between the time series.

[R05,P05]=corrcoef(zrun05, 'rows', 'pairwise');

If you have NaN values in your data, be sure to use the 'pairwise' directive. After you have calculated your correlation matrix (especially if n-lag cross-correlations were iteratively pairwise-calculated), you may wish to append these data to your previously saved data. This will save redoing time-consuming steps should you need to quit and restart at a later time. Appending variables to an existing .mat file is accomplished by using the save() function with the '-append' parameter:

save('T1_501_timeseries.mat', '-append', 'P05', 'R05', 'P06', 'R06');