Downloading CRTC Data: Difference between revisions

From CCN Wiki

Revision as of 15:31, 29 January 2018

Our fMRI data are typically posted to the CTRC Owncloud website within one or two days after acquisition (possibly longer if the scan was on a Friday). These directions assume that you have the URL and login/password to access the Owncloud website.

Identify the Subjects

Internally, we use our own set of Subject IDs, which are generated when a participant enters our experiment. The CTRC does the same thing, meaning that our subject numbers do not line up with those at the CTRC. Moreover, the same participant scanned multiple times will have a new CTRC subject number for each scan. It is imperative that you be able to translate between our subject numbers and those assigned to the fMRI data by the CTRC!

The timestamps for each zipped data set are visible through the Owncloud interface. As a rule of thumb, the most recently acquired data set will also be the one with the most recent timestamp. If there is only one candidate file, it should be easy to identify the one you want. If you need to download older data, or if there are multiple candidates, consult the fMRI Log File Google Spreadsheet, which lists the CTRC subject number along with our own internal subject number.

Download the Data

Each fMRI session has two associated zip files. You can ignore the one labeled study_files, as it is stored in a proprietary data format and is useless to us. Instead, you will want to download the NIFTI files to your local hard drive.

File Organization

Once the download has finished, unzip the files and copy them to a local directory. Give this directory a name corresponding to our local subject number. For example, if the data belong to our subject 201, create directory 0201 and move the files into it. The CTRC assigns sequential numbers and other file information to each of these files, so we generally rename them according to our NIFTI file organization and naming convention.

An important step here is to check whether all expected files are present. We are generally expecting to see the following key files:

  1. A file with MPRAGE appearing somewhere in the filename. This is the high-resolution (~1mm voxels) anatomical image. It will likely begin with the prefix 500 or 501, and should be roughly 20MB in size.
    • Create a subdirectory called mri and move this file to the new subfolder, renaming it to MPRAGE.nii
  2. A set of files with BOLD appearing somewhere in the filename. These files may be prefixed with numbers such as 700, 800, 900, etc.
    • These are the most likely to be problematic because sometimes a run will have to be aborted (e.g., the experiment malfunctioned), meaning there will be some dud files.
    • These files should each be roughly 50MB in size.
    • There should be exactly one of these files for each run that the participant completed. If there are extra files, it is likely that a run was restarted. If there are fewer, this is more troubling: it means we are missing some data, it is ambiguous which runs are missing, and it needs to be followed up on with the CTRC.
    • Our LDT and Semantic Imagery experiments each have 6 runs. If we have 6 full runs, all is good in the hood.
    • If you are confident that the data are complete, go ahead and create a directory called bold. Under that directory, create run directories called 001, 002, etc. (one for each run). Move each of the BOLD files into its run directory, and rename it to f.nii.gz
  3. The scan session may include a T2 or FLAIR image, which helps identify gray matter. The unzipped files may include a file with FLAIR or 3D_T2 in the name. Expect a FLAIR file to be approximately 20MB, and the 3D_T2 file to be over 100MB in size.
    • If these files exist, copy them to the /mri subdirectory you created in Step 1
  • All other files not already moved into the /mri or /bold/xxx directories can go into a third subdirectory called /other
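
The sorting steps above can be sketched in Python. This is a minimal sketch, not a lab script: the function names, the example filenames, and the assumption that the numeric prefix on each BOLD file reflects run order are all hypothetical. It only builds a plan of where each file would go (using the target names given in the steps above) and moves nothing, so the plan can be checked by eye first.

```python
import re


def leading_number(name):
    """Return the numeric prefix of a filename, or -1 if there is none."""
    match = re.match(r"\d+", name)
    return int(match.group()) if match else -1


def plan_layout(filenames, expected_runs=6):
    """Map each unzipped filename to a destination relative to the subject directory.

    Assumption: sorting BOLD files by numeric prefix gives run order.
    Nothing on disk is touched; this only returns the plan.
    """
    mprage = [f for f in filenames if "MPRAGE" in f]
    if len(mprage) != 1:
        raise ValueError(f"expected 1 MPRAGE file, found {len(mprage)}")

    bold = sorted((f for f in filenames if "BOLD" in f), key=leading_number)
    if len(bold) != expected_runs:
        # Extra files suggest a restarted run; fewer means missing data.
        raise ValueError(
            f"expected {expected_runs} BOLD files, found {len(bold)}; sort these out by hand"
        )

    plan = {mprage[0]: "mri/MPRAGE.nii"}
    for run, name in enumerate(bold, start=1):
        plan[name] = f"bold/{run:03d}/f.nii.gz"

    for name in filenames:
        if name in plan:
            continue
        if "FLAIR" in name or "3D_T2" in name:
            plan[name] = f"mri/{name}"    # optional anatomical images keep their names
        else:
            plan[name] = f"other/{name}"  # everything else
    return plan
```

Because the function raises an error whenever the BOLD count does not match the expected number of runs, aborted or missing runs still have to be resolved manually before any files are moved.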

Multisession Data

Our Semantic Imagery experiment has two sets of data for some participants. We have been adding the _SESS_1 and _SESS_2 suffixes to the subject numbers for these data, and otherwise treating the data as though they belong to completely different subjects. If a team member wishes to analyze the data for both sessions at once, it is up to that individual to organize and possibly rename their own local copies of the data. For example, each session of Semantic Imagery should have 6 runs of BOLD files. When analyzing both sessions, as I copy over the SESS_1 data, I rename the MPRAGE file to MPRAGE_1, but otherwise leave the file/folder names intact. Then when I copy over the MPRAGE and the BOLD files for SESS_2, I rename the MPRAGE to MPRAGE_2, and I rename the BOLD folders from 001, 002, ..., 006 to 007, 008, ..., 012, reflecting the fact that the individual has 12 total runs of BOLD data.

Note also that preprocessing multisession data requires a bit of extra work to make sure that the files are all coregistered to the same space. This has nothing to do with how the files are organized; it's just mentioned here to put it on your radar.
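
The renumbering of the SESS_2 run folders described above amounts to adding 6 to each run number. A small Python sketch of that mapping (the function name is hypothetical, and it only computes the new names rather than renaming anything):

```python
def renumber_runs(run_dirs, offset):
    """Return {old_name: new_name}, shifting each zero-padded run number by offset."""
    return {name: f"{int(name) + offset:03d}" for name in run_dirs}


# SESS_2 folders 001..006 become 007..012 when merged with the SESS_1 runs
session2 = [f"{run:03d}" for run in range(1, 7)]
mapping = renumber_runs(session2, offset=6)
```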

File Archiving

At this point, the raw data have been organized according to the directory structure below:

  • Subject Number
    • mri
    • bold
      • 001
      • 002
      • etc.
    • other

If this is correct, you can now upload the data to the raw data archive on ubfs/openfmri/ for the project that these data belong to.
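
Before uploading, the layout can be checked mechanically. The following Python sketch is one way to do it, assuming a single-session subject, the file names from the File Organization steps (MPRAGE.nii and f.nii.gz), and a hypothetical function name; it does not check the other directory, and it does not look at file sizes or contents.

```python
from pathlib import Path


def check_layout(subject_dir, expected_runs=6):
    """Return a list of problems; an empty list means the expected files are in place."""
    root = Path(subject_dir)
    problems = []

    for sub in ("mri", "bold"):
        if not (root / sub).is_dir():
            problems.append(f"missing directory: {sub}")

    if not (root / "mri" / "MPRAGE.nii").is_file():
        problems.append("missing file: mri/MPRAGE.nii")

    # One run directory per run, each holding a single f.nii.gz
    for run in range(1, expected_runs + 1):
        if not (root / "bold" / f"{run:03d}" / "f.nii.gz").is_file():
            problems.append(f"missing file: bold/{run:03d}/f.nii.gz")

    return problems
```

For example, `check_layout("0201")` would return an empty list for a complete 6-run subject stored in a local directory named 0201.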

Runtime Data

This is a good occasion to make sure that the MATLAB runtime files for this particular session have been taken off the Experimenter laptop and copied to the project directory on UBFS.