Downloading CTRC Data

From CCN Wiki

Our fMRI data is typically posted to the CTRC Owncloud website within one or two days after acquisition (possibly longer if the scan was on a Friday). This page describes how to download and organize the data for the first time, and then upload the organized raw data files to the UBFS network directory where any team member can access them. Note that the data stored on UBFS is intended to be copied to your local computer hard drive, where you can mangle it to your heart's content without worrying about affecting anyone else's projects or corrupting our only copy of the data.

New

I should look into this mriqc project. It'll require reorganizing how we store the data.

These directions assume that you have the URL and login/password to access the Owncloud website.

SemCat URL: http://tinyurl.com/blobs-catburgers

LDT URL: http://tinyurl.com/lexitron5000

Identify the Subjects

Internally, we use our own set of Subject IDs, which are generated when a participant enters our experiment. The CTRC does the same thing, meaning that our subject numbers do not line up with those at the CTRC. Moreover, the same participant scanned multiple times will have a new CTRC subject number for each scan. It is imperative that you be able to translate between our subject numbers and those assigned to the fMRI data by the CTRC!

The timestamps for each zipped data set are visible through the Owncloud interface. As a rule of thumb, the most recently acquired data set will also be the one with the most recent timestamp. If there is only one candidate file, it should be fairly easy to identify the files you want. If you need to download older data, or if there are multiple candidates, you can consult the fMRI Log File Google Spreadsheet, which lists the CTRC subject number along with our own internal subject number.
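The lookup described above can be sketched in a few lines. This is only an illustration: it assumes the log spreadsheet has been exported to CSV, and the column names `ctrc_id` and `subject_id`, as well as the example rows, are made up and will need to be changed to match the real spreadsheet.

```python
import csv
import io


def load_id_map(csv_text: str) -> dict:
    """Map each CTRC subject number to our internal subject number.

    Several CTRC numbers may map to the same internal number, because a
    participant gets a new CTRC number for every scan.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    id_map = {}
    for row in reader:
        ctrc = row["ctrc_id"].strip()  # hypothetical column name
        if ctrc in id_map:
            raise ValueError(f"duplicate CTRC id {ctrc!r} in log")
        id_map[ctrc] = row["subject_id"].strip()  # hypothetical column name
    return id_map


# Made-up rows, for illustration only.
example = "ctrc_id,subject_id\nC1001,0183\nC1002,0201\nC1047,0183\n"
id_map = load_id_map(example)
print(id_map["C1047"])
```

Keeping the internal number as a string preserves the leading zero used in our folder names.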

Download the Data

Each fMRI session has two associated zip files. You can ignore the ones labeled study_files, as those are stored in a proprietary data format and are useless to us. Instead, you will want to download the NIFTI files to your local hard drive.

File Organization

Once the download has finished, you can unzip the files, and copy them to a local directory. Give this directory a name corresponding to our local subject number. For example, if the data belongs to our subject 201, create directory 0201, and move the files into the new directory. The CTRC assigns sequential numbers and other file information to each of these files, and so we generally rename them according to our NIFTI file organization and naming convention.
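As a small sketch of the naming step, the helper below zero-pads an internal subject number and creates the matching directory. The four-digit width is inferred from the example above (201 becomes 0201) and is an assumption, as are the function names.

```python
from pathlib import Path


def subject_dir_name(subject_id: int, width: int = 4) -> str:
    """Zero-pad an internal subject number, e.g. 201 -> '0201'."""
    return f"{subject_id:0{width}d}"


def make_subject_dir(root, subject_id: int) -> Path:
    """Create (if needed) and return the local directory for one subject."""
    target = Path(root) / subject_dir_name(subject_id)
    target.mkdir(parents=True, exist_ok=True)
    return target
```

For example, `make_subject_dir("raw", 201)` would create `raw/0201` relative to the current working directory.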

An important step here is to check whether all expected files are present. We are generally expecting to see the following key files:

  1. A file with MPRAGE appearing somewhere in the filename. This is the high-resolution (~1mm voxels) anatomical image. It will likely begin with the prefix 500 or 501, and should be roughly 20MB in size.
    • Create a subdirectory called mri and move this file into it, renaming it to "MPRAGE.nii.gz". Change only the first part of the filename; leave the extension as is.
  2. A set of files with BOLD appearing somewhere in the filename. These files may be prefixed with numbers such as 700, 800, 900, etc.
    • These are the most likely to be problematic because sometimes a run will have to be aborted (e.g., the experiment malfunctioned), meaning there will be some dud files.
    • These files should each be somewhere in the range of 50MB.
    • There should be exactly one of these files for each run that the participant completed. If there are extra files, it is likely that a run was restarted. If there are fewer, this is more troubling: it means we are missing some data, it is ambiguous which runs are affected, and it needs to be followed up on with the CTRC.
    • Our LDT and Semantic Imagery experiments have 6 runs each. If we have 6 full runs, all is good in the hood.
    • If you are confident that the data are complete, go ahead and create a directory called bold. Under that directory, create run directories called 001, 002, etc. (one for each run). Move each of the BOLD files into its run directory, and rename it to f.nii.gz.
    • For some sessions, participants complete a test run. When moving the BOLD files into their corresponding folders, make sure that you are not starting with these test run BOLD files.
  3. The scan session may include a T2 or FLAIR image, which helps identify gray matter. The unzipped files may include a file with FLAIR or 3D_T2 in the name. Expect a FLAIR file to be approximately 20MB, and the 3D_T2 file to be over 100MB in size.
    • If these files exist, copy them to the /mri subdirectory you created in Step 1.
  • All other files not already moved into the /mri or /bold/xxx directories can go into a third subdirectory called /other.
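The sorting steps above could be scripted roughly as follows. This is a sketch under several assumptions, not an established lab tool: the function names are invented, files are classified purely by the substrings mentioned above, BOLD files are ordered by their leading number, there is a single MPRAGE file, and FLAIR/T2 files are moved rather than copied. It deliberately refuses to touch anything when the BOLD count is not the expected one, because aborted runs and test runs have to be weeded out by a person first.

```python
import re
import shutil
from pathlib import Path

EXPECTED_RUNS = 6  # LDT and Semantic Imagery; adjust for other experiments


def nifti_ext(name: str) -> str:
    """Return the file's extension, keeping '.nii.gz' together."""
    return ".nii.gz" if name.lower().endswith(".nii.gz") else Path(name).suffix


def leading_number(name: str) -> int:
    """Sort key: numeric prefix (e.g. 700, 800); unprefixed files sort last."""
    match = re.match(r"\d+", name)
    return int(match.group()) if match else 10**9


def classify(name: str) -> str:
    """Decide which subdirectory a file belongs in, based on its name."""
    upper = name.upper()
    if "MPRAGE" in upper or "FLAIR" in upper or "3D_T2" in upper:
        return "mri"
    if "BOLD" in upper:
        return "bold"
    return "other"


def safe_move(src: Path, dest: Path) -> None:
    """Move src to dest, refusing to overwrite an existing file."""
    if dest.exists():
        raise FileExistsError(f"{dest} already exists")
    dest.parent.mkdir(parents=True, exist_ok=True)
    shutil.move(str(src), str(dest))


def organize(subject_dir, expected_runs: int = EXPECTED_RUNS) -> None:
    """Sort the unzipped files in subject_dir into mri/, bold/NNN/ and other/."""
    subject_dir = Path(subject_dir)
    files = [p for p in subject_dir.iterdir() if p.is_file()]
    bold = sorted((p for p in files if classify(p.name) == "bold"),
                  key=lambda p: leading_number(p.name))
    # Check the run count before moving anything, so a bad session is untouched.
    if len(bold) != expected_runs:
        raise ValueError(
            f"expected {expected_runs} BOLD files, found {len(bold)}; "
            "remove duds or test runs by hand, or follow up with the CTRC")
    for run, src in enumerate(bold, start=1):
        safe_move(src, subject_dir / "bold" / f"{run:03d}" / ("f" + nifti_ext(src.name)))
    for src in files:
        kind = classify(src.name)
        if kind == "bold":
            continue
        if "MPRAGE" in src.name.upper():
            new_name = "MPRAGE" + nifti_ext(src.name)
        else:
            new_name = src.name
        safe_move(src, subject_dir / kind / new_name)
```

Calling `organize(Path("0201"))` on a freshly unzipped session either produces the full layout or raises an error without moving any files.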

Multisession Data

Our Semantic Imagery experiment has two sets of data for some participants. We have been adding the _SESS_1 and _SESS_2 suffixes to the subject numbers for these data, and otherwise treating the data as though they belong to completely different subjects. For example, the raw data folder contains 0183_SESS_1 and 0183_SESS_2.

If a team member wishes to analyze the data for both sessions at once, it will be incumbent on that individual to organize and possibly rename their own local copies of the data. For example, each session of Semantic Imagery should have 6 runs of BOLD files. When analyzing both sessions, as I copy over the SESS_1 data, I rename the MPRAGE file to MPRAGE_1, but otherwise leave the file/folder names intact. Then when I copy over the MPRAGE and the BOLD files for SESS_2, I rename the MPRAGE to MPRAGE_2, and I rename the BOLD folders from 001, 002, ..., 006 to 007, 008, ..., 012, reflecting the fact that the individual has 12 total runs of BOLD data.
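The renumbering described above is simple arithmetic: run r of session s becomes run (s - 1) * 6 + r. A minimal sketch of that local merge follows. The function names and the destination layout are assumptions, and it expects each session folder to already be organized into mri and bold/001 through bold/006.

```python
import shutil
from pathlib import Path


def merged_run_name(session: int, run: int, runs_per_session: int = 6) -> str:
    """Run folder name after merging, e.g. session 2, run 1 -> '007'."""
    return f"{(session - 1) * runs_per_session + run:03d}"


def merge_sessions(raw_root, subject: str, dest, runs_per_session: int = 6) -> None:
    """Copy <subject>_SESS_1 and <subject>_SESS_2 into one local folder."""
    raw_root, dest = Path(raw_root), Path(dest)
    (dest / "mri").mkdir(parents=True, exist_ok=True)
    for session in (1, 2):
        src = raw_root / f"{subject}_SESS_{session}"
        # MPRAGE.nii.gz -> MPRAGE_1.nii.gz or MPRAGE_2.nii.gz
        for mprage in (src / "mri").glob("MPRAGE.*"):
            new_name = mprage.name.replace("MPRAGE", f"MPRAGE_{session}", 1)
            shutil.copy2(mprage, dest / "mri" / new_name)
        # bold/001..006 stay as-is for session 1 and become 007..012 for session 2
        for run_dir in sorted(p for p in (src / "bold").iterdir() if p.is_dir()):
            new_run = merged_run_name(session, int(run_dir.name), runs_per_session)
            # copytree raises if the target already exists, so nothing is overwritten
            shutil.copytree(run_dir, dest / "bold" / new_run)
```

Because this only copies, the archived session folders on UBFS are left untouched.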

Note also that preprocessing multisession data requires a bit of extra work to make sure that the files are all coregistered to the same space. This has nothing to do with how the files are organized; it's just mentioned here to put it on your radar.

File Archiving

At this point, the raw data have been organized according to the directory structure below:

  • Subject Number
    • mri
    • bold
      • 001
      • 002
      • etc.
    • other
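Before uploading, the layout can be sanity-checked with something like the sketch below. It is illustrative only: the function name is invented, the default of 6 runs matches the LDT and Semantic Imagery experiments, and the other directory is not checked because it may legitimately be absent when there are no leftover files.

```python
from pathlib import Path


def check_layout(subject_dir, expected_runs: int = 6) -> list:
    """Return a list of human-readable problems; an empty list means OK."""
    subject_dir = Path(subject_dir)
    problems = []
    mri = subject_dir / "mri"
    if not mri.is_dir() or not any(mri.glob("MPRAGE.*")):
        problems.append("mri/MPRAGE file is missing")
    for run in range(1, expected_runs + 1):
        run_dir = subject_dir / "bold" / f"{run:03d}"
        if not run_dir.is_dir() or not any(run_dir.glob("f.nii*")):
            problems.append(f"bold/{run:03d} has no f.nii file")
    return problems
```

A typical use would be to print the returned list and hold off on the upload until it is empty.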

If this is correct, you can now upload the data to the raw data archive on ubfs/openfmri/ that corresponds to the project that this data belongs to. Make sure that your folder name matches the format of the existing folders in the directory you are moving the data to.

Runtime Data

This is a good occasion to make sure that the MATLAB runtime files for this particular session have also been taken off the Experimenter laptop and placed in the project directory on UBFS.