BASH Tricks
How many lines in my text file?
Totally useful when you have some kind of training file with many rows and columns:
FILENAME=myfile.csv
wc -l < "${FILENAME}"
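Since the training files in question have columns as well as rows, it can be handy to report both at once. A minimal sketch, assuming a comma-separated file with no quoted commas inside fields; the file contents here are made up so the snippet can run on its own:

```shell
# Build a small throwaway CSV so the sketch is self-contained (hypothetical data).
tmpdir=$(mktemp -d)
FILENAME="${tmpdir}/myfile.csv"
printf 'id,x,y\n1,0.5,0.7\n2,0.1,0.9\n' > "${FILENAME}"

# Rows: wc -l counts newline characters; reading from stdin suppresses the file name.
rows=$(wc -l < "${FILENAME}")

# Columns: split the first line on commas and report the number of fields.
cols=$(head -n 1 "${FILENAME}" | awk -F',' '{ print NF }')

# $((rows)) strips the padding some wc implementations add.
echo "rows: $((rows)) columns: ${cols}"
rm -rf "${tmpdir}"
```

Note that wc -l counts the header line too, so subtract one if you only want data rows.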
I want to drop the first line of my text file
tail echoes the last n lines (default: 10) of a text file to stdout. Prefixing the count with a plus sign flips it around: -n +K echoes everything from line K onward. So -n +2 echoes the file starting at the 2nd line (i.e., dropping the first line). We write this to a temp file (redirecting straight back onto the input file would truncate it before tail could read it, leaving an empty file), and then rename:
tail -n +2 "$FILE" > "$FILE.tmp" && mv "$FILE.tmp" "$FILE"
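PROTIP: The mirror-image trick works with head: with GNU head, -n -N echoes everything except the last N lines of a file. This is a GNU extension, so it may not work with the BSD head found on other systems.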
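Both directions are easy to check on a throwaway file before pointing them at real data. A minimal sketch, assuming a made-up four-line file; it uses sed '$d' to drop the last line, as a portable stand-in for the GNU-only head -n -1:

```shell
tmpdir=$(mktemp -d)
FILE="${tmpdir}/demo.txt"
printf 'header\nrow1\nrow2\nfooter\n' > "${FILE}"

# tail -n +2 starts output at line 2, i.e. drops the first line.
no_header=$(tail -n +2 "${FILE}")

# sed '$d' deletes the last line; a portable alternative to GNU `head -n -1`.
no_footer=$(sed '$d' "${FILE}")

echo "${no_header}"
echo "${no_footer}"
rm -rf "${tmpdir}"
```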
Make a list of directory names
We often organize subject data so that each subject gets their own directory. FreeSurfer uses a subjects file when batch processing. Rather than manually type out each folder name into a text file, it can be generated in one line of code:
ls -1 -d */ | sed "s,/$,," > subjects
This lists all the directories in one column (-1 -d) and uses sed to snip off the trailing forward slash from each directory name.
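If any folder names might contain spaces or other odd characters, a glob-based loop avoids parsing the output of ls. A minimal sketch, assuming hypothetical subject folders created in a scratch directory:

```shell
tmpdir=$(mktemp -d)
cd "${tmpdir}" || exit 1
mkdir FS_sub-001 FS_sub-002 "FS sub 003"   # hypothetical subject folders
touch notes.txt                             # plain files are ignored by the */ glob

# The */ glob expands to directories only; ${d%/} strips the trailing slash.
for d in */; do
    printf '%s\n' "${d%/}"
done > subjects

cat subjects
```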
Making a tar archive containing only the minimal fileset for FreeSurfer
FreeSurfer makes a zillion files. I have no idea what most of them are there for. Which do we need? The absolute minimal list is a work in progress, but I have made a file called fsaverage.required (copied to the ubfs scripts directory) based on the contents of the fsaverage template subject directory. I dropped the obvious directories (e.g., mri 2mm), so what's left should hopefully be close to the minimal required set for getting things done with a subject. The idea is to reduce the number of superfluous files that you store or copy over the network, so that we don't waste as much time and disk space on useless nonsense.
So here's what you do:
- Copy fsaverage.required to $SUBJECTS_DIR
- Inspect fsaverage.required to make sure that it includes any idiosyncratic files that you might wish to keep
- e.g., the original version only includes f.nii.gz files. If you want to also grab all your preprocessed .mgz files, then you'll want to add a pattern for them up at the top. Note that grep reads the entries as basic regular expressions, not shell globs, so use a pattern such as \.mgz$ rather than *.mgz. Save any changes.
- Navigate to a subject directory:
cd FS_sub-001
- The following command will use the files listed in $SUBJECTS_DIR/fsaverage.required to find and archive the desired files for this subject:
tar -czvf sub-001.minimal.tgz `find . | grep -G -f ${SUBJECTS_DIR}/fsaverage.required`
- When you're done, you'll have the bare-bones minimum files to permit FS-FAST analyses of your BOLD data for your subject.
- You can copy the .tgz files to an external drive or over the network. Be sure to unpack the .tgz archive in an empty subject directory
- e.g.:
mkdir ~/new_project/ #starting a new project directory - in this case on the same computer, but it could be anywhere
cd ~/new_project #enter the new project directory
mkdir FS_sub-001 #making an empty subject directory for the files we're about to unpack
cd FS_sub-001 #navigate into the new empty subject directory
#next line copies the minimal file archive from the source directory into the new empty subject directory
cp ${SUBJECTS_DIR}/FS_sub-001/sub-001.minimal.tgz ./
#next line unzips the file archive into the empty directory
tar -xzvf sub-001.minimal.tgz
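Before building an archive for real, it can help to preview which files the pattern list actually selects. A minimal sketch of that dry run, assuming a made-up miniature subject directory and a one-line pattern file standing in for fsaverage.required (the real file's contents are not reproduced here):

```shell
tmpdir=$(mktemp -d)
cd "${tmpdir}" || exit 1

# Hypothetical miniature subject directory and pattern list.
mkdir -p FS_sub-001/bold/001 FS_sub-001/scratch
touch FS_sub-001/bold/001/f.nii.gz FS_sub-001/scratch/junk.tmp
printf '%s\n' 'f\.nii\.gz$' > fsaverage.required

cd FS_sub-001 || exit 1

# Dry run: list (and count) what grep would select, without creating an archive.
find . -type f | grep -G -f ../fsaverage.required > ../to_archive.txt
echo "files selected: $(($(wc -l < ../to_archive.txt)))"

# Archive exactly that list; -T reads the file names from the given file.
tar -czf sub-001.minimal.tgz -T ../to_archive.txt

# Confirm the archive contents.
tar -tzf sub-001.minimal.tgz
```

Keeping the list in a file also leaves a record of what went into each archive.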
Make a series of numbered directories
FreeSurfer BOLD data goes in a series of directories, numbered 001, 002, ... , 0nn. A one-liner of code to create these directories in the command line:
for i in $(seq -f "%03g" 1 6); do mkdir ${i}; done #this will create directories 001 to 006. Obviously, if you need more directories, change the second value from 6 to something else
Protip: If you want to also make the runs file that some of our scripts use at the same time, the above snippet can be modified:
for i in $(seq -f "%03g" 1 6); do mkdir ${i}; echo ${i} >> runs; done
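One thing to watch with the one-liner is that running it twice appends duplicate entries to runs and makes mkdir complain about existing directories. A minimal sketch of a re-runnable variant, assuming a hypothetical NRUNS variable for the number of runs and a scratch directory to work in:

```shell
tmpdir=$(mktemp -d)
cd "${tmpdir}" || exit 1

NRUNS=12          # hypothetical number of runs
: > runs          # start from an empty runs file so re-running does not append duplicates

i=1
while [ "${i}" -le "${NRUNS}" ]; do
    run=$(printf '%03d' "${i}")   # zero-pad to three digits: 001, 002, ...
    mkdir -p "${run}"             # -p: no error if the directory already exists
    echo "${run}" >> runs
    i=$((i + 1))
done

tail -n 1 runs
```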
Restart Window Manager
This has happened a couple of times before: you step away from the computer for a while (maybe even overnight) and when you come back, you find it is locked up and completely unresponsive. The nuclear option is to reboot the whole machine:
sudo shutdown -r now #Sad for anyone running autorecon or a neural network
Unfortunately, that will stop anything that might be running in the background. A less severe solution might be to just restart the window manager. To do this you will need to ssh into the locked-up computer from a different computer, and then restart the lightdm process. This will require superuser privileges.
ssh hostname
Then after you have connected to the frozen computer:
sudo restart lightdm
Any processes that depended on the window manager will be terminated (e.g., if you had been in the middle of editing labels in tksurfer, you will find that tksurfer has been shut down and you will need to start over). However, anything that was running in the background (e.g., autorecon) should be unaffected. Note that restart is an Upstart command; on systemd-based systems the equivalent is sudo systemctl restart lightdm.
Renaming Multiple Files
Rename Using rename
A perl command, called rename, might be available on your *nix system:
rename [OPTIONS] perlexpr files
Among the useful options is the -n flag, which just reports what all the file renames would be, but doesn't actually execute them.
A handy application of rename is to hide files and/or directories. Files with names beginning with a dot are hidden by default and don't show up in directory listings. This can be a handy way of excluding chunks of data from your scripts.
Use-Case: Hiding Session 2 Data
In our Multisensory Imagery experiment, we collect 6 runs at time points 1 and 2. If we wish to be able to analyze all the data, these would be stored together as runs 001 to 012. Suppose we wish to temporarily hide the second time point data:
rename -n 's/01/\._01/' `find ./ -type d -name "01*"`
This would find all the directories ("-type d") named 01*, then it would show you how it would rename them. If everything looked right, you would execute the same command again, but omit the -n flag so that the renaming actually takes place. Note that this example only gets the 010, 011 and 012 directories. You would do something similar for directories 00[6-9].
Use-Case: Unhiding Directories
This one is easier, since all the hidden directories start with "._" using the approach described above:
rename 's/\._//' `find ./ -type d -name "._0*"`
In case you're curious about the syntax of the perl expression, you might want to read up a bit about regular expressions, but in this case, 's/\._//' indicates we are doing a substitution that will replace the first instance of ._ in each name with an empty string (the nothing between the last two slashes). To replace every instance, you would add a g after the final slash. The extra back-slash in front of the period is an escape character, which is needed because otherwise the dot (period) will be interpreted as a special character that matches anything.
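For a simple pattern like this one, the same s/// expression can be tried out in sed on a sample name before handing it to rename. A minimal sketch, assuming a made-up path that contains ._ twice so that the effect of the trailing g is visible:

```shell
# A path with two occurrences of "._" (hypothetical, to show the difference).
path='./._010/._notes'

# Without a trailing g, only the first match is replaced.
first_only=$(printf '%s\n' "${path}" | sed 's/\._//')

# With g, every match is replaced.
all=$(printf '%s\n' "${path}" | sed 's/\._//g')

echo "${first_only}"   # ./010/._notes
echo "${all}"          # ./010/notes
```

Perl and sed regular expressions differ in their fancier features, so treat this as a quick preview rather than a guarantee for more complicated expressions.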
Rename Using mv
If you don't have access to the rename command (e.g., on Mac OS X), you can fake it:
PREFIX=LO
for file in `find . -name "*.txt"`; do mv "${file}" "${file%/*}/${PREFIX}_${file##*/}"; done
Source: [1]
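Looping over the output of find in backticks splits names on whitespace, so files with spaces in their names get mangled. A minimal sketch of a variant that lets find pass the paths directly, assuming made-up file names in a scratch directory; each file is renamed inside the directory where it lives:

```shell
tmpdir=$(mktemp -d)
cd "${tmpdir}" || exit 1
mkdir -p sub
touch a.txt "sub/b c.txt"      # hypothetical files, one with a space in its name

PREFIX=LO
# find hands the paths to a small sh snippet, so names with spaces survive intact.
# Files that already carry the prefix are skipped so nothing is renamed twice.
find . -type f -name '*.txt' ! -name "${PREFIX}_*" -exec sh -c '
    prefix=$1; shift
    for f in "$@"; do
        dir=${f%/*}            # directory part, e.g. ./sub
        base=${f##*/}          # file name part, e.g. b c.txt
        mv "$f" "${dir}/${prefix}_${base}"
    done
' sh "${PREFIX}" {} +

find . -type f | sort
```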
Related Trick: Collecting and Renaming Multiple Files in Subdirectories
Use case: I ran a bunch of model simulations. Each batch of simulations produced a series of 8 Keras files named model_0x.h5, and stored in directories named batch_##/. 10 batches of simulations produced 80 model files, except that they all had the same names. I wanted to run some tests on the complete set, so I needed to aggregate all the files in a single directory, but rename them from 01 to 80:
for run in $(seq 1 10)
do
    r=`printf "%02d" $run`
    echo "Gathering run $r files"
    for m in {1..8}
    do
        basemodel=`printf "%02d" $m`
        blockstart=$(( ($run-1)*8 ))
        #zero-pad the new index so the files run from model_01 to model_80
        newmodel=`printf "%02d" $(( $blockstart+$m ))`
        cp batch_$r/model_$basemodel.h5 ./model_$newmodel.h5
    done
done
sed Tricks
Replacing Text in Multiple Files
sed -i 's/oldtext/newtext/g' *.ext
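In-place edits across many files are hard to undo, so it may be worth previewing the matches and keeping backups. A minimal sketch, assuming two made-up files in a scratch directory; the .bak suffix is an arbitrary choice:

```shell
tmpdir=$(mktemp -d)
cd "${tmpdir}" || exit 1
printf 'oldtext here\n' > one.ext
printf 'no match\n'     > two.ext

# Preview which files contain the text before touching anything.
grep -l 'oldtext' *.ext

# -i.bak edits in place and keeps the original as <name>.bak;
# attaching the suffix directly to -i works with both GNU and BSD sed.
sed -i.bak 's/oldtext/newtext/g' *.ext

cat one.ext
```

Once you are happy with the result, the .bak files can be deleted.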
Remove punctuation and convert to lowercase
FILENAME=file.txt
sed 's/[[:punct:]]//g' $FILENAME | sed $'s/\t//g' | tr '[:upper:]' '[:lower:]' > lowercase.$FILENAME
Archiving Specific Files in a Directory Tree
The tar command has an --include switch which will archive only matching file patterns; however, it appears that this filtering breaks when trying to archive files in subdirectories. Fortunately, the person who posed the question on StackExchange already had a workaround that works fine (it's just ugly):
find ./ -name "*.wav.txt" -print0 | tar -cvzf ~/adhd.tgz --null -T -
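The -T option tells tar to read the list of files to archive from a file, and the trailing - means that this "file" is standard input (here, the output of find). The --null switch tells tar that the names are separated by NUL characters, which matches what find's -print0 produces. Just replace the file pattern with whatever it is you're filtering for, and of course specify an appropriate tgz archive name.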
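The pipeline is easy to try out on a scratch directory before running it on real data. A minimal sketch, assuming made-up folder and file names (one containing a space) and a hypothetical archive name demo.tgz:

```shell
tmpdir=$(mktemp -d)
cd "${tmpdir}" || exit 1
mkdir -p data/s1 "data/s 2"
touch data/s1/a.wav.txt "data/s 2/b.wav.txt" data/s1/skip.wav   # hypothetical files

# find -print0 separates names with NUL bytes; tar --null -T - reads that
# NUL-separated list from standard input ("-").
find ./data -name '*.wav.txt' -print0 | tar -czf demo.tgz --null -T -

# List the archive to confirm only the matching files went in.
tar -tzf demo.tgz | sort
```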
mysql on the terminal
So I learned tonight how to export query results to a text file from the shell interface. Note that MySQL server is running with the --secure-file-priv option enabled, so you can't just willy-nilly write files wherever you want. However /var/lib/mysql-files/ is fair game, so for example:
select * from conceptstats inner join concepts on conceptstats.concid=concepts.concid where pid=183 and norm=1 into outfile '/var/lib/mysql-files/0183.txt'