Adventures in Code IV: correcting filenames

A large part of data analysis is spent cleaning, importing, reorganising and generally getting data ready to analyse rather than actually analysing it. I’ve been trying to get across to the non-coders in the group that things like strict naming conventions are important and very helpful to the poor person who has to deal with the data.

[Image: missingplot]

Things have improved a lot and datasets that used to take a few hours to clean up are now pretty much straightforward. A recent example is shown here. Almost 200 subconditions are plotted out and there is only one missing graph. I suspect the blood sugar levels were getting low in the person generating the data… the cause was a hyphen in the filename instead of an underscore.

These data are read into Igor from CSVs exported from Imaris. Here comes the problem: the folder and all the files within it have the incorrect name.

There are 35 files in each folder and clearly this needs a computer to fix, even if it were just one folder's worth at fault. The quickest way is to use the terminal and there are lots of ways to do it.

Now, as I said, the problem is that the folder name and the filenames both need correcting. Most terminal commands you can quickly find online actually fail because they try to rename the file and folder at the same time, and since the folder with the new name doesn’t exist… you get an error.
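To see the failure for yourself, here is a minimal sketch run in a scratch directory. The names (cond-A, cond_A) are made up for illustration and are not the real dataset names.

```shell
#!/usr/bin/env bash
# Scratch demo: hypothetical folder and file names, not the real dataset.
tmp=$(mktemp -d)
mkdir "$tmp/cond-A"
touch "$tmp/cond-A/cond-A_1.csv"

# Naive one-pass rename: the substitution changes the string everywhere
# in the path, so the target sits inside a folder (cond_A) that does
# not exist yet and mv refuses to move the file.
src="$tmp/cond-A/cond-A_1.csv"
dst="${src//cond-A/cond_A}"
if mv "$src" "$dst" 2>/dev/null; then
    naive_result="renamed"
else
    naive_result="failed"
fi
echo "naive rename: $naive_result"
```

The file is left where it was, under its old name.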

The solution is to rename the folders first and then the files.


# 1. rename the folders
find . -maxdepth 2 -type d -name "oldstring*" | while IFS= read -r FNAME; do mv "$FNAME" "${FNAME//oldstring/newstring}"; done
# 2. then rename the files inside them
find . -maxdepth 3 -type f -name "oldstring*.csv" | while IFS= read -r FNAME; do mv "$FNAME" "${FNAME//oldstring/newstring}"; done
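Before pointing this at real data, it may be worth trying the two passes on a throwaway copy. Here is a rough sketch that builds a small scratch tree and runs both steps on it. The folder layout, the names cond-A and cond_A, and the variables old and new are all assumptions for illustration.

```shell
#!/usr/bin/env bash
# Scratch demo with hypothetical names; nothing here touches real data.
tmp=$(mktemp -d)
mkdir -p "$tmp/expt/cond-A"
touch "$tmp/expt/cond-A/cond-A_1.csv" "$tmp/expt/cond-A/cond-A_2.csv"
cd "$tmp" || exit 1

old="cond-A"   # string to replace (assumed)
new="cond_A"   # replacement string (assumed)

# Pass 1: rename the folders. The files inside keep their old names.
find . -maxdepth 2 -type d -name "${old}*" | while IFS= read -r d; do
    mv "$d" "${d//"$old"/$new}"
done

# Pass 2: rename the files. The folder part of each path is already
# correct, so the substitution only changes the filename.
find . -maxdepth 3 -type f -name "${old}*.csv" | while IFS= read -r f; do
    mv "$f" "${f//"$old"/$new}"
done

# List what we ended up with.
find . -type f | sort
# ./expt/cond_A/cond_A_1.csv
# ./expt/cond_A/cond_A_2.csv
```

Quoting "$old" inside the substitution makes bash treat it as literal text rather than a glob pattern, which matters if the string to replace ever contains characters like * or ?.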

A simple tip, but effective and useful. HT this gist

Part of a series on computers and coding