How to avoid pitfalls when automating tasks with crontab
I’ve found it useful to schedule tasks (i.e. scripts) to run automatically at specific times. Doing it on my local computer is actually not ideal, since the computer must be running at the scheduled time - but it’s good enough for me.
One way of doing it on a Mac is to use crontab, and there are plenty of “How to” guides out there. However, I found it surprisingly cumbersome to get it to work, so I will write up my own dos and don'ts. The hard part was basically that testing and debugging were difficult.
Sudo or not sudo
As it turns out, I only got it to work when I edited the crontab with sudo. Note that this edits root's crontab, so the tasks run as root:
sudo crontab -e
Enclose task in shell script or not?
Most guides tell you to write up your task in a .sh script and then just run the shell script in crontab. I think this makes a lot of sense for more complicated tasks, but for simple stuff like running a python script, why bother? Calling the script directly from crontab works just as well.
Making shell scripts executable
If you go with the shell script, make sure you make it executable first.
chmod 755 crontask.sh
And then to execute it, it is safest to write the absolute path to it inside crontab. An alternative is to use
cd /path/to/taskdir && ./crontask.sh
NB: include the ./ part!
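If you would rather check the executable bit from Python than by eye, here is a minimal sketch. The helper names make_executable and is_executable are made up for this example, and it assumes a Unix-like system such as macOS or Linux.

```python
import os
import stat
import tempfile
from pathlib import Path


def make_executable(script):
    """Set the permissions to rwxr-xr-x, like `chmod 755 crontask.sh`."""
    os.chmod(script, 0o755)


def is_executable(script):
    """Return True if the owner-execute bit is set on the file."""
    return bool(Path(script).stat().st_mode & stat.S_IXUSR)


if __name__ == "__main__":
    # Try it on a throwaway file instead of a real crontask.sh.
    with tempfile.TemporaryDirectory() as tmp:
        script = Path(tmp) / "crontask.sh"
        script.write_text("#!/bin/sh\necho hello\n")
        os.chmod(script, 0o644)  # start without the execute bit
        print("before:", is_executable(script))
        make_executable(script)
        print("after: ", is_executable(script))
```

A check like this is handy when a task silently does nothing and you want to rule out a permission problem first.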
Absolute paths are absolutely necessary
My biggest headache was to figure out how to deal with paths whenever I wanted to output something from the task. As it turns out, using absolute paths in all scripts is a very simple way of avoiding problems.
In particular, if you use python scripts: unlike in an ordinary session, where you typically start the script from its own directory, under cron the dir of your script will not be the cwd. If this is what you wanna mimic, use for instance:
import sys
from pathlib import Path
# resolve() makes the path absolute, so it works no matter what the cwd is
root = Path(sys.argv[0]).resolve().parent
outfile = root / "outfile.log"
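To see the difference for yourself, a small sketch like the following can be dropped into a script and run once from the terminal and once from crontab. The function name output_path is hypothetical, and outfile.log is just the example name from above.

```python
import sys
from pathlib import Path


def output_path(script_file, name):
    """Return an absolute path to `name` in the same directory as the script."""
    return Path(script_file).resolve().parent / name


if __name__ == "__main__":
    # Under cron these two directories will usually differ.
    print("cwd:       ", Path.cwd())
    print("script dir:", Path(sys.argv[0]).resolve().parent)
    print("outfile:   ", output_path(sys.argv[0], "outfile.log"))
```

Comparing the printed cwd between the two runs makes it obvious why relative output paths end up somewhere unexpected.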
Specifying PATH in crontab
I needed to run my python script in a particular conda environment, which posed some challenges. As it turned out, starting the crontab with
PATH="/path/to/conda/env/envname/bin"
solves this. To find the path to the conda env, use:
conda env list
After that it is safe to just use python /path/to/pythonscript/main.py in crontab.
BUT this caused new problems, since the PATH for the whole crontab is then this directory and nothing else. Thus, in order to use standard shell commands, I ended up with this:
PATH="/path/to/conda/env/envname/bin:/bin:/opt/homebrew/bin:/usr/bin/:/usr/local/bin/"
To identify the paths you need, figure out what commands you need for the task and just do which <commandname> in the terminal for each of them.
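If the task uses many commands, the lookup can be scripted. The sketch below is only an illustration: build_cron_path is a made-up helper, the env path is the placeholder from above, and it uses shutil.which from the standard library, which does a lookup similar to which in the terminal.

```python
import os
import shutil


def build_cron_path(env_bin, commands):
    """Return a PATH value: env_bin first, then the directory of each command."""
    dirs = [env_bin]
    for name in commands:
        found = shutil.which(name)  # searches the PATH of the current session
        if found is None:
            raise FileNotFoundError(f"{name} not found on the current PATH")
        directory = os.path.dirname(found)
        if directory not in dirs:  # keep the order, skip duplicates
            dirs.append(directory)
    return ":".join(dirs)


if __name__ == "__main__":
    # Placeholder env path; replace it with the one from `conda env list`.
    value = build_cron_path("/path/to/conda/env/envname/bin", ["date", "ls"])
    print(f'PATH="{value}"')
```

Run it in a normal terminal session, where your usual PATH is available, and paste the printed line at the top of the crontab.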
Filename with datestamps
A big surprise was that you need to escape all the % symbols in your crontab statement, because cron turns an unescaped % into a newline. So use \% when you wanna add a timestamp to a filename, like a logfile.
./crontask.sh >> /path/to/logdir/$(date +\%Y\%m\%d).log
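Since it is easy to miss one %, here is a small sketch that escapes a command before you paste it into crontab. The function escape_percent is hypothetical, and it assumes that any % already preceded by a backslash should be left alone.

```python
import re


def escape_percent(command):
    """Put a backslash in front of every % that is not already escaped."""
    return re.sub(r"(?<!\\)%", r"\\%", command)


if __name__ == "__main__":
    raw = "./crontask.sh >> /path/to/logdir/$(date +%Y%m%d).log"
    print(escape_percent(raw))
```

Because already escaped symbols are skipped, running the function twice on the same line does no harm.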
Output error messages to a logfile
This was particularly important when debugging and figuring out what was not working. In order to do this, use 2>&1, which sends the error output (stderr) to the same place as the normal output, like this:
./crontask.sh >> /path/to/logdir/$(date +\%Y\%m\%d).log 2>&1
Then you can get a sense of what went wrong.
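The same idea can also live inside a python wrapper instead of the crontab line. This is just a sketch under my own assumptions: run_and_log is a made-up name, and stderr=subprocess.STDOUT plays the role of 2>&1.

```python
import subprocess
import sys
import tempfile
from datetime import date
from pathlib import Path


def run_and_log(command, logdir):
    """Run command and append its stdout and stderr to logdir/YYYYMMDD.log."""
    logfile = Path(logdir) / f"{date.today():%Y%m%d}.log"
    with open(logfile, "a") as log:
        # Send stderr to the same file as stdout, like 2>&1 in the shell.
        subprocess.run(command, stdout=log, stderr=subprocess.STDOUT)
    return logfile


if __name__ == "__main__":
    # A child process that writes one line to each stream.
    child = "import sys; print('to stdout'); print('to stderr', file=sys.stderr)"
    with tempfile.TemporaryDirectory() as tmp:
        path = run_and_log([sys.executable, "-c", child], tmp)
        print(path.read_text())
```

No % escaping is needed here, since the date format never passes through crontab.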