Traps and signals in bash (II)
Restoring an old post from an old blog about using traps and signals in bash.
Continuing with a previous post, let’s see how to prevent multiple concurrent executions of a script.
Let’s start with this very complex script, which counts from 1 to 10:
#!/bin/bash
i=1
while [ $i -le 10 ]; do
echo $i
i=$((i+1))
sleep 1
done
Obviously, nothing stops us from running several copies of it at the same time. For example:
$ ./nolock.sh & ./nolock.sh
[1] 19585
1
1
2
2
3
3
^C
Ugh, what a mess. No one can learn to count that way. It would be much better to prevent the script from running more than once at a time. The “classic” way to do this is to write a “lock” file. If the file exists, the script does not run; if it doesn’t exist, the script creates it, runs, and deletes it afterwards so that the script can be executed again.
$ cat lock.sh
#!/bin/bash
LOCK="$HOME/lock.lck"
# If the file does not exist, write it and run:
if [ ! -e "$LOCK" ]; then
touch "$LOCK"
i=1
while [ $i -le 10 ]; do
echo $i
i=$((i+1))
sleep 1
done
/bin/rm "$LOCK"
else
echo "I'm already counting to 10"
fi
If the lock.lck file does not exist, the script will create it and start to count; but if the file exists, the script will complain and will not count anything. To test it, I execute the script in a terminal…
$ ./lock.sh
1
2
3
…and, in the meantime, I go to another terminal to try to run it again:
$ ./lock.sh
I'm already counting to 10
Right? Well, no, too bad! What we have here is known as a race condition: in the interval between checking whether the file exists and creating it, another instance of the script can start running, and our method fails. The check-and-create operation is simply not atomic. Don’t think that can happen? Look how easy it is!
$ ./lock.sh & ./lock.sh
[1] 20099
1
1
2
2
3
3
4
4
Bash offers a mechanism that helps in these cases: the noclobber option. Let’s see what man bash says:
If the redirection operator is >, and the noclobber option to the set builtin has been enabled, the redirection will fail if the file whose name results from the expansion of word exists and is a regular file. If the redirection operator is >|, or the redirection operator is > and the noclobber option to the set builtin command is not enabled, the redirection is attempted even if the file named by word exists.
In other words, if noclobber is set, a > redirection fails when the target file already exists. Let’s try it this way, then:
$ cat lock2.sh
#!/bin/bash
LOCK="$HOME/lock.lck"
# If the file does not exist, write it and run:
if ( set -o noclobber; echo "$$" > "$LOCK") 2> /dev/null; then
i=1
while [ $i -le 10 ]; do
echo $i
i=$((i+1))
sleep 1
done
/bin/rm "$LOCK"
else
echo "I'm already counting to 10"
fi
And trying to run it as before:
$ ./lock2.sh & ./lock2.sh
[1] 20178
1
I'm already counting to 10
$ 2
3
4
5
As we can see, only one of the two instances gets to count; the other one refuses to run.
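To look at noclobber in isolation, here is a minimal sketch. It assumes a scratch directory created with mktemp, and the names demo_dir, demo_file, first_result and second_result are made up for this example; none of them come from the scripts above. The point is that the check and the creation happen in a single redirection instead of in two separate commands:

```shell
#!/bin/bash
# Sketch only: demo_dir and demo_file are hypothetical scratch paths,
# not the lock file used in the article.
demo_dir="$(mktemp -d)"
demo_file="$demo_dir/demo.lck"

# First attempt: the file does not exist yet, so the redirection creates it.
if ( set -o noclobber; echo "first" > "$demo_file" ) 2> /dev/null; then
    first_result="created"
else
    first_result="refused"
fi

# Second attempt: the file now exists and is a regular file,
# so with noclobber set the redirection fails and nothing is overwritten.
if ( set -o noclobber; echo "second" > "$demo_file" ) 2> /dev/null; then
    second_result="created"
else
    second_result="refused"
fi

contents="$(cat "$demo_file")"
echo "first attempt:  $first_result"
echo "second attempt: $second_result"
echo "file contents:  $contents"

# Remove the scratch directory again.
rm -rf "$demo_dir"
```

Because set -o noclobber runs inside the ( … ) subshell, the option does not leak into the rest of the script.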
If we get tired of watching the script count and stop it with Ctrl+C, what happens? The next time we try to launch it we won’t be able to, because we left a lock file lingering. So, applying what we saw in the previous post, we can do this:
$ cat lock3.sh
#!/bin/bash
LOCK="$HOME/lock.lck"
trap 'rm -f "$LOCK"; exit' INT TERM EXIT ERR
# If the file does not exist, write it and run:
if ( set -o noclobber; echo "$$" > "$LOCK") 2> /dev/null; then
i=1
while [ $i -le 10 ]; do
echo $i
i=$((i+1))
sleep 1
done
/bin/rm "$LOCK"
else
echo "I'm already counting to 10"
fi
Much better, isn’t it? Well, no!! Much worse!! Because the trap is defined before we try to get the lock, it also fires when an instance that failed to get the lock exits. A second instance would not count, but it would delete the lock file on its way out, allowing a third one to run. The best alternative would probably be this:
$ cat ./lock2.sh
#!/bin/bash
LOCK="$HOME/lock.lck"
# If the file does not exist, write it and run:
if ( set -o noclobber; echo "$$" > "$LOCK") 2> /dev/null; then
trap 'rm -f "$LOCK"; exit' INT TERM EXIT ERR
i=1
while [ $i -le 10 ]; do
echo $i
i=$((i+1))
sleep 1
done
/bin/rm "$LOCK"
trap - INT TERM EXIT ERR
else
echo "I'm already counting to 10"
fi
This way, if the script fails to get the lock, the trap has not been defined yet and the .lck file is not deleted.
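The difference between the two trap placements can be checked without waiting for any counting. The following is a rough sketch under a few assumptions: the lock lives in a scratch directory instead of $HOME, try_lock is a hypothetical helper wrapping the noclobber redirection, and each ( … ) subshell stands in for a second instance that finds the lock already taken:

```shell
#!/bin/bash
# Sketch only: the lock is placed in a scratch directory, and try_lock
# is a hypothetical helper, not part of the article's scripts.
demo_dir="$(mktemp -d)"
LOCK="$demo_dir/lock.lck"

try_lock() {
    ( set -o noclobber; echo "$$" > "$LOCK" ) 2> /dev/null
}

# Case 1: trap defined BEFORE trying to get the lock (as in lock3.sh).
echo "held-by-someone-else" > "$LOCK"
(
    trap 'rm -f "$LOCK"' EXIT
    try_lock || echo "I'm already counting to 10"
)
if [ -e "$LOCK" ]; then early_trap="lock kept"; else early_trap="lock deleted"; fi

# Case 2: trap defined only AFTER the lock has been obtained.
echo "held-by-someone-else" > "$LOCK"
(
    if try_lock; then
        trap 'rm -f "$LOCK"' EXIT
    else
        echo "I'm already counting to 10"
    fi
)
if [ -e "$LOCK" ]; then late_trap="lock kept"; else late_trap="lock deleted"; fi

echo "trap before the attempt: $early_trap"
echo "trap after the attempt:  $late_trap"

# Remove the scratch directory again.
rm -rf "$demo_dir"
```

In the first case the instance that lost still runs its EXIT trap and removes a lock it never owned; in the second case the lock file is left alone.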
However, it wouldn’t be a bad idea to check, before deleting the lock file, that it really was written by this same script (that’s why the PID is written to the file). But well, that’s a topic for another post.
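Just as a rough preview of that check, here is one way it might look. This is a sketch under assumptions, not a finished solution: remove_own_lock is a hypothetical helper name, the lock lives in a scratch directory, and the string "not-a-pid" simply stands in for a lock written by some other process:

```shell
#!/bin/bash
# Sketch only: remove_own_lock is a hypothetical helper, and the lock
# is placed in a scratch directory instead of $HOME.
demo_dir="$(mktemp -d)"
LOCK="$demo_dir/lock.lck"

# Delete the lock file only if it contains the PID of this script.
remove_own_lock() {
    local owner
    owner="$(cat "$LOCK" 2> /dev/null)"
    if [ "$owner" = "$$" ]; then
        rm -f "$LOCK"
    fi
}

# A lock written by "someone else" must survive.
echo "not-a-pid" > "$LOCK"
remove_own_lock
if [ -e "$LOCK" ]; then foreign_lock="kept"; else foreign_lock="removed"; fi

# A lock containing our own PID is removed.
echo "$$" > "$LOCK"
remove_own_lock
if [ -e "$LOCK" ]; then own_lock="kept"; else own_lock="removed"; fi

echo "lock of another process: $foreign_lock"
echo "our own lock:            $own_lock"

# Remove the scratch directory again.
rm -rf "$demo_dir"
```

In the last script above, the trap could call such a helper instead of running rm -f "$LOCK" directly. Note that reading the file and removing it are still two separate steps, so this narrows the problem without making the cleanup atomic.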