Tuesday, 24 November 2015

My best shell script bug so far :)

Consider this Bourne shell script code:
if "$NEW_STATE" = "false"; then
    echo "Current failed."
fi
What's wrong with it?

I had this in a script and spent quite some time figuring out what was wrong, as it kept printing "Current failed." even though nothing had failed. After staring at it for several minutes, that piece of code still seemed perfectly fine.

Some time and several trace and debug statements later, I knew that the value of the "$NEW_STATE" variable was indeed "true". So what was wrong?

Well, I had meant to write this:
if [ "$NEW_STATE" = "false" ]; then
    echo "Current failed."
fi
The shell does not evaluate expressions itself. It just runs commands, such as "test" (for which "[" is simply a copy, symlink, alias or, in most modern shells, a builtin), but also "grep", "ls", "cp", whatever you like. For "if", a condition counts as "true" whenever the command exits with status 0.
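To make the "conditions are just commands" point concrete, here is a minimal sketch in plain POSIX sh. The result_* variable names are made up for illustration; the only thing that decides each branch is the exit status of the command after "if".

```shell
#!/bin/sh
# "if" runs a command and branches on its exit status:
# 0 means true, anything else means false.

if true; then
    result_true="then-branch"
else
    result_true="else-branch"
fi

if false; then
    result_false="then-branch"
else
    result_false="else-branch"
fi

# "[" is just another command; the closing "]" is its last argument.
if [ "abc" = "abc" ]; then
    result_test="then-branch"
else
    result_test="else-branch"
fi

# Any command works as a condition, e.g. grep -q (quiet, exit status only).
if printf 'hello\n' | grep -q 'hell'; then
    result_grep="then-branch"
else
    result_grep="else-branch"
fi

echo "$result_true $result_false $result_test $result_grep"
```

Only the "false" case should end up in the else branch; the other three commands all exit with status 0.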

Having forgotten the "[" (and its closing "]"), I produced code that ran perfectly well while doing just the opposite of what I had intended. Why did I not even get a syntax error, given that the shell cannot handle string comparison itself?

The answer is simple: both "true" and "false" are commands themselves. Neither takes any arguments, but neither complains if some are given; the arguments are simply ignored. My "$NEW_STATE" variable had the value "true" because everything was fine. So my broken comparison expanded to this:
if "true" = "false"; then
    echo "Current failed."
fi
The shell happily executes "true", which exits with status 0 (which means true to the shell) and never looks at its two arguments, "=" and "false". Good fun.
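For anyone who wants to reproduce this, the following sketch runs the broken and the intended condition side by side. NEW_STATE is taken from the script above; the "buggy" and "fixed" variables are only there for the demonstration.

```shell
#!/bin/sh
NEW_STATE="true"

# Broken form: runs the command named by $NEW_STATE ("true")
# with "=" and "false" as arguments, which it ignores.
if "$NEW_STATE" = "false"; then
    buggy="Current failed."
else
    buggy="ok"
fi

# Intended form: "[" compares the two strings.
if [ "$NEW_STATE" = "false" ]; then
    fixed="Current failed."
else
    fixed="ok"
fi

echo "buggy: $buggy"   # prints "buggy: Current failed."
echo "fixed: $fixed"   # prints "fixed: ok"
```

Note that with NEW_STATE set to "false" the broken form runs the "false" command instead, so it reports "ok" exactly when something did fail: the logic is inverted in both directions.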