Why do shell script comparisons often use x$VAR = xyes?

I see this often in the build scripts of projects that use autotools (autoconf, automake). When somebody wants to check the value of a shell variable, they frequently use this idiom:

if test "x$SHELL_VAR" = "xyes"; then
...

What is the advantage to this over simply checking the value like this:

if test $SHELL_VAR = "yes"; then
...

I figure there must be some reason that I see this so often, but I can't figure out what it is.


I used to do that in DOS when the SHELL_VAR might be undefined.

I believe it's due to cases like these, where the variable expands to nothing:

SHELLVAR=$(true)
if test $SHELLVAR = "yes" ; then echo "yep" ; fi
# bash: test: =: unary operator expected

as well as

if test $UNDEFINED = "yes" ; then echo "yep" ; fi
# bash: test: =: unary operator expected

and

SHELLVAR=" hello"
if test $SHELLVAR = "hello" ; then echo "yep" ; fi
# yep

however, this should usually work

SHELLVAR=" hello"
if test "$SHELLVAR" = "hello" ; then echo "yep" ; fi
#<no output>

but when the error shows up somewhere in a long build log, it's hard to tell what test is complaining about, so

SHELLVAR=" hello"
if test "x$SHELLVAR" = "xhello" ; then echo "yep" ; fi

works just as well, and the literal x makes a failing comparison easier to spot when debugging.

If you don't do the "x$SHELL_VAR" thing, then if $SHELL_VAR is undefined, you get an error like "test: =: unary operator expected", because "=" ends up in the position where test expects a unary operator.

If you're using a shell that does simple substitution and the SHELL_VAR variable does not exist (or is blank), then you need to watch out for the edge cases. The following translations will happen:

if test $SHELL_VAR = yes; then        -->  if test = yes; then
if test x$SHELL_VAR = xyes; then      -->  if test x = xyes; then

The first of these will generate an error since the first argument to test has gone missing. The second does not have that problem.

Your case translates as follows:

if test "x$SHELL_VAR" = "xyes"; then  -->  if test "x" = "xyes"; then

The x, at least for POSIX-compliant shells, is actually redundant, since the quotes ensure that both an empty argument and one containing spaces are interpreted as a single word.
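A quick sanity check in any POSIX shell (dash, bash, etc.) confirms this; the variable names below are my own. With the variable unset, the quoted comparison is well-formed both with and without the x prefix:

```shell
# SHELL_VAR is deliberately left unset.
unset SHELL_VAR

# Quoted comparison without the x prefix: test receives the
# empty string as its first operand, which is perfectly legal.
if test "$SHELL_VAR" = "yes"; then
    result_plain=match
else
    result_plain=no_match
fi

# The traditional x-prefixed form behaves identically here.
if test "x$SHELL_VAR" = "xyes"; then
    result_x=match
else
    result_x=no_match
fi

echo "$result_plain $result_x"
# prints: no_match no_match
```

Neither form errors out; the quotes alone keep the empty expansion as a single (empty) argument.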

There are two reasons that I know of for this convention:

http://tldp.org/LDP/abs/html/comparison-ops.html

In a compound test, even quoting the string variable might not suffice. [ -n "$string" -o "$a" = "$b" ] may cause an error with some versions of Bash if $string is empty. The safe way is to append an extra character to possibly empty variables, [ "x$string" != x -o "x$a" = "x$b" ] (the "x's" cancel out).

Second, in shells other than Bash, especially older ones, test conditions like '-z' for detecting an empty variable did not exist, so while this:

if [ -z "$SOME_VAR" ]; then
echo "this variable is not defined"
fi

will work fine in Bash, if you're aiming for portability across UNIX environments where you can't be sure the default shell is Bash, or that it supports the -z test condition, it's safer to use the form if [ "x$SOME_VAR" = "x" ], since that always has the intended effect. Essentially this is an old shell-scripting trick for detecting an empty variable, and it's still used today for backwards compatibility despite cleaner methods being available.
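As a sketch (variable names are mine), the two spellings of the emptiness test agree in any POSIX shell:

```shell
EMPTY=""
FULL="value"

# Modern POSIX spelling with -z.
test -z "$EMPTY" && z_empty=yes || z_empty=no
test -z "$FULL"  && z_full=yes  || z_full=no

# Old portable spelling: append x to both sides and compare.
test "x$EMPTY" = "x" && x_empty=yes || x_empty=no
test "x$FULL" = "x"  && x_full=yes  || x_full=no

echo "$z_empty $z_full $x_empty $x_full"
# prints: yes no yes no
```

On a shell where -z is available, the results are identical; the x form simply never depends on -z existing.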

The other reason that no-one else has yet mentioned is in relation to option processing. If you write:

if [ "$1" = "abc" ]; then ...

and $1 has the value '-n', the syntax of the test command is ambiguous; it is not clear what you were testing. The 'x' at the front prevents a leading dash from causing trouble.

You have to be looking at really ancient shells to find one where the test command does not have support for -n or -z; the Version 7 (1978) test command included them. It isn't quite irrelevant - some Version 6 UNIX stuff escaped into BSD, but these days, you'd be extremely hard pressed to find anything that ancient in current use.
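The x prefix sidesteps the option problem entirely. As an illustration (the variable name is mine), a value that looks exactly like a test option still compares cleanly, because neither operand can be mistaken for an option:

```shell
# A value that happens to look like a test option.
arg='-n'

# With the x prefix, test sees "x-n" = "x-n": plain strings,
# no possible confusion with the -n unary operator.
if [ "x$arg" = "x-n" ]; then
    got=dash_n
else
    got=other
fi

echo "$got"
# prints: dash_n
```

On a modern POSIX shell, [ "$arg" = "-n" ] would also work, since three-argument invocations of test are parsed as a binary comparison; the x form just removes all doubt on older implementations.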

Not using double quotes around values is dangerous, as a number of other people have pointed out. Indeed, if there's a chance that file names might contain spaces (Mac OS X and Windows both encourage that to some extent, and Unix has always supported it, though tools like xargs make it harder), then you should enclose file names in double quotes every time you use them too. Unless you are in charge of the value (e.g. during option handling, where you set the variable to 'no' at startup and 'yes' when a flag appears on the command line), it is not safe to use unquoted variables until you've proved them safe, and you may as well quote them all the time for many purposes. Or document that your scripts will fail horribly if users attempt to process files with blanks in the names. (There are other characters to worry about as well; backticks, for instance, can be rather nasty.)

I recommend instead:

if test "yes" = "$SHELL_VAR"; then

since it does away with the ugly x, and still solves the problem mentioned in https://stackoverflow.com/a/174288/895245 that $SHELL_VAR might start with a - and be read as an option.
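For illustration (the hostile value below is my own), putting the constant first gives the same protection: the left operand can never look like an option, and the quotes keep the right operand a single word even if it is empty or option-like.

```shell
# A hostile value that looks like a test option.
SHELL_VAR='-n'

# Constant first: test sees "yes" = "-n", a plain
# three-argument string comparison.
if test "yes" = "$SHELL_VAR"; then
    state=enabled
else
    state=disabled
fi

echo "$state"
# prints: disabled
```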