python subprocess Popen environment PATH?

Suppose there's an executable and a Python script to launch it, and they're located in 'sibling' subdirectories, e.g.

/tmp/subdir1/myexecutable
/tmp/subdir2/myscript.py

If you are in /tmp and run python subdir2/myscript.py, where the script uses a relative path to the executable:

# myscript.py
from subprocess import Popen
proc = Popen(["../subdir1/myexecutable"])

This raises OSError: [Errno 2] No such file or directory.

How does Python search for the executable? Does it use the current working directory and/or the location of the script? Does it use PATH and/or PYTHONPATH? Can you change where and how subprocess.Popen searches for the executable? Are bare commands, absolute paths and relative paths treated differently? Are there differences between Linux and Windows? What does shell=True or shell=False influence?


The working directory is the directory from which the Python interpreter was started, not the directory that contains the script. So in your example it is /dir and not /dir/subdir2. That's why you get an error.

You appear to be a little confused about the nature of PATH and PYTHONPATH.

PATH is an environment variable that tells the OS shell where to search for executables.

PYTHONPATH is an environment variable that tells the Python interpreter where to search for modules to import. It has nothing to do with subprocess finding executable files.

Due to the differences in the underlying implementation, subprocess.Popen will only search the path by default on non-Windows systems (Windows has some system directories it always searches, but that's distinct from PATH processing). The only reliable cross-platform way to scan the path is to pass shell=True to the subprocess call, but that has its own issues (as detailed in the Popen documentation).

However, it appears your main problem is that you are passing a path fragment to Popen rather than a simple file name. As soon as you have a directory separator in there, you're going to disable the PATH search, even on a non-Windows platform (e.g. see the Linux documentation for the exec family of functions).

A relative path in subprocess.Popen is resolved relative to the current working directory, not against the entries of the system's PATH. If you run python subdir2/some_script.py from /dir, then the expected executable location (passed to Popen) will be /dir/../subdir1/some_executable, a.k.a. /subdir1/some_executable, not /dir/subdir1/some_executable.
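The arithmetic above can be checked with a small sketch. The helper name resolve_against_cwd is hypothetical, and posixpath.normpath only collapses ".." textually (it does not follow symlinks), so treat this as an illustration of the rule rather than of what the OS does internally:

```python
import posixpath


def resolve_against_cwd(relative_path, cwd):
    """Return the location a relative path refers to when started from cwd.

    Hypothetical helper for illustration; it normalizes the path textually.
    """
    return posixpath.normpath(posixpath.join(cwd, relative_path))


# Started from /dir, "../subdir1" climbs out of /dir
print(resolve_against_cwd("../subdir1/some_executable", "/dir"))
# Started from /dir/subdir2, the same relative path stays inside /dir
print(resolve_against_cwd("../subdir1/some_executable", "/dir/subdir2"))
```

So the same relative path only points at the intended file when the working directory happens to be the script's own directory.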

If you definitely want to use paths relative to a script's own directory to reach a particular executable, the best option is to first construct an absolute path from the directory portion of the __file__ global variable.

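#!/usr/bin/env python
from subprocess import Popen, PIPE
from os.path import abspath, dirname, join

# Build an absolute path starting from the directory that contains this script
path = abspath(join(dirname(__file__), '../subdir1/some_executable'))
# communicate() returns the captured (stdout, stderr) pair
spam, eggs = Popen(path, stdout=PIPE, stderr=PIPE).communicate()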

Relative paths (paths containing slashes) never get checked against any PATH, no matter what you do. They are relative to the current working directory only. If you need to resolve a relative path against the PATH entries, you will have to search through the PATH manually.
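A manual search of that kind could look roughly like the sketch below. The function name find_relative_in_path and its search_path parameter are made up for this example; the sketch simply joins the relative path onto each PATH entry and returns the first executable file it finds:

```python
import os


def find_relative_in_path(relative_path, search_path=None):
    """Look for relative_path under each directory listed in search_path.

    Hypothetical helper: search_path defaults to the current PATH.
    Returns an absolute path, or None if nothing executable is found.
    """
    if search_path is None:
        search_path = os.environ.get("PATH", os.defpath)
    for directory in search_path.split(os.pathsep):
        # An empty PATH entry conventionally means the current directory
        candidate = os.path.join(directory or os.curdir, relative_path)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return os.path.abspath(candidate)
    return None
```

The returned absolute path can then be passed to Popen as the first element of args.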

If you want to run a program relative to the location of the Python script, use __file__ and go from there to find the absolute path of the program, and then use the absolute path in Popen.

Searching in the current process' environment variable PATH

There is an issue in the Python bug tracker about how Python deals with bare commands (no slashes). Basically, on Unix/Mac Popen behaves like os.execvp when the argument env=None (some unexpected behavior has been observed and noted at the end):

On POSIX, the class uses os.execvp()-like behavior to execute the child program.

This is actually true for both shell=False and shell=True, provided env=None. What this behavior means is explained in the documentation of the function os.execvp:

The variants which include a “p” near the end (execlp(), execlpe(), execvp(), and execvpe()) will use the PATH environment variable to locate the program file. When the environment is being replaced (using one of the exec*e variants, discussed in the next paragraph), the new environment is used as the source of the PATH variable.

For execle(), execlpe(), execve(), and execvpe() (note that these all end in “e”), the env parameter must be a mapping which is used to define the environment variables for the new process (these are used instead of the current process’ environment); the functions execl(), execlp(), execv(), and execvp() all cause the new process to inherit the environment of the current process.

The second quoted paragraph implies that execvp will use the current process' environment variables. Combined with the first quoted paragraph, we deduce that execvp will use the value of the environment variable PATH from the environment of the current process. This means that Popen looks at the value of PATH in the environment of the Python process that runs the Popen instantiation. In CPython that value is read from os.environ when Popen is called, so a change made to os.environ['PATH'] before the call is taken into account.
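To see which locations such an execvp-like search would try for a bare command, one can list them with os.get_exec_path. This is only a sketch of the lookup rule on POSIX, not the code subprocess runs; candidate_locations is a hypothetical name:

```python
import os


def candidate_locations(program, env=None):
    """List the paths an execvp-style lookup would try, in order.

    Hypothetical helper. With env=None, os.get_exec_path() reads PATH from
    the current process environment; otherwise it reads it from env.
    """
    if os.path.dirname(program):
        # A directory component disables the PATH search entirely
        return [program]
    return [os.path.join(d, program) for d in os.get_exec_path(env)]


# Directories come from the current process' PATH
for location in candidate_locations("sh"):
    print(location)
```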

Also, on Windows with shell=False, Popen pays no attention to PATH at all, and will only look relative to the current working directory.

What shell=True does

What happens if we pass shell=True to Popen? In that case, Popen simply calls the shell:

The shell argument (which defaults to False) specifies whether to use the shell as the program to execute.

That is to say, Popen does the equivalent of:

Popen(['/bin/sh', '-c', args[0], args[1], ...])

In other words, with shell=True Python directly executes /bin/sh, without any searching. Passing the argument executable to Popen can change this: it seems that if executable is a string without slashes, Python interprets it as the name of the shell program and searches for it in the value of PATH from the environment of the current process, i.e., the same way it searches for programs in the case shell=False described above.

In turn, /bin/sh (or our shell executable) will look for the program we want to run in its own environment's PATH, which is the same as the PATH of the Python (current process), as deduced from the code after the phrase "That is to say..." above (because that call has shell=False, so it is the case already discussed earlier). Therefore, the execvp-like behavior is what we get with both shell=True and shell=False, as long as env=None.
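The equivalence described in this section can be tried out on a POSIX system with a short sketch. The helper name run_both_ways is hypothetical, and the explicit form assumes that /bin/sh is the shell Python uses:

```python
import subprocess


def run_both_ways(command):
    """Run a command string via shell=True and via an explicit /bin/sh -c."""
    via_shell = subprocess.run(
        command, shell=True, capture_output=True, text=True
    )
    explicit = subprocess.run(
        ["/bin/sh", "-c", command], capture_output=True, text=True
    )
    return via_shell.stdout, explicit.stdout


print(run_both_ways("echo hello"))
```

In both cases it is the shell, not Python, that looks up echo, using the PATH of the environment the shell was started with.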

Passing env to Popen

So what happens if we pass env=dict(PATH=...) to Popen (thus defining an environment variable PATH in the environment of the program that will be run by Popen)?

In this case, the new environment is used to search for the program to execute. Quoting the documentation of Popen:

If env is not None, it must be a mapping that defines the environment variables for the new process; these are used instead of the default behavior of inheriting the current process’ environment.

Combined with the above observations, and from experiments using Popen, this means that Popen in this case behaves like the function os.execvpe. If shell=False, Python searches for the given program in the newly defined PATH. As already discussed above for shell=True, in that case the program is either /bin/sh, or, if a program name is given with the argument executable, then this alternative (shell) program is searched for in the newly defined PATH.

In addition, if shell=True, then inside the shell the search path that the shell will use to find the program given in args is the value of PATH passed to Popen via env.

So with env != None, Popen searches in the value of the key PATH of env (if a key PATH is present in env).
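A small experiment along these lines, for POSIX only: the script name hypothetical-tool and the helper run_with_private_path are invented for the example. It puts a tiny shell script in a temporary directory and lets Popen find it through the PATH given in env:

```python
import os
import subprocess
import tempfile


def run_with_private_path():
    """Run a bare command that exists only in a directory named by env['PATH']."""
    with tempfile.TemporaryDirectory() as bindir:
        tool = os.path.join(bindir, "hypothetical-tool")
        with open(tool, "w") as f:
            f.write("#!/bin/sh\necho found via env PATH\n")
        os.chmod(tool, 0o755)  # make the script executable
        result = subprocess.run(
            ["hypothetical-tool"],      # bare command, no slashes
            env={"PATH": bindir},       # replaces the inherited environment
            capture_output=True,
            text=True,
        )
        return result.stdout.strip()


print(run_with_private_path())
```

Note that env replaces the whole environment, so the child sees only the variables listed there.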

Propagating environment variables other than PATH as arguments

There is a caveat about environment variables other than PATH: if the values of those variables are needed in the command (e.g., as command-line arguments to the program being run), then even if they are present in the env given to Popen, they will not be expanded without shell=True. This is easily handled without resorting to shell=True: insert those values directly into the list argument args that is given to Popen. (Also, if these values come from Python's own environment, the method os.environ.get can be used to get their values.)
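The difference is easy to see with echo on a POSIX system. In this sketch the variable name GREETING and the helper echo_variable are made up for the illustration:

```python
import os
import subprocess


def echo_variable():
    """Compare a literal '$GREETING' argument with the value inserted by Python."""
    env = dict(os.environ, GREETING="hello")
    # Without a shell nobody expands $GREETING, so echo receives it literally
    literal = subprocess.run(
        ["echo", "$GREETING"], env=env, capture_output=True, text=True
    ).stdout
    # Inserting the value into args works without shell=True
    inserted = subprocess.run(
        ["echo", env["GREETING"]], env=env, capture_output=True, text=True
    ).stdout
    return literal, inserted


print(echo_variable())
```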

Using /usr/bin/env

If you JUST need path evaluation and don't really want to run your command line through a shell, and are on UNIX, I advise using env instead of shell=True, as in

path = '/dir1:/dir2'
subprocess.Popen(['/usr/bin/env', '-P', path, 'progtorun', other, args], ...)

This lets you pass a different PATH to the env process (using the option -P), which will use it to find the program. Note that -P is an option of the BSD and macOS env; GNU coreutils env does not provide it. It also avoids issues with shell metacharacters and potential security issues with passing arguments through the shell. Obviously, on Windows (pretty much the only platform without a /usr/bin/env) you will need to do something different.
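Where /usr/bin/env or its -P option is not available, a similar effect can be had by resolving the program in Python first. This is a different technique from the env approach above, sketched with shutil.which and its path parameter; popen_with_search_path is a hypothetical helper name:

```python
import shutil
import subprocess


def popen_with_search_path(program, args, search_path, **popen_kwargs):
    """Find program in search_path (an os.pathsep-separated string), then run it.

    Hypothetical helper; no shell is involved, so args need no quoting.
    """
    resolved = shutil.which(program, path=search_path)
    if resolved is None:
        raise FileNotFoundError(f"{program!r} not found in {search_path!r}")
    return subprocess.Popen([resolved, *args], **popen_kwargs)
```

For example, popen_with_search_path('progtorun', [], '/dir1:/dir2') would look for progtorun in /dir1 and then /dir2, without touching the PATH of the Python process or of the child.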

About shell=True

Quoting the Popen documentation:

If shell is True, it is recommended to pass args as a string rather than as a sequence.

Note: Read the Security Considerations section before using shell=True.
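As a rough illustration of why that section matters: if a string that contains shell metacharacters ends up in a shell=True command, the shell interprets them. One common mitigation on POSIX is shlex.quote, sketched below with a made-up value and a hypothetical helper name:

```python
import shlex
import subprocess


def echo_quoted(value):
    """Pass an arbitrary string to echo through the shell, quoted with shlex.quote."""
    command = "echo " + shlex.quote(value)  # args as one string, as recommended
    return subprocess.run(
        command, shell=True, capture_output=True, text=True
    ).stdout


# Without quoting, the part after ';' would run as a second command
print(echo_quoted("report; echo injected"))
```

Passing a list with shell=False avoids the problem altogether, since no shell parses the arguments.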

Unexpected observations

The following behavior was observed:

  • This call raises FileNotFoundError, as expected:

    subprocess.call(['sh'], shell=False, env=dict(PATH=''))
    
  • This call finds sh, which is unexpected:

    subprocess.call(['sh'], shell=False, env=dict(FOO=''))
    

    Typing echo $PATH inside the shell that this opens reveals that the PATH value is not empty, and also different from the value of PATH in the environment of Python. So it seems that PATH was indeed not inherited from Python (as expected in the presence of env != None), but still, the PATH is nonempty. It is unknown why this is the case.

  • This call raises FileNotFoundError, as expected:

    subprocess.call(['tree'], shell=False, env=dict(FOO=''))
    
  • This finds tree, as expected:

    subprocess.call(['tree'], shell=False, env=None)
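One plausible explanation for the second observation, offered here as an assumption rather than a confirmed diagnosis: when the mapping passed as env has no PATH key, os.get_exec_path falls back to the default search path os.defpath, which on POSIX typically contains the directory where sh lives. The shell may then set a default PATH of its own at startup. The fallback can be inspected like this:

```python
import os

# With no PATH key in the mapping, the default search path is used
fallback_dirs = os.get_exec_path({"FOO": ""})
print(fallback_dirs)
print(fallback_dirs == os.defpath.split(os.pathsep))

# With an explicit PATH key, that value is used instead
print(os.get_exec_path({"PATH": "/dir1" + os.pathsep + "/dir2"}))
```

Under that assumption, a program installed outside the default directories (as tree may be) would not be found with env=dict(FOO=''), which matches the third observation.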