How to split the output of a command by columns using Bash?

I want to do this:

  1. run a command
  2. capture the output
  3. select a line
  4. select a column of that line

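Just as an example, let's say I want to get the command name from a $PID (please note this is just an example; I'm not suggesting this is the easiest way to get a command name from a process id. My real problem is with another command whose output format I can't control).

If I run ps I get: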


PID TTY          TIME CMD
11383 pts/1    00:00:00 bash
11771 pts/1    00:00:00 ps

Now I do ps | egrep 11383 and get

11383 pts/1    00:00:00 bash

Next step: ps | egrep 11383 | cut -d" " -f 4. Output is:

<absolutely nothing/>

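The problem is that cut splits the output on single spaces, and since ps adds extra spaces between the 2nd and 3rd columns to keep some resemblance of a table, cut picks an empty string. Of course, I could use cut to select the 7th field instead of the 4th, but how can I know which one to pick, especially when the output is variable and unknown beforehand?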


Getting the correct line (example for line no. 6) is done with head and tail and the correct word (word no. 4) can be captured with awk:

command|head -n 6|tail -n 1|awk '{print $4}'
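A possible variation, as a sketch only: awk can also pick the line itself through its built-in NR line counter, which makes the head/tail pair unnecessary. The text in sample is made up to stand in for real command output, and the variable names are just for illustration.

```shell
# Made-up sample standing in for the output of some command.
sample='PID TTY          TIME CMD
11383 pts/1    00:00:00 bash
11771 pts/1    00:00:00 ps'

# NR is awk's current line number: print field 4 of line 2, then stop reading.
result=$(printf '%s\n' "$sample" | awk 'NR == 2 { print $4; exit }')
echo "$result"
```

With a real command, the same idea would read command | awk 'NR == 6 { print $4; exit }'.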

I think the simplest way is to use awk. Example:

$ echo "11383 pts/1    00:00:00 bash" | awk '{ print $4; }'
bash

One easy way is to add a pass of tr to squeeze any repeated field separators out:

$ ps | egrep 11383 | tr -s ' ' | cut -d ' ' -f 4

Try

ps |
while read -r first second third fourth etc ; do
if [[ $first == '11383' ]]
then
echo "got: $fourth"
fi
done

Instead of doing all these greps and stuff, I'd advise you to use ps capabilities of changing output format.

ps -o cmd= -p 12345

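You get the command line of the process with the specified PID and nothing else.

The -o and -p options are specified by POSIX. For a fully portable call, use the POSIX field names comm (command name) or args (full command line), since cmd is an alias provided by the procps version of ps on Linux.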

Using array variables

set $(ps | egrep "^11383 "); echo $4

or

A=( $(ps | egrep "^11383 ") ) ; echo ${A[3]}
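One caveat with both forms is that an unquoted $(...) is also subject to filename expansion, so a field like * would be replaced by file names. A minimal sketch of a Bash alternative, assuming the line has already been captured in a variable (the sample line and the names line and fields are made up for illustration):

```shell
# Made-up sample line standing in for: ps | egrep "^11383 "
line='11383 pts/1    00:00:00 bash'

# read -a splits on IFS (whitespace by default) into a Bash array;
# unlike an unquoted $(...), it does not expand glob characters such as *.
read -r -a fields <<< "$line"

echo "${fields[3]}"      # index 3 is the 4th column
echo "${#fields[@]}"     # number of columns found
```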

Please note that the tr -s ' ' option will not remove any single leading spaces. If your column is right-aligned (as with ps pid)...

$ ps h -o pid,user -C ssh,sshd | tr -s " "
 1543 root
19645 root
19731 root

Then cutting will result in a blank line for some of those fields if it is the first column:

$ <previous command> | cut -d ' ' -f1


19645
19731

Unless you precede it with a space, obviously

$ <command> | sed -e "s/.*/ &/" | tr -s " "
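As an alternative sketch, the leading spaces can be stripped instead of padded, which keeps the wanted column at field 1. The right-aligned sample below is made up and only stands in for real ps output.

```shell
# Made-up right-aligned sample, as ps might print for the pid column.
sample='  1543 root
 19645 root
 19731 root'

# Remove leading spaces, squeeze the remaining runs, then cut the first field.
pids=$(printf '%s\n' "$sample" | sed -e 's/^ *//' | tr -s ' ' | cut -d ' ' -f 1)
echo "$pids"
```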

Now, for this particular case of pid numbers (not names), there is a command called pgrep:

$ pgrep ssh


Shell functions

However, in general it is actually still possible to use shell functions in a concise manner, because there is a neat thing about the read command:

$ <command> | while read a b; do echo $a; done

The first parameter to read, a, selects the first column, and if there is more, everything else will be put in b. As a result, you never need more variables than the number of your column +1.

So,

while read a b c d; do echo $c; done

will then output the 3rd column. As indicated in my comment...

A piped read will be executed in an environment that does not pass variables to the calling script.

out=$(ps whatever | { read a b c d; echo $c; })


arr=($(ps whatever | { read a b c d; echo $c $b; }))
echo ${arr[1]}     # will output the value of $b (the 2nd column)
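Another way around the subshell limitation, sketched here for Bash only, is process substitution: read then runs in the current shell, so its variables stay set. fake_ps is a hypothetical stand-in for the real command, and the variable names are made up.

```shell
# Hypothetical stand-in for a real command such as ps.
fake_ps() {
    printf '%s\n' '11383 pts/1    00:00:00 bash'
}

# Process substitution feeds read without a pipeline, so read runs in the
# current shell and pid, tty, cputime and cmd remain set afterwards.
read -r pid tty cputime cmd < <(fake_ps)

echo "$cmd"
```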


The Array Solution

So we then end up with the answer by @frayser, which is to rely on the shell variable IFS (by default space, tab and newline) to split the string into an array. That needs Bash-style arrays though; Dash and Ash do not support them. I have had a really hard time splitting a string into components on a Busybox system. It is easy enough to get a single component (e.g. using awk) and then to repeat that for every parameter you need, but then you end up repeatedly calling awk on the same line, or repeatedly using a read block with echo on the same line, which is neither efficient nor pretty. So you end up splitting with ${name%% *} and so on. It makes you yearn for some Python skills, because shell scripting is not much fun anymore when half or more of the features you are accustomed to are gone. But you can assume that even Python would not be installed on such a system, and it wasn't ;-).
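For shells without arrays, a rough sketch of the two tricks mentioned here might look as follows: parameter expansion for the first word, and set -- to split a line into positional parameters in one go. The sample line and variable names are made up; the code sticks to POSIX shell features, so it should also run under dash or Busybox ash.

```shell
# Made-up sample line; the code below avoids Bash arrays on purpose.
line='11383 pts/1    00:00:00 bash'

# ${line%% *} removes the longest suffix starting at a space: the first word.
first=${line%% *}

# Unquoted $line is split on IFS; set -- stores the words as $1, $2, ...
# set -f turns off glob expansion while splitting, set +f restores it.
set -f
set -- $line
set +f
fourth=$4

echo "$first $fourth"
```

Note that set -- replaces the script's own positional parameters, so save them first if they are still needed.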

Your command

ps | egrep 11383 | cut -d" " -f 4

misses a tr -s to squeeze spaces, as unwind explains in his answer.

However, you may want to use awk instead, since it handles all of these steps in a single command:

ps | awk '/11383/ {print $4}'

This prints the 4th column of every line containing 11383. If you want to match 11383 only when it appears at the beginning of the line, use ps | awk '/^11383/ {print $4}'.
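Note that /^11383/ would still match a longer PID such as 113830. A minimal sketch of a stricter variant compares the whole first field instead; the sample text and variable names are made up for illustration.

```shell
# Made-up sample standing in for ps output; 11383 is also a prefix of 113830.
sample='PID TTY          TIME CMD
11383 pts/1    00:00:00 bash
113830 pts/2    00:00:00 vim'

pid=11383
# -v passes the shell value into awk; $1 == pid compares the entire first field.
cmd_name=$(printf '%s\n' "$sample" | awk -v pid="$pid" '$1 == pid { print $4 }')
echo "$cmd_name"
```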

Similar to brianegge's awk solution, here is the Perl equivalent:

ps | egrep 11383 | perl -lane 'print $F[3]'

-a enables autosplit mode, which populates the @F array with the column data.
Use -F, (the -F option followed by a comma) if your data is comma-delimited rather than space-delimited.

Field 3 is printed because Perl starts counting from 0 rather than 1.

Bash's set will split all of the output into positional parameters.

For instance, after running set $(free -h), echo $7 will show "Mem:".