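Concise and portable "join" on the Unix command line

How can I join multiple lines into one line, with a separator where the newline characters were, while avoiding a trailing separator and, optionally, ignoring empty lines?

For example, consider a text file, foo.txt, with three lines: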

foo
bar
baz

Desired output:

foo,bar,baz

The command I'm using now:

tr '\n' ',' <foo.txt |sed 's/,$//g'

Ideally it would be something like this:

cat foo.txt |join ,

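What is:

  1. The most portable, concise, and readable way?
  2. The most concise way using non-standard Unix tools?

Of course I could write something myself, or just use an alias, but I'm interested to know what the options are.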


Perhaps a little surprisingly, paste is a good way to do this:

paste -s -d"," foo.txt

This won't deal with the empty lines you mentioned. For that, pipe your text through grep, first:

grep -v '^$' | paste -s -d"," -
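
As a quick check of the pipeline above, here is a minimal sketch that builds a throwaway sample file and joins it. The temporary file and its contents (including the blank line) are made up for illustration.

```shell
# Hypothetical sample input: three words with one blank line in between.
tmpfile=$(mktemp)
printf 'foo\n\nbar\nbaz\n' > "$tmpfile"

# Drop empty lines, then join what is left with commas.
joined=$(grep -v '^$' "$tmpfile" | paste -s -d, -)
printf '%s\n' "$joined"

rm -f "$tmpfile"
```

Because paste -s only replaces the newlines between lines, no trailing separator is produced.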

Just for fun, here's an all-builtins solution (bash):

IFS=$'\n' read -r -d '' -a data < foo.txt ; ( IFS=, ; echo "${data[*]}" ; )

You can use printf instead of echo if the trailing newline is a problem.

This works by setting IFS, the set of delimiters that read splits on, to just a newline rather than all whitespace. It then tells read not to stop until it reaches a NUL byte (instead of the newline it normally stops at) and to store each item it reads in the array data (-a). Then, in a subshell so as not to clobber the IFS of the interactive shell, we set IFS to , and expand the array with *, which joins the array items with the first character of IFS.

This sed one-liner should work:

sed -e :a -e 'N;s/\n/,/;ba' file

Test:

[jaypal:~/Temp] cat file
foo
bar
baz


[jaypal:~/Temp] sed -e :a -e 'N;s/\n/,/;ba' file
foo,bar,baz

To handle empty lines, remove them first and pipe the result to the one-liner above:

sed -e '/^$/d' file | sed -e :a -e 'N;s/\n/,/;ba'
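
Some sed implementations quit without printing anything when N runs on the last line of input, so a commonly used guard is $!N, which only appends the next line when one exists. The following is a sketch of that variation, not a replacement for the answer above; the function name join_lines and the sample input are made up for illustration.

```shell
# Join the non-empty lines of standard input with commas.
# '$!N' appends the next line only when the current line is not the last one;
# 'ta' loops back to label 'a' only if the substitution replaced a newline.
join_lines() {
    sed -e '/^$/d' | sed -e :a -e '$!N;s/\n/,/;ta'
}

# Made-up sample input with a blank line in the middle.
result=$(printf 'foo\n\nbar\nbaz\n' | join_lines)
printf '%s\n' "$result"
```

On the last line the substitution finds no newline, so the loop ends and the joined line is printed once.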

I needed to accomplish something similar, printing a comma-separated list of fields from a file, and was happy with piping STDOUT to xargs and ruby, like so:

cat data.txt | cut -f 16 -d ' ' | grep -oE "[0-9]+" | xargs ruby -e "puts ARGV.join(', ')"

Perl:

cat data.txt | perl -pe 'if(!eof){chomp;$_.=","}'

or yet shorter and faster, surprisingly:

cat data.txt | perl -pe 'if(!eof){s/\n/,/}'

or, if you want:

cat data.txt | perl -pe 's/\n/,/ unless eof'

A simple way to join the lines with spaces in place using ex (blank lines are ignored as well):

ex +%j -cwq foo.txt

If you want to print the results to the standard output, try:

ex +%j +%p -scq! foo.txt

To join lines without spaces, use +%j! instead of +%j.

Using a different delimiter is a bit trickier:

ex +"g/^$/d" +"%s/\n/_/e" +%p -scq! foo.txt

Here g/^$/d (or v/\S/d) removes blank lines, and s/\n/_/ is a substitution that works much like the one in sed, but applied to all lines (%). Once the substitution is done, %p prints the buffer. Finally, -cq! executes the vi command q!, which quits without saving (-s silences the output).

Please note that ex is equivalent to vi -e.

This method is quite portable, as most Linux/Unix systems ship with ex/vi by default. It is also more compatible than sed for in-place editing: sed's in-place option (-i) is a non-standard extension, and the utility itself is stream-oriented, so it is less portable for this purpose.

How about using xargs?

For your case:

$ cat foo.txt | sed '$!s/$/,/' | xargs

Be careful about the limit on the length of the command line that xargs builds: a very long input file cannot be handled this way.

I had a log file where some data was broken into multiple lines. When this occurred, the last character of the first line was the semi-colon (;). I joined these lines by using the following commands:

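# Spaces are turned into "|" so that each input line stays a single word in the loop.
for LINE in $(cat "$FILE" | tr -s " " "|")
do
  if echo "$LINE" | grep -q ';$'
  then
    # The line ends with a semicolon: write it without a newline so the next line continues it.
    printf '%s' "$LINE" | tr -s "|" " " >> "$MYFILE"
  else
    echo "$LINE" | tr -s "|" " " >> "$MYFILE"
  fi
done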

The result is a file where lines that were split in the log file were one line in my new file.

My answer is:

awk '{printf "%s%s", (NR>1 ? "," : ""), $0} END {print ""}' foo.txt

printf is enough. We don't need -F"\n" to change the field separator.
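
A sketch of the same printf idea that also skips empty lines and never emits a stray separator: the separator is kept in a variable that is empty until the first line has been printed. The variable name sep and the sample input are made up for illustration.

```shell
# Print each non-empty line preceded by a separator; the separator is empty
# before the first printed line and "," afterwards. END adds the final newline.
result=$(printf 'foo\n\nbar\nbaz\n' |
    awk '$0 != "" { printf "%s%s", sep, $0; sep = "," } END { print "" }')
printf '%s\n' "$result"
```

Testing `$0 != ""` rather than NF keeps lines that contain only whitespace; which behaviour is wanted depends on the data.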

POSIX shell:

( set -- $(cat foo.txt) ; IFS=+ ; printf '%s\n' "$*" )
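
The one-liner above splits on any whitespace and is subject to filename expansion, so a line such as `a b` or `*` would not survive intact. A more defensive sketch, still using only POSIX shell features, might look like this. The function name join_by and the sample data are hypothetical.

```shell
# join_by SEP FILE: join the non-empty lines of FILE with the single character SEP.
# The body runs in a subshell, so the IFS and 'set -f' changes do not leak out.
join_by() (
    sep=$1
    file=$2
    set -f                 # disable globbing so a line like "*" stays literal
    IFS='
'                          # split the file contents on newlines only
    set -- $(cat "$file")  # each non-empty line becomes one positional parameter
    IFS=$sep               # "$*" joins parameters with the first character of IFS
    printf '%s\n' "$*"
)

# Made-up sample file: a line with a space, a blank line, and a glob character.
tmpfile=$(mktemp)
printf 'foo bar\n\n*\nbaz\n' > "$tmpfile"
result=$(join_by , "$tmpfile")
printf '%s\n' "$result"
rm -f "$tmpfile"
```

Because a newline counts as IFS whitespace, consecutive newlines collapse during splitting, which is what drops the empty lines.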