Unix - counting columns in a file

Given a file (stores.dat) containing data like this:

sid|storeNo|latitude|longitude
2|1|-28.03720000|153.42921670
9|2|-33.85090000|151.03274200

What is the command to output the number of column names?

For example, in the case above it should be 4 (the number of pipe characters in the first line, plus 1).

I was thinking of:

awk '{ FS = "|" } ; { print NF}' stores.dat

But it returns a count for every line, not just the first, and for the first line it prints 1 instead of 4.


Unless the data itself contains spaces, you should be able to use | wc -w on the first line.

wc is "word count"; with -w it simply counts the words in its input. If you send it only one line, it will tell you the number of columns.
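A minimal sketch of that idea, translating the pipes to spaces first (wc -w splits on whitespace, so this assumes the field values themselves contain no spaces):

head -1 stores.dat | tr '|' ' ' | wc -w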

This is a workaround (for me: I don't use awk very often):

Display the first row of the file containing the data, replace all pipes with newlines and then count the lines:

$ head -1 stores.dat | tr '|' '\n' | wc -l

awk -F'|' '{print NF; exit}' stores.dat

The -F'|' option sets the field separator before the first line is split (assigning FS inside the action, as in the question's attempt, only takes effect from the next record onward, which is why the first line printed 1), and exit quits right after printing the first line's field count.

If you have python installed you could try:

python -c 'import sys;f=open(sys.argv[1]);print(len(f.readline().split("|")))' \
stores.dat

This is usually what I use for counting the number of fields:

head -n 1 file.name | awk -F'|' '{print NF; exit}'

You could try

awk '{print NF}' FILE

(This splits on whitespace by default and prints a count for every line; add -F'|' for the pipe-delimited data in the question.)

Perl solution similar to Mat's awk solution:

perl -F'\|' -lane 'print $#F+1; exit' stores.dat

I've tested this on a file with 1000000 columns.
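For reference, a quick way to build such a wide test file (a sketch assuming GNU coreutils; wide.dat is just a hypothetical name):

seq 1000000 | paste -sd'|' - > wide.dat
perl -F'\|' -lane 'print $#F+1; exit' wide.dat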


If the field separator is whitespace (one or more spaces or tabs) instead of a pipe:

perl -lane 'print $#F+1; exit' stores.dat

Based on Cat Kerr's response, this command works on Solaris:

awk '{print NF; exit}' stores.dat

(As above, add -F'|' to count pipe-delimited fields rather than whitespace-separated words.)

You may try:

head -1 stores.dat | grep -o '|' | wc -l

Note that this counts the separators, so add 1 to get the number of columns.
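A sketch that folds the +1 into a single command line, using shell arithmetic expansion:

echo $(( $(head -1 stores.dat | grep -o '|' | wc -l) + 1 ))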

Select any row in the file (in the example below, the 2nd row) and count the number of columns, where the delimiter is a space:

sed -n 2p text_file.dat | tr ' ' '\n' | wc -l
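For the pipe-delimited stores.dat from the question, the same pattern would be:

sed -n 2p stores.dat | tr '|' '\n' | wc -l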

Proper pure way

Simply counting columns in a file

Under bash, you could simply:

IFS=\| read -ra headline <stores.dat
echo ${#headline[@]}
4

This is a lot quicker, since it spawns no subprocesses, and it is reusable, since the headline array holds the full header line. For example:

printf " - %s\n" "${headline[@]}"
- sid
- storeNo
- latitude
- longitude

Note: this syntax correctly handles spaces and other special characters in column names.

Alternative: robustly checking the maximum number of columns across all rows

What if some rows contain extra columns?

This command searches for the longest line, counting only the separators:

tr -dc $'\n|' <stores.dat |wc -L
3

If there are at most 3 separators, then there are 4 fields... Or, to count fields directly: each separator (|) has a field before it and one after it; the sed expression reduces each field to a single letter, so wc -L counts fields instead of separators:

tr -dc $'\n|' <stores.dat|sed 's/./b&a/g;s/ab/a/g;s/[^ab]//g'|wc -L
4
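To see the difference on ragged input, here is a sketch with a hypothetical file whose second row has an extra column:

printf '%s\n' 'a|b|c' 'a|b|c|d' > ragged.dat
tr -dc $'\n|' <ragged.dat | sed 's/./b&a/g;s/ab/a/g;s/[^ab]//g' | wc -L
4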

Counting columns in a CSV file

Under bash, you may use the csv loadable builtin:

enable -f /usr/lib/bash/csv csv
IFS= read -r line <file.csv
csv -a fields <<<"$line"
echo ${#fields[@]}
4
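If the loadable builtin is not installed at that path (the location varies by distribution; /usr/lib/bash/csv is the Debian/Ubuntu location), a plain-bash fallback for simple, unquoted CSV is the same read trick shown above:

IFS=, read -ra fields <file.csv
echo ${#fields[@]}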

For more info, see How to parse a CSV file in Bash?