如何“ grep”出文件的特定行范围

通常,我会使用 grep -n来查找我要查找的文件,输出结果如下:

1234: whatev 1
5555: whatev 2
6643: whatev 3

如果我想提取1234和5555之间的行,有没有工具可以做到这一点?对于静态文件,我有一个脚本,做 wc -l的文件,然后做数学分割它与尾部和头,但这不是工作得那么好的日志文件,不断被写入。

209367 次浏览

Try using sed as mentioned on http://linuxcommando.blogspot.com/2008/03/using-sed-to-extract-lines-in-text-file.html. For example use

sed '2,4!d' somefile.txt

to print from the second line to the fourth line of somefile.txt. (And don't forget to check http://www.grymoire.com/Unix/Sed.html, sed is a wonderful tool.)

Put this in a file and make it executable:

#!/usr/bin/env bash
start=`grep -n $1 < $3 | head -n1 | cut -d: -f1; exit ${PIPESTATUS[0]}`
if [ ${PIPESTATUS[0]} -ne 0 ]; then
echo "couldn't find start pattern!" 1>&2
exit 1
fi
stop=`tail -n +$start < $3 | grep -n $2 | head -n1 | cut -d: -f1; exit ${PIPESTATUS[1]}`
if [ ${PIPESTATUS[0]} -ne 0 ]; then
echo "couldn't find end pattern!" 1>&2
exit 1
fi


stop=$(( $stop + $start - 1))


sed "$start,$stop!d" < $3

Execute the file with arguments (NOTE that the script does not handle spaces in arguments!):

  1. Starting grep pattern
  2. Stopping grep pattern
  3. File path

To use with your example, use arguments: 1234 5555 myfile.txt

Includes lines with starting and stopping pattern.

The following command will do what you asked for "extract the lines between 1234 and 5555" in someFile.

sed -n '1234,5555p' someFile

If you want lines instead of line ranges, you can do it with perl: eg. if you want to get line 1, 3 and 5 from a file, say /etc/passwd:

perl -e 'while(<>){if(++$l~~[1,3,5]){print}}' < /etc/passwd

If I understand correctly, you want to find a pattern between two line numbers. The awk one-liner could be

awk '/whatev/ && NR >= 1234 && NR <= 5555' file

You don't need to run grep followed by sed.

Perl one-liner:

perl -ne 'if (/whatev/ && $. >= 1234 && $. <= 5555) {print}' file

Line numbers are OK if you can guarantee the position of what you want. Over the years, my favorite flavor of this has been something like this:

sed "/First Line of Text/,/Last Line of Text/d" filename

which deletes all lines from the first matched line to the last match, including those lines.

Use sed -n with "p" instead of "d" to print those lines instead. Way more useful for me, as I usually don't know where those lines are.

If I want to then just extract the lines between 1234 and 5555, is there a tool to do that?

There is also ugrep, a GNU/BSD grep compatible tool but one that offers a -K option (or --range) with a range of line numbers to do just that:

ugrep -K1234,5555 -n '' somefile.log

You can use the usual GNU/BSD grep options and regex patterns (but it also offers a lot more such as -K.)