How can I get the line numbers of added and modified lines using git diff?

Assume I have a text file

alex
bob
matrix
will be removed
git repo

I have updated it to

alex
new line here
another new line
bob
matrix
git

Here, I have added lines (2, 3) and updated line (6).

How can I get this line number information using git diff or any other git command?

git diff --stat shows the per-file summary you see when committing, which I guess is the one you are referring to.

git diff --stat

To show exactly which line numbers have changed, you can use

git blame <file> | grep "Not Committed Yet"

The changed line number is the last number before the closing parenthesis on each line of the result. Not a clean solution though :(

Configure an external diff tool which will show you the line numbers. For example, this is what I have in my git global config:

diff.guitool=kdiff3
difftool.kdiff3.path=c:/Program Files (x86)/KDiff3/kdiff3.exe
difftool.kdiff3.cmd="c:/Program Files (x86)/KDiff3/kdiff3.exe" "$LOCAL" "$REMOTE"

See this answer for more details: https://stackoverflow.com/q/949242/526535

Not exactly what you were asking for, but git blame TEXTFILE may help.

Here's a bash function to calculate the resulting line numbers from a diff:

diff-lines() {
    local path=
    local line=
    local esc=$'\033'
    # -r keeps backslashes in the diff content intact
    while read -r; do
        if [[ $REPLY =~ ---\ (a/)?.* ]]; then
            continue
        elif [[ $REPLY =~ \+\+\+\ (b/)?([^[:blank:]$esc]+).* ]]; then
            # new-side path of the file being diffed
            path=${BASH_REMATCH[2]}
        elif [[ $REPLY =~ @@\ -[0-9]+(,[0-9]+)?\ \+([0-9]+)(,[0-9]+)?\ @@.* ]]; then
            # hunk header: take the starting line on the new side
            line=${BASH_REMATCH[2]}
        elif [[ $REPLY =~ ^($esc\[[0-9;]*m)*([\ +-]) ]]; then
            echo "$path:$line:$REPLY"
            # removed lines do not advance the new-side line number
            if [[ ${BASH_REMATCH[2]} != - ]]; then
                ((line++))
            fi
        fi
    done
}

It can produce output such as:

$ git diff | diff-lines
http-fetch.c:1: #include "cache.h"
http-fetch.c:2: #include "walker.h"
http-fetch.c:3:
http-fetch.c:4:-int cmd_http_fetch(int argc, const char **argv, const char *prefix)
http-fetch.c:4:+int main(int argc, const char **argv)
http-fetch.c:5: {
http-fetch.c:6:+       const char *prefix;
http-fetch.c:7:        struct walker *walker;
http-fetch.c:8:        int commits_on_stdin = 0;
http-fetch.c:9:        int commits;
http-fetch.c:19:        int get_verbosely = 0;
http-fetch.c:20:        int get_recover = 0;
http-fetch.c:21:
http-fetch.c:22:+       prefix = setup_git_directory();
http-fetch.c:23:+
http-fetch.c:24:        git_config(git_default_config, NULL);
http-fetch.c:25:
http-fetch.c:26:        while (arg < argc && argv[arg][0] == '-') {
fetch.h:1: #include "config.h"
fetch.h:2: #include "http.h"
fetch.h:3:
fetch.h:4:-int cmd_http_fetch(int argc, const char **argv, const char *prefix);
fetch.h:4:+int main(int argc, const char **argv);
fetch.h:5:
fetch.h:6: void start_fetch(const char* uri);
fetch.h:7: bool fetch_succeeded(int status_code);

from a diff like this:

$ git diff
diff --git a/builtin-http-fetch.c b/http-fetch.c
similarity index 95%
rename from builtin-http-fetch.c
rename to http-fetch.c
index f3e63d7..e8f44ba 100644
--- a/builtin-http-fetch.c
+++ b/http-fetch.c
@@ -1,8 +1,9 @@
 #include "cache.h"
 #include "walker.h"
 
-int cmd_http_fetch(int argc, const char **argv, const char *prefix)
+int main(int argc, const char **argv)
 {
+       const char *prefix;
        struct walker *walker;
        int commits_on_stdin = 0;
        int commits;
@@ -18,6 +19,8 @@ int cmd_http_fetch(int argc, const char **argv, const char *prefix)
        int get_verbosely = 0;
        int get_recover = 0;
 
+       prefix = setup_git_directory();
+
        git_config(git_default_config, NULL);
 
        while (arg < argc && argv[arg][0] == '-') {
diff --git a/fetch.h b/fetch.h
index 5fd3e65..d43e0ca 100644
--- a/fetch.h
+++ b/fetch.h
@@ -1,7 +1,7 @@
 #include "config.h"
 #include "http.h"
 
-int cmd_http_fetch(int argc, const char **argv, const char *prefix);
+int main(int argc, const char **argv);
 
 void start_fetch(const char* uri);
 bool fetch_succeeded(int status_code);

If you only want to show added/removed/modified lines, and not the surrounding context, you can pass -U0 to git diff:

$ git diff -U0 | diff-lines
http-fetch.c:4:-int cmd_http_fetch(int argc, const char **argv, const char *prefix)
http-fetch.c:4:+int main(int argc, const char **argv)
http-fetch.c:6:+       const char *prefix;
http-fetch.c:22:+       prefix = setup_git_directory();
http-fetch.c:23:+
fetch.h:4:-int cmd_http_fetch(int argc, const char **argv, const char *prefix);
fetch.h:4:+int main(int argc, const char **argv);

It's robust against ANSI color codes, so you can pass --color=always to git diff to get the usual color coding for added/removed lines.

The output can be easily grepped:

$ git diff -U0 | diff-lines | grep 'main'
http-fetch.c:4:+int main(int argc, const char **argv)
fetch.h:4:+int main(int argc, const char **argv);

In your case git diff -U0 would give:

$ git diff -U0 | diff-lines
test.txt:2:+new line here
test.txt:3:+another new line
test.txt:6:-will be removed
test.txt:6:-git repo
test.txt:6:+git

If you just want the line numbers, change the echo "$path:$line:$REPLY" to just echo "$line" and pipe the output through uniq.
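If bash is not an option, the same bookkeeping can be sketched in Python. This is only a rough port of the idea, not the function above: the names diff_lines and SAMPLE_DIFF are made up for illustration, the sample diff is hand-written to mirror the question's test.txt, and ANSI color codes are not handled.

```python
import re

# Hunk header, e.g. "@@ -4,2 +6 @@"; group 1 is the new-side start line.
HUNK = re.compile(r'^@@ -\d+(?:,\d+)? \+(\d+)(?:,\d+)? @@')


def diff_lines(diff_text):
    """Yield (path, new_line_number, text) for each body line of a unified diff."""
    path = None
    line_no = None  # None while we are in a file header, before the first hunk
    for raw in diff_text.splitlines():
        hunk = HUNK.match(raw)
        if hunk:
            line_no = int(hunk.group(1))
        elif raw.startswith('diff --git '):
            line_no = None
        elif line_no is None:
            if raw.startswith('+++ '):
                path = raw[4:]
                if path.startswith('b/'):
                    path = path[2:]
        elif raw[:1] in (' ', '+', '-'):
            yield path, line_no, raw
            # Removed lines do not advance the new-side line number.
            if raw[0] != '-':
                line_no += 1


# Hand-written sample (hypothetical) matching the question's file, as -U0 output.
SAMPLE_DIFF = """\
diff --git a/test.txt b/test.txt
--- a/test.txt
+++ b/test.txt
@@ -1,0 +2,2 @@
+new line here
+another new line
@@ -4,2 +6 @@
-will be removed
-git repo
+git
"""

if __name__ == "__main__":
    for path, number, text in diff_lines(SAMPLE_DIFF):
        print(f"{path}:{number}:{text}")
```

Tracking "before the first hunk" separately means a removed line whose content happens to start with "-- " is not mistaken for a file header.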

Here's a bash function I cobbled together:

# Usage: changed_lines <file>
changed_lines() {
    local f=$1 n prev_line=-1 range_start=-1
    echo "${f}:"
    for n in $(git --no-pager blame --line-porcelain "$f" |
        awk '/author Not Committed Yet/{if (a && a !~ /author Not Committed Yet/) print a} {a=$0}' |
        awk '{print $3}') ; do
        if (( prev_line > -1 )) ; then
            if (( n > prev_line + 1 )) ; then
                # gap found: flush the range that just ended
                if (( prev_line - range_start > 1 )) ; then
                    echo -n "$range_start-$prev_line,"
                elif (( prev_line > range_start )) ; then
                    echo -n "$range_start,$prev_line,"
                else
                    echo -n "$range_start,"
                fi
                range_start=$n
            fi
        else
            range_start=$n
        fi
        prev_line=$n
    done
    # nothing uncommitted in this file
    (( prev_line > -1 )) || return 0
    if (( range_start != prev_line )) ; then
        echo "$range_start-$prev_line"
    else
        echo "$range_start"
    fi
}

And it ends up looking like this:

views.py:
403,404,533-538,546-548,550-552,554-559,565-567,580-582
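The range-collapsing step on its own is easy to get subtly wrong in shell, so here is a minimal Python sketch of just that part. The name compress_ranges is made up for illustration, and unlike the output above it writes a two-line run as 403-404 rather than 403,404.

```python
def compress_ranges(numbers):
    """Collapse line numbers into a string such as '403-404,533-538'."""
    ranges = []
    for n in sorted(set(numbers)):
        if ranges and n == ranges[-1][1] + 1:
            # n continues the current run, so extend its end.
            ranges[-1][1] = n
        else:
            # n starts a new run.
            ranges.append([n, n])
    return ','.join(str(a) if a == b else f'{a}-{b}' for a, b in ranges)
```

Feeding it the line numbers extracted from git blame (or from any of the diff-based approaches here) gives the compact form.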

You can use git diff with the --shortstat option to show just the number of lines changed.

For the number of lines changed (in a file that's already in the repo) since your last commit:

git diff HEAD --shortstat

It'll output something similar to

1 file changed, 4 insertions(+)
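If you need those counts in a script, a small parser along these lines should do. It is a sketch only: parse_shortstat is a made-up name, and it assumes the English summary wording shown above, where git leaves out the insertions or deletions part when that count is zero.

```python
import re


def parse_shortstat(text):
    """Turn a 'git diff --shortstat' summary line into a dict of counts."""
    def grab(pattern):
        match = re.search(pattern, text)
        # A part that git omitted counts as zero.
        return int(match.group(1)) if match else 0

    return {
        'files': grab(r'(\d+) files? changed'),
        'insertions': grab(r'(\d+) insertions?\(\+\)'),
        'deletions': grab(r'(\d+) deletions?\(-\)'),
    }
```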

I had this same problem so I wrote a gawk script that changes the output of git diff to prepend the line number for each line. I find it useful sometimes when I need to diff working tree, although it's not limited to that. Maybe it is useful to someone here?

$ git diff HEAD~1 |showlinenum.awk
diff --git a/doc.txt b/doc.txt
index fae6176..6ca8c26 100644
--- a/doc.txt
+++ b/doc.txt
@@ -1,3 +1,3 @@
1: red
2: blue
:-green
3:+yellow

You can download it from here:
https://github.com/jay/showlinenum

This is probably a fairly accurate count of changed lines:

git diff --word-diff <commit> | grep -E '\[-|\{\+' | wc -l

Also, here is a solution for line numbers in your diff: https://github.com/jay/showlinenum

I use the --unified=0 option of git diff.

For example, git diff --unified=0 commit1 commit2 outputs the diff:

[image: screenshot of the git diff --unified=0 output]

Because of the --unified=0 option, the diff output shows 0 context lines; in other words, it shows exactly the changed lines.

Now you can find the lines that start with '@@' and parse them based on the pattern:

@@ -startline1,count1 +startline2,count2 @@

A count of 1 is omitted from the header, so @@ -4,2 +6 @@ means count2 is 1.

Back to the above example: for the file WildcardBinding.java, 0 lines are deleted starting at line 910 of the old version, and 4 lines are added starting at line 911 of the new version.
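Parsing that header can be sketched in a few lines of Python. The names Hunk, parse_hunk_header and added_line_numbers are made up for illustration, and added_line_numbers is only meaningful for --unified=0 output, where a hunk's new side contains nothing but added lines.

```python
import re
from typing import NamedTuple, Optional


class Hunk(NamedTuple):
    old_start: int
    old_count: int
    new_start: int
    new_count: int


_HUNK_RE = re.compile(r'^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@')


def parse_hunk_header(line: str) -> Optional[Hunk]:
    """Parse '@@ -start,count +start,count @@'; return None for other lines."""
    match = _HUNK_RE.match(line)
    if match is None:
        return None
    old_start, old_count, new_start, new_count = match.groups()
    # An omitted count means exactly one line.
    return Hunk(
        int(old_start),
        1 if old_count is None else int(old_count),
        int(new_start),
        1 if new_count is None else int(new_count),
    )


def added_line_numbers(hunk: Hunk) -> list:
    """New-side line numbers covered by a hunk from a --unified=0 diff."""
    return list(range(hunk.new_start, hunk.new_start + hunk.new_count))
```

For a header of the form @@ -910,0 +911,4 @@, which is what the description above corresponds to, this yields lines 911 through 914.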

Line numbers of all uncommitted lines (added/modified):

git blame <file> | grep -n '^0\{8\} ' | cut -f1 -d:

Example output:

1
2
8
12
13
14
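The same filter is straightforward in Python if you would rather not depend on grep and cut. This is a sketch under assumptions: uncommitted_line_numbers and SAMPLE_BLAME are made-up names, and the sample text is hand-written in the default git blame layout, not captured from a real repository.

```python
import re
import subprocess

# Uncommitted lines are attributed to an all-zero commit hash.
_UNCOMMITTED = re.compile(r'^0{8,} ')


def uncommitted_line_numbers(blame_output: str) -> list:
    """Return 1-based numbers of the lines that git blame marks as not committed."""
    return [
        number
        for number, line in enumerate(blame_output.splitlines(), start=1)
        if _UNCOMMITTED.match(line)
    ]


def blame(path: str) -> str:
    """Run git blame on a file in the current repository and return its output."""
    result = subprocess.run(
        ['git', 'blame', '--', path],
        check=True, capture_output=True, text=True)
    return result.stdout


# Hand-written sample (hypothetical hashes, author and dates).
SAMPLE_BLAME = """\
a1b2c3d4 (Alice             2023-12-01 09:00:00 +0000 1) alex
00000000 (Not Committed Yet 2024-01-01 10:00:00 +0000 2) new line here
00000000 (Not Committed Yet 2024-01-01 10:00:00 +0000 3) another new line
a1b2c3d4 (Alice             2023-12-01 09:00:00 +0000 4) bob
a1b2c3d4 (Alice             2023-12-01 09:00:00 +0000 5) matrix
00000000 (Not Committed Yet 2024-01-01 10:00:00 +0000 6) git
"""
```

In a real repository you would call uncommitted_line_numbers(blame("test.txt")).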

I was looking for a way to output only the lines changed for each file using git diff. My idea was to feed this output to a linter for type checking. This is what helped me

Here's some Python copypasta to get the line numbers for modified / removed lines, in case you came across this question looking for that.

It should be fairly easy to modify it into something that gets the modified and added line numbers as well.

I've only tested on Windows, but it should be cross platform as well.

import re
import subprocess


def main(file1: str, file2: str):
    diff = get_git_diff(file1, file2)
    print(edited_lines(diff))


def edited_lines(git_diff: str):
    """Return line numbers (in the first file) of modified or removed lines."""
    ans = []
    diff_lines = git_diff.split("\n")
    found_first = False
    start = 0
    # adjust for added lines
    adjust = 0
    # how many lines since the start
    count = 0
    for line in diff_lines:
        if found_first:
            count += 1
            if line.startswith('-'):
                # minus one because count is 1 when we're looking at the start line
                ans.append(start + count - adjust - 1)
                continue

            if line.startswith('+'):
                adjust += 1
                continue

        # get the start line; a count of 1 is omitted from the hunk header,
        # and git may append the enclosing function name after the second @@
        match = re.match(r'@@ -(\d+)(?:,\d+)? \+\d+(?:,\d+)? @@', line)
        if match:
            start = int(match.group(1))
            count = 0
            adjust = 0
            found_first = True

    return ans


def get_git_diff(file1: str, file2: str):
    try:
        # pass the argument list directly; shell=True with a list only works on Windows
        diff_process = subprocess.run(
            ['git', 'diff', '--no-index', '-u', file1, file2],
            check=True, stdout=subprocess.PIPE)
        ans = diff_process.stdout
    # git may exit with 1 even though it worked
    except subprocess.CalledProcessError as e:
        if e.stdout and e.stderr is None:
            ans = e.stdout
        else:
            raise

    # remove carriage returns at the end of lines from Windows
    return ans.decode().replace('\r', '')


if __name__ == "__main__":
    main("file1.txt", "file2.txt")

Perhaps this; credit goes to Jakub Bochenski's answer to "Git diff with line numbers (Git log with line numbers)":

git diff --unified=0 | grep -Po '^\+\+\+ ./\K.*|^@@ -[0-9]+(,[0-9]+)? \+\K[0-9]+(,[0-9]+)?(?= @@)'