How do I get a list of all Subversion commit author usernames?

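I'm looking for an efficient way to get the list of unique commit authors for an SVN repository as a whole, or for a given resource path. I haven't found an SVN command specifically for this (and don't expect one), but I'm hoping there is a better way than what I have tried so far in Terminal (on OS X):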

svn log --quiet | grep "^r" | awk '{print $3}'


svn log --quiet --xml | grep author | sed -E "s:</?author>::g"

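Either of these gives me one author name per line, but both require filtering out a fair amount of extra information. Neither handles duplicates of the same author name, so for many commits by a few authors there is a lot of redundancy flowing over the wire. More often than not I just want to see the unique author usernames. (It might occasionally be handy to get the commit count for each author, but even then it would be better if the aggregated data were sent across instead.)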

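I generally work with client-only access, so svnadmin commands are less useful, but I might be able to ask the repository admin for a special favor if that is strictly necessary or much more efficient. The repositories I'm working with have tens of thousands of commits and many active users, and I don't want to inconvenience anyone.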

To filter out duplicates, take your output and pipe through: sort | uniq. Thus:

svn log --quiet | grep "^r" | awk '{print $3}' | sort | uniq

I would not be surprised if this is the way to do what you ask. Unix tools often expect the user to do fancy processing and analysis with other tools.

P.S. Come to think of it, you can merge the grep and awk...

svn log --quiet | awk '/^r/ {print $3}' | sort | uniq

P.P.S. Per Kevin Reid...

svn log --quiet | awk '/^r/ {print $3}' | sort -u

P.P.P.S. Per kan, use the vertical bars instead of spaces as field separators, so that names containing spaces are handled properly (the Python examples are updated as well)...

svn log --quiet | awk -F ' [|] ' '/^r/ {print $2}' | sort -u

For more efficiency, you could do a Perl one-liner. I don't know Perl that well, so I'd wind up doing it in Python:

#!/usr/bin/env python
import sys

authors = set()
for line in sys.stdin:
    # Revision header lines start with "r"; the author is the second "|" field.
    if line[0] == 'r':
        authors.add(line.split('|')[1].strip())
for author in sorted(authors):
    print(author)

Or, if you wanted counts:

#!/usr/bin/env python
from __future__ import print_function  # Python 2.6/2.7
import sys

authors = {}
for line in sys.stdin:
    # Skip separator rules and anything else that is not a revision header.
    if line[0] != 'r':
        continue
    author = line.split('|')[1].strip()
    authors.setdefault(author, 0)
    authors[author] += 1
for author in sorted(authors):
    print(author, authors[author])

Then you'd run:

svn log --quiet | ./authorfilter.py

I had to do this on Windows, so I used the Windows port of Super Sed (http://www.pement.org/sed/) and replaced the awk and grep commands:

svn log --quiet --xml | sed -n -e "s/<\/\?author>//g" -e "/[<>]/!p" | sort | sed "$!N; /^\(.*\)\n\1$/!P; D" > USERS.txt

This uses the Windows "sort" command, which might not be present on all machines.

svn log path-to-repo | grep '^r' | grep '|' | awk '{print $3}' | sort | uniq > committers.txt

This command has an additional grep '|' that eliminates false matches. Without it, commit-message lines that happen to start with 'r' are matched too, and words from commit messages end up in the output.
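The filter can be made stricter by matching the whole revision header instead of a leading 'r' plus a bar, since a message line that starts with 'r' and contains '|' would still pass both greps. A minimal sketch in Python, assuming the `rNNN | author | date | N lines` header layout that plain `svn log` prints; the name `authors_from_log` and the sample text are hypothetical:

```python
import re

# Matches a revision header such as "r42 | alice | 2013-12-05 ... | 1 line".
# Assumption: a header is "r", one or more digits, then " | author | ".
HEADER = re.compile(r"^r\d+ \| (.+?) \| ")


def authors_from_log(text):
    """Return the sorted unique authors found in plain `svn log` output."""
    authors = set()
    for line in text.splitlines():
        match = HEADER.match(line)
        if match:
            authors.add(match.group(1))
    return sorted(authors)


# Made-up sample: two message lines start with "r" and contain "|".
sample = """\
------------------------------------------------------------------------
r3 | alice | 2013-12-05 08:56:29 +0000 (Thu, 05 Dec 2013) | 1 line

refactor | see r2 for details
------------------------------------------------------------------------
r2 | bob smith | 2013-12-04 16:33:53 +0000 (Wed, 04 Dec 2013) | 1 line

r | this message line starts with r and contains a bar
"""
print(authors_from_log(sample))
```

Because the author is captured between the first two ` | ` separators, a name containing spaces stays in one piece.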

In PowerShell, set your location to the working copy and use this command.

svn.exe log --quiet |
? { $_ -notlike '-*' } |
% { ($_ -split ' \| ')[1] } |
Sort -Unique

The output format of svn.exe log --quiet looks like this:

r20209 | tinkywinky | 2013-12-05 08:56:29 +0000 (Thu, 05 Dec 2013)
------------------------------------------------------------------------
r20208 | dispy | 2013-12-04 16:33:53 +0000 (Wed, 04 Dec 2013)
------------------------------------------------------------------------
r20207 | lala | 2013-12-04 16:28:15 +0000 (Wed, 04 Dec 2013)
------------------------------------------------------------------------
r20206 | po | 2013-12-04 14:34:32 +0000 (Wed, 04 Dec 2013)
------------------------------------------------------------------------
r20205 | tinkywinky | 2013-12-04 14:07:54 +0000 (Wed, 04 Dec 2013)

Filter out the horizontal rules with ? { $_ -notlike '-*' }.

r20209 | tinkywinky | 2013-12-05 08:56:29 +0000 (Thu, 05 Dec 2013)
r20208 | dispy | 2013-12-04 16:33:53 +0000 (Wed, 04 Dec 2013)
r20207 | lala | 2013-12-04 16:28:15 +0000 (Wed, 04 Dec 2013)
r20206 | po | 2013-12-04 14:34:32 +0000 (Wed, 04 Dec 2013)
r20205 | tinkywinky | 2013-12-04 14:07:54 +0000 (Wed, 04 Dec 2013)

Split by ' \| ' to turn a record into an array.

$ 'r20209 | tinkywinky | 2013-12-05 08:56:29 +0000 (Thu, 05 Dec 2013)' -split ' \| '
r20209
tinkywinky
2013-12-05 08:56:29 +0000 (Thu, 05 Dec 2013)

The second element is the name.

Make an array of each line and select the second element with % { ($_ -split ' \| ')[1] }.

tinkywinky
dispy
lala
po
tinkywinky

Return unique occurrences with Sort -Unique. This sorts the output as a side effect.

dispy
lala
po
tinkywinky

A simpler alternative:

find . -name "*cpp" -exec svn log -q {} \;|grep -v "\-\-"|cut -d "|" -f 2|sort|uniq -c|sort -n
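The `uniq -c | sort -n` tail of that pipeline counts commits per author and orders them by count. A rough Python equivalent of just that counting step might look like the following; `commit_counts` is a hypothetical helper, and the sample lines are in the `svn log -q` format:

```python
from collections import Counter


def commit_counts(lines):
    """Count commits per author from `svn log -q` style lines."""
    counts = Counter()
    for line in lines:
        # Skip separator rules and anything that is not a revision header.
        if not line.startswith("r") or "|" not in line:
            continue
        counts[line.split("|")[1].strip()] += 1
    return counts


sample = [
    "------------------------------------------------------------------------",
    "r20209 | tinkywinky | 2013-12-05 08:56:29 +0000 (Thu, 05 Dec 2013)",
    "------------------------------------------------------------------------",
    "r20208 | dispy | 2013-12-04 16:33:53 +0000 (Wed, 04 Dec 2013)",
    "------------------------------------------------------------------------",
    "r20205 | tinkywinky | 2013-12-04 14:07:54 +0000 (Wed, 04 Dec 2013)",
]

# Ascending by count, like `sort -n`; ties are ordered by name.
for author, n in sorted(commit_counts(sample).items(), key=lambda kv: (kv[1], kv[0])):
    print(n, author)
```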

PowerShell has support for XML, which eliminates the need for parsing string output.

Here's a quick script I used on a Mac to get a unique list of users across multiple repositories.

#!/usr/bin/env pwsh

$repos = @(
    'Common/'
    'Database/'
    'Integration/'
    'Reporting/'
    'Tools/'
    'Web/'
    'Webservices/'
)

# Start with an empty array so that += always appends, even when a
# repository returns a single author.
$users = @()
foreach ($repo in $repos) {
    $url = "https://svn.example.com:8443/svn/$repo"
    $users += ([Xml](svn log $url --xml)).log.logentry.author | Sort-Object -Unique
}

$users | Sort-Object -Unique
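The same XML-based approach can be sketched in Python with the standard library's ElementTree, assuming the `<log>/<logentry>/<author>` layout that `svn log --xml` emits. The helper name `authors_from_xml` and the sample document are made up for illustration:

```python
import xml.etree.ElementTree as ET


def authors_from_xml(xml_text):
    """Return sorted unique author names from `svn log --xml` output."""
    root = ET.fromstring(xml_text)
    # Collect the text of every <author> element, skipping empty ones.
    return sorted({a.text for a in root.iter("author") if a.text})


# Hypothetical sample in the shape of `svn log --xml` output.
sample = """<log>
<logentry revision="3">
<author>alice</author>
<date>2013-12-05T08:56:29.000000Z</date>
<msg>third</msg>
</logentry>
<logentry revision="2">
<author>bob</author>
<date>2013-12-04T16:33:53.000000Z</date>
<msg>second</msg>
</logentry>
<logentry revision="1">
<author>alice</author>
<date>2013-12-04T14:07:54.000000Z</date>
<msg>first</msg>
</logentry>
</log>"""
print(authors_from_xml(sample))
```

In real use the XML text would come from running `svn log --xml` (for example through `subprocess.run`), which is left out here so the sketch runs without a repository.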

On a remote repository you can use:

svn log --quiet https://url/svn/project/ | grep "^r" | awk '{print $3}' | sort | uniq

A solution for Windows 10.

  1. Create a batch file printAllAuthor.bat:
@echo off
for /f "tokens=3" %%a in ('svn log --quiet ^|findstr /r "^r"') do echo %%a
@echo on
  2. Run the batch file and pipe its output through the sort command:
printAllAuthor.bat | sort /unique >author.txt

PS:

  • Step 2 must run the batch file with the correct path: either add its directory to %PATH% or use a path in the right OS format.
  • Step 2 can be put into a batch file as well, depending on your needs.