How to get the captured groups from Select-String?

I'm trying to extract text from a set of files on Windows using PowerShell (version 4):

PS > Select-String -AllMatches -Pattern <mypattern-with(capture)> -Path file.jsp | Format-Table

So far, so good. That gives a nice set of MatchInfo objects:

IgnoreCase                    LineNumber Line                          Filename                      Pattern                       Matches
----------                    ---------- ----                          --------                      -------                       -------
True                            30   ...                           file.jsp                      ...                           {...}

Next, I can see that the captures are in the Matches member, so I take them out:

PS > Select-String -AllMatches -Pattern <mypattern-with(capture)> -Path file.jsp | ForEach-Object -MemberName Matches | Format-Table

Which gives:

Groups        Success Captures                 Index     Length Value
------        ------- --------                 -----     ------ -----
{...}         True    {...}                    49        47     ...

Or, as a list with | Format-List:

Groups   : {matched text, captured group}
Success  : True
Captures : {matched text}
Index    : 39
Length   : 33
Value    : matched text

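Here's where I'm stuck: I can't figure out how to go further and obtain a list of the captured group elements.

I've tried adding another | ForEach-Object -MemberName Groups, but it seems to return the same output as above.

The closest I've got is | Select-Object -Property Groups, which does give me what I expect (a list of collections):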

Groups
------
{matched text, captured group}
{matched text, captured group}
...

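But then I'm unable to extract the captured group from each one. I tried | Select-Object -Index 1, but that only gives me one of those collections.


Update: a possible solution

It seems that by adding | ForEach-Object { $_.Groups.Groups[1].Value } I get what I was looking for, but I don't understand why, so I can't be sure I'll get the right result when extending this approach to the whole set of files.

Why does it work?

As a side note, | ForEach-Object { $_.Groups[1].Value } (i.e. without the second .Groups) gives the same result.

I'd like to add that, after further attempts, it seems the command can be shortened by removing the | Select-Object -Property Groups pipeline stage.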


Have a look at the following

$a = "http://192.168.3.114:8080/compierews/" | Select-String -Pattern '^http://(.*):8080/(.*)/$'

$a is now a MatchInfo object (check with $a.GetType()), and it contains a Matches property.

PS ps:\> $a.Matches
Groups   : {http://192.168.3.114:8080/compierews/, 192.168.3.114, compierews}
Success  : True
Captures : {http://192.168.3.114:8080/compierews/}
Index    : 0
Length   : 37
Value    : http://192.168.3.114:8080/compierews/

In the Groups member you'll find what you are looking for, so you can write:

"http://192.168.3.114:8080/compierews/" | Select-String -Pattern '^http://(.*):8080/(.*)/$'  | % {"IP is $($_.matches.groups[1]) and path is $($_.matches.groups[2])"}


IP is 192.168.3.114 and path is compierews
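
For the case in the question, where -AllMatches can return several matches per line, a minimal sketch along the same lines might look like this. The sample line and pattern are made up for illustration; the idea is just that each element of Matches is a Match object, and its Groups collection is indexed with 0 for the whole match and 1, 2, ... for the capture groups.

```powershell
# Hypothetical sample input with two matches on one line.
$line = 'id=12 name=foo; id=34 name=bar'

# -AllMatches makes Matches hold every match on the line, not only the first.
$info = $line | Select-String -Pattern 'id=(\d+) name=(\w+)' -AllMatches

# Walk each Match and index its Groups collection:
# Groups[0] is the whole match, Groups[1] and Groups[2] are the captures.
$pairs = foreach ($m in $info.Matches) {
    '{0}:{1}' -f $m.Groups[1].Value, $m.Groups[2].Value
}
$pairs
```

This prints 12:foo and 34:bar on separate lines. Reading .Value explicitly, rather than relying on string conversion of the Group object, keeps the result a plain string.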

This script will grab a regex's specified capture group from a file's content and output its matches to console.


$file is the file you want to load
$cg is the index of the capture group you want to grab
$regex is the regular expression pattern



Example file and its content to load:

C:\some\file.txt

This is the especially special text in the file.



Example Use: .\get_regex_capture.ps1 -file "C:\some\file.txt" -cg 1 -regex '\b(special\W\w+)'

Output: special text


get_regex_capture.ps1

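Param(
    [string]$file,   # path of the file to read
    [int]$cg,        # index of the capture group to output
    [string]$regex   # regular expression pattern
)
# -Raw (PowerShell 3+) already returns the whole file as one string, so no join is needed.
$file_content = Get-Content -Raw $file
# Emit the requested group from every match found in the content.
Select-String -InputObject $file_content -Pattern $regex -AllMatches | % { $_.Matches } | % { $_.Groups[$cg].Value }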

This worked for my situation.

Using the file: test.txt

// autogenerated by script
char VERSION[21] = "ABCDEFGHIJKLMNOPQRST";
char NUMBER[16] = "123456789012345";

Get the NUMBER and VERSION from the file.

PS C:\> Select-String -Path test.txt -Pattern 'VERSION\[\d+\]\s=\s\"(.*)\"' | %{$_.Matches.Groups[1].value}


ABCDEFGHIJKLMNOPQRST


PS C:\> Select-String -Path test.txt -Pattern 'NUMBER\[\d+\]\s=\s\"(.*)\"' | %{$_.Matches.Groups[1].value}


123456789012345


According to the PowerShell docs on Regular Expressions > Groups, Captures, and Substitutions:

When you use the -match operator, PowerShell creates an automatic variable named $Matches.

PS> "The last logged on user was CONTOSO\jsmith" -match "(.+was )(.+)"

The value returned from this expression is just $true or $false, but on a successful match PowerShell also populates the $Matches hashtable.

So if you output $Matches, you'll get all the capture groups:

PS> $Matches


Name     Value
----     -----
2        CONTOSO\jsmith
1        The last logged on user was
0        The last logged on user was CONTOSO\jsmith

And you can access each capture group individually with dot notation like this:

PS> "The last logged on user was CONTOSO\jsmith" -match "(.+was )(.+)"
PS> $Matches.2
CONTOSO\jsmith
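
As a small variation on the same idea, here is a sketch using named groups, which show up in $Matches under their names instead of numbers. The group names domain and user are chosen for illustration only. Keep in mind that -match on a single string stops at the first match, so this does not replace -AllMatches when a line can match several times.

```powershell
# Named groups: (?<name>...) becomes a key in the $Matches hashtable.
$text = 'The last logged on user was CONTOSO\jsmith'

if ($text -match 'was (?<domain>\w+)\\(?<user>\w+)') {
    $domain = $Matches['domain']   # index syntax
    $user   = $Matches.user        # dot notation works as well
    "$domain / $user"
}
```

This prints CONTOSO / jsmith.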


Late answer, but to loop multiple matches and groups I use:

$pattern = "Login:\s*([^\s]+)\s*Password:\s*([^\s]+)\s*"
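# Avoid the name $matches here: $Matches is the automatic variable set by the -match operator.
$allMatches = [regex]::Matches($input_string, $pattern)

foreach ($match in $allMatches)
{
    Write-Host $match.Groups[1].Value
    Write-Host $match.Groups[2].Value
}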