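Return first match of Ruby regex

I'm looking for a way to perform a regex match on a string in Ruby and have it short-circuit on the first match.

The string I'm processing is long, and from what it looks like, the standard way (the match method) would process the whole thing, collect each match, and return a MatchData object containing all matches.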

match = string.match(/regex/)[0].to_s


小开

Best answer

You could try String#[] (as in variableName[/regular expression/]).

This is an example output from IRB:

names = "erik kalle johan anders erik kalle johan anders"
# => "erik kalle johan anders erik kalle johan anders"
names[/kalle/]
# => "kalle"
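As a small sketch of the same idea, the lookup can be wrapped in a helper. The name first_match is made up for this illustration and is not part of Ruby. It relies only on String#[], which returns the first matching substring, or nil when nothing matches.

```ruby
# Hypothetical helper (not a Ruby built-in): returns the first substring
# of text that matches pattern, or nil when there is no match.
def first_match(text, pattern)
  text[pattern] # String#[] with a Regexp returns only the first match
end

names = "erik kalle johan anders erik kalle johan anders"

p first_match(names, /j\w+/)   # => "johan"
p first_match(names, /zlatan/) # => nil
```

The nil result is worth checking for before calling string methods on the return value.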

小开

If only the existence of a match matters, you can use

/regexp/ =~ "string"
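A short sketch of what that check returns: =~ gives the index of the first match, or nil. On Ruby 2.4 and later (an assumption about your Ruby version), match? answers the same yes/no question and returns a plain boolean. The helper name contains? is invented for this example.

```ruby
# =~ returns the index where the first match starts, or nil.
p(/lo/ =~ "hello")  # => 3
p(/xyz/ =~ "hello") # => nil

# Hypothetical helper: true/false only. Assumes Ruby 2.4+, where
# String#match? is available.
def contains?(text, pattern)
  text.match?(pattern)
end

p contains?("hello", /lo/)  # => true
p contains?("hello", /xyz/) # => false
```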

Either way, match returns only the first hit, while scan searches through the entire string. For example:

matchData = "string string".match(/string/)
matchData[0]    # => "string"
matchData[1]    # => nil - it's the first capture group not a second match
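To make the difference between match and scan concrete, here is a minimal sketch that runs both on the same input. The helper first_and_all is a name invented for this illustration. It assumes a pattern without capture groups, since scan returns arrays of groups otherwise.

```ruby
# Hypothetical helper: returns [first_hit, all_hits] for a pattern
# that has no capture groups.
def first_and_all(text, pattern)
  first = text.match(pattern) # MatchData for the first hit, or nil
  all = text.scan(pattern)    # Array with every non-overlapping hit
  [first && first[0], all]
end

first, all = first_and_all("string string", /string/)
p first # => "string"
p all   # => ["string", "string"]
```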

小开

A Regular Expression (regex) is nothing but a finite state machine (FSM).

An FSM attempts to answer the question "Is this state possible or not?"

It keeps attempting to make a pattern match until a match is found (success), or until all paths are explored and no match was found (failure).

On success, the question "Is this state possible or not?" has been answered with a "yes". Hence no further matching is necessary and the regex returns.
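One way to observe this from Ruby is to look at where the first match sits and what text follows it. The sketch below uses only MatchData#begin, MatchData#post_match and the optional start position of String#match. The helper name first_match_details is made up for this example.

```ruby
# Hypothetical helper: describes the first match and the text after it.
def first_match_details(text, pattern)
  m = text.match(pattern)
  return nil unless m
  { start: m.begin(0), matched: m[0], rest: m.post_match }
end

haystack = "aaa needle bbb needle ccc"
details = first_match_details(haystack, /needle/)

p details[:start]   # => 4
p details[:matched] # => "needle"
p details[:rest]    # => " bbb needle ccc"

# A later hit is only found when you ask again from a later position.
second = haystack.match(/needle/, 10)
p second.begin(0)   # => 15
```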

See this and this for more on this.

Further: here is an interesting example to demonstrate how regex works. Here, a regex is used to detect whether a given number is prime. This example is in Perl, but it can just as well be written in Ruby.
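The linked example is not reproduced here. As a rough Ruby sketch, assuming it is the commonly cited trick of writing the number in unary and matching it against a backreference pattern, it could look like this. The method name prime? is chosen for this illustration.

```ruby
# Sketch of the unary-number regex trick (assumed to be what the linked
# example does). The number n is written as n copies of "1". The pattern
# matches when n is 0 or 1, or when the string splits into two or more
# equal groups of at least two "1"s, i.e. when n is composite.
def prime?(n)
  unary = "1" * n
  unary !~ /\A1?\z|\A(11+?)\1+\z/
end

primes = (1..20).select { |n| prime?(n) }
p primes # => [2, 3, 5, 7, 11, 13, 17, 19]
```

This is a curiosity rather than a practical primality test, since the backtracking cost grows quickly with n.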

小开

You can use [] (which works like match):

"foo+account2@gmail.com"[/\+([^@]+)/, 1] # matches capture group 1, i.e. what is inside ()
# => "account2"
"foo+account2@gmail.com"[/\+([^@]+)/]    # matches capture group 0, i.e. the whole match
# => "+account2"
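The second argument to [] can also be the name of a named capture group. A minimal sketch, where the group name tag and the helper plus_tag are invented for this example:

```ruby
# Hypothetical helper: returns the part between "+" and "@", or nil
# when the address has no "+" section.
def plus_tag(email)
  email[/\+(?<tag>[^@]+)/, "tag"] # second argument selects the named group
end

p plus_tag("foo+account2@gmail.com") # => "account2"
p plus_tag("foo@gmail.com")          # => nil
```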

小开

I am not yet sure whether this feature is awesome or just totally crazy, but your regex can define local variables.

/\$(?<dollars>\d+)\.(?<cents>\d+)/ =~ "$3.67" #=> 0
dollars #=> "3"

(Taken from http://ruby-doc.org/core-2.1.1/Regexp.html).
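A slightly fuller sketch of the same feature follows. Note that, per the Regexp documentation, the local variables are only assigned when the regex literal is on the left-hand side of =~ and contains no interpolation. The method names price_in_cents and price_parts are made up for this illustration.

```ruby
# Hypothetical helper: parses "$3.67" into 367 using the local variables
# that the named groups define. Returns nil when the text does not match.
def price_in_cents(text)
  if /\$(?<dollars>\d+)\.(?<cents>\d+)/ =~ text
    dollars.to_i * 100 + cents.to_i
  end
end

p price_in_cents("$3.67") # => 367
p price_in_cents("free")  # => nil

# Hypothetical helper: the same data read from a MatchData object,
# without relying on implicitly created local variables.
def price_parts(text)
  m = text.match(/\$(?<dollars>\d+)\.(?<cents>\d+)/)
  m && [m[:dollars], m[:cents]]
end

p price_parts("$3.67") # => ["3", "67"]
```

Going through MatchData is a bit more verbose, but it also works when the pattern is stored in a variable.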