My regex is matching too much. How do I make it stop?

I have this gigantic ugly string:

J0000000: Transaction A0001401 started on 8/22/2008 9:49:29 AM
J0000010: Project name: E:\foo.pf
J0000011: Job name: MBiek Direct Mail Test
J0000020: Document 1 - Completed successfully

I'm trying to extract pieces from it using regex. In this case, I want to grab everything after Project Name up to the part where it says J0000011: (the 11 is going to be a different number every time).

Here's the regex I've been playing with:

Project name:\s+(.*)\s+J[0-9]{7}:

The problem is that it doesn't stop until it hits the J0000020: at the end.

How do I make the regex stop at the first occurrence of J[0-9]{7}?


Make .* non-greedy by adding a ? after it:

Project name:\s+(.*?)\s+J[0-9]{7}:

Using a non-greedy quantifier here is probably the best solution, not least because it is more efficient than the greedy alternative: a greedy match goes as far as it can (here, until the end of the text!) and then backtracks character by character until the rest of the pattern matches.
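
To see the difference side by side, here is a minimal Python sketch. It assumes the log lines have been joined into one string with single spaces (the question does not say how they are separated); the variable names are mine, not from the question.

```python
import re

# Assumption: the four log lines are joined by single spaces into one string.
text = (
    r"J0000000: Transaction A0001401 started on 8/22/2008 9:49:29 AM "
    r"J0000010: Project name: E:\foo.pf "
    r"J0000011: Job name: MBiek Direct Mail Test "
    r"J0000020: Document 1 - Completed successfully"
)

# Greedy: .* runs to the end of the text, then backtracks to the LAST J-number.
greedy = re.search(r"Project name:\s+(.*)\s+J[0-9]{7}:", text)

# Non-greedy: .*? grows one character at a time and stops at the FIRST J-number.
lazy = re.search(r"Project name:\s+(.*?)\s+J[0-9]{7}:", text)

print(greedy.group(1))  # E:\foo.pf J0000011: Job name: MBiek Direct Mail Test
print(lazy.group(1))    # E:\foo.pf
```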

However, consider using a negated character class instead:

Project name:\s+(\S*)\s+J[0-9]{7}:

\S means "everything except whitespace", which is exactly what you want.
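
A quick Python check of the \S* version, again assuming the log is one space-separated string. The second sample, with a space in the path, is a made-up input to show where this approach stops working.

```python
import re

pattern = r"Project name:\s+(\S*)\s+J[0-9]{7}:"

# Assumption: log lines joined by single spaces.
line = (
    r"J0000010: Project name: E:\foo.pf "
    r"J0000011: Job name: MBiek Direct Mail Test "
    r"J0000020: Document 1 - Completed successfully"
)
project = re.search(pattern, line).group(1)
print(project)  # E:\foo.pf

# Hypothetical input: \S* cannot cross the space inside the path, so no match.
with_space = re.search(pattern, r"Project name: E:\my files\foo.pf J0000011:")
print(with_space)  # None
```

So this is a good fit only as long as the project name never contains whitespace.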

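I would also recommend experimenting with regular expressions in "Expresso", a great (and free) utility for editing and testing regexes.

One of its upsides is that its UI exposes a lot of regex functionality that people new to regular expressions might not be familiar with, so they can pick up these concepts easily.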

For example, when building your regex using the UI, and choosing "*", you have the ability to check the checkbox "As few as possible" and see the resulting regex, as well as test its behavior, even if you were unfamiliar with non-greedy expressions before.

Expresso is available for download at their site: http://www.ultrapico.com/expresso.htm

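Quick download: http://www.ultrapico.com/expressodownload.htm

".*" is a greedy selector. You make it non-greedy by using ".*?". With the latter construct, each time the regex engine matches another character into the ".", it first tries to match whatever comes after the ".*?". This means that if, for instance, nothing comes after the ".*?", then it matches nothing.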

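The "matches nothing" case is easy to check for yourself. A small Python sketch (the sample string "abcabc" is just something I picked for illustration):

```python
import re

# Nothing follows .*?, so the lazy quantifier is satisfied by the empty string.
nothing_after = re.match(r"(.*?)", "abcabc").group(1)

# With a token after it, .*? expands only until that token can match.
lazy_up_to_c = re.match(r"(.*?)c", "abcabc").group(1)

# The greedy form takes as much as possible and backtracks to the last "c".
greedy_up_to_c = re.match(r"(.*)c", "abcabc").group(1)

print(repr(nothing_after))   # ''
print(repr(lazy_up_to_c))    # 'ab'
print(repr(greedy_up_to_c))  # 'abcab'
```
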
Here's what I used. s contains your original string. This code is .NET specific, but most flavors of regex will have something similar.

string m = Regex.Match(s, @"Project name: (?<name>.*?) J\d+").Groups["name"].Value;
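
For comparison, a rough Python equivalent. Python's re module spells named groups (?P<name>...) rather than (?<name>...). The sample value of s is my assumption about the input (fields separated by single spaces), and the fallback to an empty string is only there to mimic what .Groups["name"].Value gives you in .NET when nothing matches.

```python
import re

# Assumption: s holds the log text with fields separated by single spaces.
s = r"J0000010: Project name: E:\foo.pf J0000011: Job name: MBiek Direct Mail Test"

match = re.search(r"Project name: (?P<name>.*?) J\d+", s)

# Fall back to "" when there is no match, like an unsuccessful .NET group.
name = match.group("name") if match else ""
print(name)  # E:\foo.pf
```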

(Project name:\s+[A-Z]:(?:\\\w+)+\.[a-zA-Z]+\s+J[0-9]{7})(?=:)

This will work for you.

Adding (?:\\\w+)+\.[a-zA-Z]+ is more restrictive than .*
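
If you want to try a restrictive pattern of this kind, here is a Python sketch. The exact escaping is my assumption: I read the middle part as "drive letter, colon, one or more backslash-separated path components, a dot, an extension". The sample string is likewise assumed to be space-separated.

```python
import re

# Assumed escaping: \\ is a literal backslash, \w+ a path component, \. the dot.
restrictive = re.compile(
    r"(Project name:\s+[A-Z]:(?:\\\w+)+\.[a-zA-Z]+\s+J[0-9]{7})(?=:)"
)

sample = r"J0000010: Project name: E:\foo.pf J0000011: Job name: MBiek Direct Mail Test"

found = restrictive.search(sample)
whole = found.group(1)
print(whole)  # Project name: E:\foo.pf J0000011

# The lookahead (?=:) checks for the colon without consuming it.
next_char = sample[found.end()]
print(next_char)  # :
```

Note that group 1 here is the whole stretch from "Project name:" through the J-number, not just the path, so you would still need an inner group if you only want the file name.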