Regular expression that doesn't contain a certain string

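I have something like this:

aabbabcaabda

To select the minimal group wrapped by a, I have /a([^a]*)a/, which works just fine.

But I have a problem with groups wrapped by aa, where I'd need something like /aa([^aa]*)aa/, which doesn't work, and I can't use the first one as /aa([^a]*)aa/, because it would end at the first occurrence of a, which I don't want.

In general, is there a way to say "does not contain this string", in the same way that I can say "does not contain this character" with [^a]?

Simply put, I need aa, followed by any sequence of characters that does not contain aa, and then a closing aa.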

/aa([^a]|a[^a])*aa/
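As a quick check of this alternation pattern, here is a minimal sketch in Python (the thread itself does not name a language, so the choice of Python and the helper name `bodies` are assumptions for illustration). The group is made capturing so the text between the delimiters can be inspected.

```python
import re

# Each repetition consumes either a non-"a", or an "a" followed by a non-"a",
# so the captured body can never contain "aa".
pattern = re.compile(r"aa((?:[^a]|a[^a])*)aa")

def bodies(text):
    """Return the text captured between each pair of "aa" delimiters."""
    return [m.group(1) for m in pattern.finditer(text)]

print(bodies("aabbabcaabda"))   # ['bbabc']
print(bodies("aababaaaabdaa"))  # ['bab', 'bd']
```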

In general it's a pain to write a regular expression not containing a particular string. We had to do this for models of computation - you take an NFA, which is easy enough to define, and then reduce it to a regular expression. The expression for things not containing "cat" was about 80 characters long.

Edit: I just finished and yes, it's:

aa([^a]|a[^a])*aa

Here is a very brief tutorial. I found some great ones before, but I can't see them anymore.

All you need is a reluctant quantifier:

regex: /aa.*?aa/

aabbabcaabda   => aabbabcaa
aaaaaabda      => aaaa
aababaaaabdaa  => aababaa, aabdaa
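The matches listed above can be reproduced with a short Python sketch (Python is an assumption here; the answer only gives the regex). `re.findall` returns every non-overlapping match, scanning left to right.

```python
import re

# .*? is reluctant: it expands one character at a time and stops at the
# first position where the closing "aa" can match.
lazy = re.compile(r"aa.*?aa")

samples = ["aabbabcaabda", "aaaaaabda", "aababaaaabdaa"]
results = {s: lazy.findall(s) for s in samples}

for s, found in results.items():
    print(s, "=>", found)
# aabbabcaabda => ['aabbabcaa']
# aaaaaabda => ['aaaa']
# aababaaaabdaa => ['aababaa', 'aabdaa']
```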

You could use negative lookahead, too, but in this case it's just a more verbose way to accomplish the same thing. Also, it's a little trickier than gpojd made it out to be. The lookahead has to be applied at each position before the dot is allowed to consume the next character.

/aa(?:(?!aa).)*aa/

As for the approach suggested by Claudiu and finnw, it'll work okay when the sentinel string is only two characters long, but (as Claudiu acknowledged) it's too unwieldy for longer strings.
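To illustrate how the lookahead form scales to a longer sentinel, here is a hedged sketch in Python. The helper `between` and the sample strings are hypothetical, introduced only for this example; `re.escape` is used so the sentinel is treated literally.

```python
import re

def between(sentinel):
    """Build a pattern capturing the text wrapped by `sentinel` on both sides."""
    s = re.escape(sentinel)
    # (?:(?!s).)* consumes one character at a time, and only at positions
    # where the sentinel does not start. Note that "." skips newlines by default.
    return re.compile(rf"{s}((?:(?!{s}).)*){s}")

print(between("aa").findall("aababaaaabdaa"))          # ['bab', 'bd']
print(between("cat").findall("catdogcat-catfishcat"))  # ['dog', 'fish']
```

The same function works for a sentinel of any length, which is where the hand-written alternation becomes unwieldy.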

By the power of Google I found a blog post from 2007 which gives the following regex that matches strings which don't contain a certain substring:

^((?!my string).)*$

It works as follows: it looks for zero or more (*) characters (.) at which your string does not begin (?! - negative lookahead), and it stipulates that the entire string must be made up of such characters (by using the ^ and $ anchors). Or to put it another way:

The entire string must be made up of characters which do not begin a given string, which means that the string doesn't contain the given substring.
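A small Python sketch of this anchored pattern, used as a line filter (the helper name `lacks` and the sample lines are made up for illustration):

```python
import re

def lacks(substring):
    """Pattern matching whole strings that do not contain `substring`."""
    return re.compile(f"^((?!{re.escape(substring)}).)*$")

no_my_string = lacks("my string")

lines = ["nothing to see here", "this is my string", "my strin, almost", ""]
kept = [line for line in lines if no_my_string.match(line)]

print(kept)  # ['nothing to see here', 'my strin, almost', '']
```

In Python itself `substring not in line` is simpler; the regex form is mainly useful where only a pattern can be supplied, such as a config file or a search box.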

".*[^(\\.inc)]\\.ftl$"

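In Java this was meant to find all files ending in ".ftl" but not ending in ".inc.ftl". Note, however, that [^(\\.inc)] is a character class, not a negated string: it rejects any name whose last character before ".ftl" is one of (, ., i, n, c or ), so a name like "main.ftl" is rejected too. A negative lookbehind, as in ".*(?<!\\.inc)\\.ftl$", excludes only the ".inc" suffix.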
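Because a character class only ever tests a single character, it is worth comparing it against a negative lookbehind on a few names. The sketch below uses Python rather than Java, and the file names are invented for the comparison.

```python
import re

# Character class: the one character before ".ftl" must not be ( . i n c )
class_based = re.compile(r".*[^(\.inc)]\.ftl$")
# Negative lookbehind: ".ftl" must not be directly preceded by ".inc"
lookbehind = re.compile(r".*(?<!\.inc)\.ftl$")

names = ["page.ftl", "header.inc.ftl", "main.ftl", "abc.ftl"]

by_class = [n for n in names if class_based.match(n)]
by_lookbehind = [n for n in names if lookbehind.match(n)]

print(by_class)       # ['page.ftl']
print(by_lookbehind)  # ['page.ftl', 'main.ftl', 'abc.ftl']
```

The class-based pattern also drops "main.ftl" and "abc.ftl", because "n" and "c" happen to be in the excluded set.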

In the following code I had to add a GET parameter to all references to JS files EXCEPT one.

<link rel="stylesheet" type="text/css" href="/login/css/ABC.css" />
<script type="text/javascript" language="javascript" src="/localization/DEF.js"></script>
<script type="text/javascript" language="javascript" src="/login/jslib/GHI.js"></script>
<script type="text/javascript" language="javascript" src="/login/jslib/md5.js"></script>
sendRequest('/application/srvc/EXCEPTION.js', handleChallengeResponse, null);
sendRequest('/application/srvc/EXCEPTION.js",handleChallengeResponse, null);

This is the Matcher used:

(?<!EXCEPTION)(\.js)

What that does is look for all occurrences of ".js" and, if they are preceded by the "EXCEPTION" string, discard that result from the result array. That's called negative lookbehind. Since I spent a day finding out how to do this, I thought I should share.
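The replacement step itself is not shown above, so here is one possible sketch in Python. The parameter string "v=1" and the helper name `add_param` are assumptions, not part of the original answer.

```python
import re

# ".js" that is NOT immediately preceded by "EXCEPTION"
js_ref = re.compile(r"(?<!EXCEPTION)(\.js)")

def add_param(text, param="v=1"):
    """Append ?param after every matching ".js" (param is a placeholder)."""
    return js_ref.sub(lambda m: m.group(1) + "?" + param, text)

print(add_param('<script src="/login/jslib/GHI.js"></script>'))
# <script src="/login/jslib/GHI.js?v=1"></script>
print(add_param("sendRequest('/application/srvc/EXCEPTION.js', null);"))
# sendRequest('/application/srvc/EXCEPTION.js', null);
```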

I'm not sure it's a standard construct, but I think you should have a look at "negative lookahead" (which is written "?!", without the quotes). It's far easier than all the answers in this thread, including the accepted one.

Example: the regex "^(?!123)[0-9]*\w" matches any string that begins with digits followed by a word character, UNLESS the string begins with 123.
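A short Python check of that example pattern (the answer was tested on .NET; Python is used here as an assumption, and the sample strings are made up):

```python
import re

# (?!123) is tested once, at the start of the string, before anything is consumed.
pattern = re.compile(r"^(?!123)[0-9]*\w")

samples = ["456abc", "123abc", "12abc", "abc"]
matched = {s: bool(pattern.match(s)) for s in samples}

print(matched)
# {'456abc': True, '123abc': False, '12abc': True, 'abc': True}
```

Note that "abc" matches too: [0-9]* allows zero digits, and \w accepts any single word character.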

See http://msdn.microsoft.com/en-us/library/az24scfc%28v=vs.110%29.aspx#grouping_constructs for lookahead / lookbehind (a Microsoft page, but quite comprehensive).

PS: it works well for me (.NET). But if I'm wrong about something, please let us know. I find this construct very simple and effective, so I'm surprised by the accepted answer.