How to extract a substring using a regular expression

小开

In JavaScript:

mydata.match(/'([^']+)'/)[1]

The actual regexp is: /'([^']+)'/

If you use the non-greedy modifier (as suggested in another post), it looks like this:

mydata.match(/'(.*?)'/)[1]

It is cleaner.
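
To make the difference between the greedy, non-greedy, and negated-class forms concrete, here is a minimal Java sketch. The class name GreedyVsLazy, the helper capture, and the two-quote sample input are assumptions made for illustration; they are not part of the original answer.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GreedyVsLazy {
    // Hypothetical helper: returns group 1 of the first match, or null if there is none.
    static String capture(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String input = "first 'a' then 'b' end";
        // Greedy: .* runs to the last quote, so the capture spans both quoted parts.
        System.out.println(capture("'(.*)'", input));
        // Non-greedy: .*? stops at the first closing quote.
        System.out.println(capture("'(.*?)'", input));
        // Negated class: cannot cross a quote, so it also stops at the first closing quote.
        System.out.println(capture("'([^']+)'", input));
    }
}
```

With a single quoted section all three forms return the same text; they only diverge once the input contains more than one pair of quotes.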

小开

Best answer

Assuming you want the part between single quotes, use this regular expression with a Matcher:

"'(.*?)'"

Example:

String mydata = "some string with 'the data i want' inside";
Pattern pattern = Pattern.compile("'(.*?)'");
Matcher matcher = pattern.matcher(mydata);
if (matcher.find())
{
    System.out.println(matcher.group(1));
}

Result:

the data i want

小开

import java.util.regex.Matcher;
import java.util.regex.Pattern;


public class Test {
    public static void main(String[] args) {
        // matches() must consume the whole input, hence the leading and trailing .*
        Pattern pattern = Pattern.compile(".*'([^']*)'.*");
        String mydata = "some string with 'the data i want' inside";

        Matcher matcher = pattern.matcher(mydata);
        if (matcher.matches()) {
            System.out.println(matcher.group(1));
        }
    }
}
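
The two Java answers above differ in find() versus matches(), and that changes which quoted section is returned when there are several. The sketch below illustrates this; the class FindVsMatches, its method names, and the two-quote input are hypothetical and added only for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FindVsMatches {
    // find() looks for the pattern anywhere in the input, scanning left to right.
    static String viaFind(String input) {
        Matcher m = Pattern.compile("'([^']*)'").matcher(input);
        return m.find() ? m.group(1) : null;
    }

    // matches() must cover the entire input, so the pattern is wrapped in .*
    static String viaMatches(String input) {
        Matcher m = Pattern.compile(".*'([^']*)'.*").matcher(input);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String input = "first 'a' then 'b' end";
        System.out.println(viaFind(input));    // first quoted section
        System.out.println(viaMatches(input)); // last one: the leading .* is greedy
    }
}
```

For input with exactly one quoted section, both approaches return the same text.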

小开

Since you also tagged the question with Scala, here is a solution without regex that easily handles multiple quoted strings:

val text = "some string with 'the data i want' inside 'and even more data'"
text.split("'").zipWithIndex.filter(_._2 % 2 != 0).map(_._1)


res: Array[java.lang.String] = Array(the data i want, and even more data)

小开

In Scala,

val ticks = "'([^']*)'".r


ticks findFirstIn mydata match {
case Some(ticks(inside)) => println(inside)
case _ => println("nothing")
}


for (ticks(inside) <- ticks findAllIn mydata) println(inside) // multiple matches


val Some(ticks(inside)) = ticks findFirstIn mydata // may throw exception


val ticks = ".*'([^']*)'.*".r
val ticks(inside) = mydata // shorter, but throws MatchError if nothing matches; the greedy leading .* captures the last set of ticks

小开

You don't need a regular expression for this.

Add Apache Commons Lang to your project (http://commons.apache.org/proper/commons-lang/), then use:

String dataYouWant = StringUtils.substringBetween(mydata, "'");
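
If adding a library is not an option, the same idea can be sketched with plain JDK string methods. The class Between and its method between are hypothetical names for illustration; this is not how Commons Lang implements substringBetween, only a rough equivalent that assumes null is an acceptable "not found" result.

```java
public class Between {
    // Returns the text between the first `open` and the next `close`, or null if either is missing.
    static String between(String s, String open, String close) {
        int start = s.indexOf(open);
        if (start < 0) {
            return null;
        }
        start += open.length();
        int end = s.indexOf(close, start);
        if (end < 0) {
            return null;
        }
        return s.substring(start, end);
    }

    public static void main(String[] args) {
        String mydata = "some string with 'the data i want' inside";
        System.out.println(between(mydata, "'", "'"));
    }
}
```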

小开

Here is a simple one-liner:

String target = myData.replaceAll("[^']*(?:'(.*?)')?.*", "$1");

Because the matching group is optional, this also handles the case where no quotes are found: it returns a blank string.
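
A small sketch of both cases, using the same pattern as the one-liner. The class OptionalGroup and the method extract are hypothetical names added for illustration.

```java
public class OptionalGroup {
    // Same pattern as the one-liner: the whole input is replaced by group 1,
    // which stays unset (and contributes nothing) when there are no quotes.
    static String extract(String myData) {
        return myData.replaceAll("[^']*(?:'(.*?)')?.*", "$1");
    }

    public static void main(String[] args) {
        System.out.println(extract("some string with 'the data i want' inside"));
        // Prints an empty line: no quotes, so the result is blank.
        System.out.println(extract("no quotes here"));
    }
}
```

Note that `.` does not match line terminators by default, so this sketch assumes single-line input.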


小开

String dataIWant = mydata.split("'")[1];
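
The split one-liner throws ArrayIndexOutOfBoundsException when the input contains no quote at all. A guarded variant might look like the sketch below; the class SplitQuoted and the method firstQuoted are hypothetical names, and returning an Optional is a design choice of this sketch, not of the original answer.

```java
import java.util.Optional;

public class SplitQuoted {
    // parts[1] only exists if some text follows the first quote.
    static Optional<String> firstQuoted(String input) {
        String[] parts = input.split("'");
        return parts.length > 1 ? Optional.of(parts[1]) : Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(firstQuoted("some string with 'the data i want' inside"));
        System.out.println(firstQuoted("no quotes here"));
    }
}
```

This still does not check that the quotes are balanced: for an input with a single stray quote it returns whatever follows that quote.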


小开

String dataIWant = mydata.replaceFirst(".*'(.*?)'.*", "$1");

小开

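Apache Commons Lang provides a host of helper utilities for the java.lang API, most notably String manipulation methods. In your example, the start and end substrings are the same, so simply call the following function.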

StringUtils.substringBetween(String str, String tag)

Gets the String that is nested in between two instances of the same String.

If the start and end substrings are different, use the following overloaded method.

StringUtils.substringBetween(String str, String open, String close)

Gets the String that is nested in between two Strings.

If you want all instances of the matching substrings, use:

StringUtils.substringsBetween(String str, String open, String close)

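Searches a String for substrings delimited by a start and end tag, returning all matching substrings in an array.

For the example in question, to get all instances of the matching substring: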

String[] results = StringUtils.substringsBetween(mydata, "'", "'");

小开

You can use this. I use a while loop to store all matching substrings in a list. If you use if (matcher.find()) { System.out.println(matcher.group(1)); } you will get only one matching substring, so use the following to get all of them:

Matcher m = Pattern.compile("[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\\.[a-zA-Z0-9-.]+").matcher(text);
// Matcher mat = pattern.matcher(text);
ArrayList<String> matchesEmail = new ArrayList<>();
while (m.find()) {
    String s = m.group();
    // skip duplicates
    if (!matchesEmail.contains(s))
        matchesEmail.add(s);
}
Log.d(TAG, "emails: " + matchesEmail);

小开

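Add the Apache Commons Lang dependency to your pom.xml (StringUtils.substringBetween lives in commons-lang3, not commons-io):

<dependency>
    <groupId>org.apache.commons</groupId>
    <artifactId>commons-lang3</artifactId>
    <version>3.12.0</version>
</dependency>

The following code then works:

String dataYouWant = StringUtils.substringBetween(mydata, "'", "'");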

小开

Somehow group(1) did not work for me (the pattern below has no capturing group). I used group(0) to find the URL version.

Pattern urlVersionPattern = Pattern.compile("\\/v[0-9][a-z]{0,1}\\/");
Matcher m = urlVersionPattern.matcher(url);
if (m.find()) {
    // group(0) is the whole match, e.g. "/v1/"; strip the surrounding slashes
    return StringUtils.substringBetween(m.group(0), "/", "/");
}
return "v0";

小开

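Since Java 9

As of this version, you can use the new no-argument method Matcher::results, which conveniently returns a Stream<MatchResult>. MatchResult represents the result of a match operation and lets you read the matched groups and more (this interface has existed since Java 1.5).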

String string = "Some string with 'the data I want' inside and 'another data I want'.";

Pattern pattern = Pattern.compile("'(.*?)'");
pattern.matcher(string)
       .results()                       // Stream<MatchResult>
       .map(mr -> mr.group(1))          // Stream<String> - the 1st group of each result
       .forEach(System.out::println);   // print them out (or process in other way...)

The code snippet above results in:

the data I want
another data I want

The main advantage over the procedural if (matcher.find()) and while (matcher.find()) checks and processing is ease of use, whether one result or several are available.
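
When the matches are needed as a collection rather than printed, the same stream can be collected into a list. A minimal sketch, assuming Java 9 or later; the class QuotedResults and the method allQuoted are hypothetical names for illustration.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class QuotedResults {
    // Collects group 1 of every match; yields an empty list when nothing matches.
    static List<String> allQuoted(String input) {
        return Pattern.compile("'(.*?)'")
                .matcher(input)
                .results()
                .map(mr -> mr.group(1))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        String string = "Some string with 'the data I want' inside and 'another data I want'.";
        System.out.println(allQuoted(string));
    }
}
```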