Counting regex matches in Java

小编典典

Suppose I have a file that contains the following:

HelloxxxHelloxxxHello

I compile a pattern to look for "Hello":

Pattern pattern = Pattern.compile("Hello");

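Then I read the file with an InputStream and convert it to a String so that I can run the regex against it.

Once the matcher finds a match in the file, it indicates that, but it doesn't tell me how many matches it found, only that it found a match somewhere in the string.

Since the string is relatively short and the buffer I'm using is 200 bytes, it should find three matches. However, it simply reports "match" without giving me a count of how many matches there were.

What is the easiest way to count the number of matches that occur in a string? I've tried various for loops and matcher.groupCount(), but I'm getting nowhere.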

2020-03-15

1 answer

小编典典

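matcher.find() does not find all matches, only the next match. Each call resumes where the previous match ended and returns false once nothing more is found.

You have to do the following: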

int count = 0;
while (matcher.find())
    count++;

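By the way, matcher.groupCount() is something completely different: it returns the number of capturing groups in the pattern, not the number of matches in the input.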
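To make the difference concrete, here is a small sketch comparing the two. The class and method names (GroupCountDemo, groupsIn, countMatches) and the pattern "(H)(ello)" are made up for illustration and are not part of the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupCountDemo {
    // Number of capturing groups declared in the pattern.
    // This depends only on the regex, not on the input text.
    static int groupsIn(String regex) {
        return Pattern.compile(regex).matcher("").groupCount();
    }

    // Number of non-overlapping matches of regex in input,
    // counted by calling find() until it returns false.
    static int countMatches(String regex, String input) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String input = "HelloxxxHelloxxxHello";
        System.out.println(groupsIn("Hello"));                // prints 0
        System.out.println(groupsIn("(H)(ello)"));            // prints 2
        System.out.println(countMatches("(H)(ello)", input)); // prints 3
    }
}
```

Note that groupCount() gives the same answer no matter what text the matcher is run against, which is why it never produced the count the question was after.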

Complete example:

import java.util.regex.*;

class Test {
    public static void main(String[] args) {
        String hello = "HelloxxxHelloxxxHello";
        Pattern pattern = Pattern.compile("Hello");
        Matcher matcher = pattern.matcher(hello);

        int count = 0;
        while (matcher.find())
            count++;

        System.out.println(count);    // prints 3
    }
}
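If you are on Java 9 or later, Matcher.results() returns a stream of the successive matches, so the loop can probably be replaced with a one-liner. This is an alternative sketch rather than what the example above does; the class name StreamCount and the helper countMatches are invented for the illustration.

```java
import java.util.regex.Pattern;

class StreamCount {
    // Counts non-overlapping matches using Matcher.results(),
    // which is available from Java 9 onward.
    static long countMatches(String regex, String input) {
        return Pattern.compile(regex)
                      .matcher(input)
                      .results()   // Stream<MatchResult>, one element per match
                      .count();
    }

    public static void main(String[] args) {
        System.out.println(countMatches("Hello", "HelloxxxHelloxxxHello")); // prints 3
    }
}
```

The result is a long because that is what Stream.count() returns. On Java 8 and earlier the while loop is still the way to go.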

Handling overlapping matches

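When counting matches of aa in aaaa, the snippet above gives you 2, because each search resumes after the end of the previous match: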

aaaa
aa
  aa

To get 3 matches, i.e. this behavior:

aaaa
aa
 aa
  aa

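You have to search for a match starting at index <start of last match> + 1. matcher.find(int) resets the matcher and searches from the given index: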

String hello = "aaaa";
Pattern pattern = Pattern.compile("aa");
Matcher matcher = pattern.matcher(hello);

int count = 0;
int i = 0;
while (matcher.find(i)) {
    count++;
    i = matcher.start() + 1;
}

System.out.println(count);    // prints 3
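Another way to count overlapping occurrences, offered here as an alternative sketch and not as part of the answer above, is to wrap the search text in a zero-width lookahead. Each match then consumes no characters, so find() moves forward one position at a time. The names OverlapCount and countOverlapping are hypothetical, and the helper assumes you are searching for a literal string, which is why it goes through Pattern.quote.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class OverlapCount {
    // Counts occurrences of a literal string, including overlapping ones,
    // by matching the empty position in front of each occurrence.
    static int countOverlapping(String literal, String input) {
        Pattern pattern = Pattern.compile("(?=" + Pattern.quote(literal) + ")");
        Matcher matcher = pattern.matcher(input);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countOverlapping("aa", "aaaa")); // prints 3
    }
}
```

The find(i) loop works with any pattern as written; the lookahead version just avoids tracking the index by hand.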

2020-03-15