Regex optional group

I am using this regex:

((?:[a-z][a-z]+))_(\d+)_((?:[a-z][a-z]+)\d+)_(\d{13})

to match strings like this:

SH_6208069141055_BC000388_20110412101855

into four groups:

SH
6208069141055
BC000388
20110412101855

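Question: How do I make the first group optional, so that the resulting group is an empty string?
If possible, I would like to get 4 groups in every case.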

Input string for this case (there is no underscore after the first group):

6208069141055_BC000388_20110412101855

You can easily simplify your regex to be this:

(?:([a-z]{2,})_)?(\d+)_([a-z]{2,}\d+)_(\d+)$
^              ^^
|--------------||
| first group  ||- quantifier for 0 or 1 time (essentially making it optional)

I'm not sure whether the input string without the first group will have the underscore or not, but you can use the above regex if it's the whole string.

regex101 demo

As you can see, in the second match group 1 is empty and the match itself starts at group 2.
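For checking this outside regex101, here is a minimal Python sketch. The function name `split_fields` is made up for illustration, and the case-insensitive flag is an assumption, added because the sample input uses uppercase letters while the pattern uses `[a-z]`. Note that in Python a group that did not take part in the match is reported as `None`, so the sketch asks for `""` as the default to get the empty string the question wants.

```python
import re

# Pattern from the answer above. re.IGNORECASE is an assumption so that
# [a-z] also accepts the uppercase letters in the sample strings.
PATTERN = re.compile(
    r"(?:([a-z]{2,})_)?(\d+)_([a-z]{2,}\d+)_(\d+)$",
    re.IGNORECASE,
)


def split_fields(text):
    """Return the four groups as a tuple, or None if there is no match.

    Hypothetical helper: an unmatched optional group comes back as ""
    instead of None because of groups(default="").
    """
    m = PATTERN.search(text)
    if m is None:
        return None
    return m.groups(default="")


if __name__ == "__main__":
    print(split_fields("SH_6208069141055_BC000388_20110412101855"))
    print(split_fields("6208069141055_BC000388_20110412101855"))
```

Both inputs yield a 4-tuple; for the second one the first element is the empty string.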

To make a non-capturing group optional (matching zero or one time), append ? to it.

(?: ..... )?
^          ^____ optional
|____ group
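As a rough illustration of the construct on its own, here is a small Python sketch with a toy pattern (the pattern and the name `parse` are made up for this example and are not from the question). It wraps a capturing group and its trailing separator in `(?: ... )?`, so the number of groups stays the same whether or not the optional part is present.

```python
import re

# Toy pattern (hypothetical): an optional "letters-" prefix, then digits.
# The outer (?: ... )? makes the prefix and its hyphen optional as a unit,
# while the inner ( ... ) still counts as group 1.
OPTIONAL = re.compile(r"(?:([a-z]+)-)?(\d+)")


def parse(text):
    """Return both groups, or None if the whole string does not match."""
    m = OPTIONAL.fullmatch(text)
    return None if m is None else m.groups()


if __name__ == "__main__":
    print(parse("abc-42"))  # prefix present
    print(parse("42"))      # prefix absent: group 1 is None in Python
```

In Python the skipped group is `None` rather than an empty string; calling `m.groups(default="")` instead would substitute `""`.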