Capturing groups with findall?

If I run findall(r'regex(with)capturing.goes.here'), how can I access the captured groups? I know I can do this with finditer, but I don't want to iterate.

python
regex

99033 views

小开

Use groups freely. The matches will be returned as a list of group-tuples:

>>> re.findall('(1(23))45', '12345')
[('123', '23')]

If you want the full match to be included, just enclose the entire regex in a group:

>>> re.findall('(1(23)45)', '12345')
[('12345', '23')]
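
If a group is needed only for grouping or alternation and should not appear in the result, a non-capturing group `(?:...)` is another option. A minimal sketch on the same sample string (the variable names are just for illustration):

```python
import re

# (?:...) groups without capturing, so findall sees no capturing groups
# and falls back to returning the full match as a plain string.
full_matches = re.findall(r'1(?:23)45', '12345')
print(full_matches)

# Mixing both kinds: only the capturing group appears in the result.
inner_only = re.findall(r'1(23)(?:45)', '12345')
print(inner_only)
```

This avoids wrapping the whole regex in an extra group when the full match is all you need.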

小开

Accepted answer

When the pattern contains capturing groups, findall returns only the captured groups, not the full match:

>>> re.findall('abc(de)fg(123)', 'abcdefg123 and again abcdefg123')
[('de', '123'), ('de', '123')]

Relevant doc excerpt:

Return all non-overlapping matches of pattern in string, as a list of strings. The string is scanned left-to-right, and matches are returned in the order found. If one or more groups are present in the pattern, return a list of groups; this will be a list of tuples if the pattern has more than one group. Empty matches are included in the result unless they touch the beginning of another match.
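
The three cases described in the excerpt can be compared side by side. This is a small sketch reusing the sample string from above; the variable names are made up for the example:

```python
import re

text = 'abcdefg123 and again abcdefg123'

# No groups: a list of the full matches.
no_groups = re.findall(r'abcdefg\d+', text)

# Exactly one group: a list of strings, one per match.
one_group = re.findall(r'abc(de)fg\d+', text)

# Two or more groups: a list of tuples, one tuple per match.
two_groups = re.findall(r'abc(de)fg(123)', text)

print(no_groups)
print(one_group)
print(two_groups)
```

So the shape of the result depends only on how many capturing groups the pattern has.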

小开

With a single group, findall returns a plain list of strings, so you can index it or loop over it:

>>> import re
>>> r = re.compile(r"'(\d+)'")
>>> result = r.findall("'1', '2', '345'")
>>> result
['1', '2', '345']
>>> result[0]
'1'
>>> for item in result:
...     print(item)
...
1
2
345
>>>

小开

import re

# Input is in "last name, first name" order.
string = 'Perotto, Pier Giorgio'
names = re.findall(r'''
(?P<last>[-\w ]+),\s   # last name (before the comma)
(?P<first>[-\w ]+)     # first name (after the comma)
''', string, re.X | re.M)

print(names)

This prints:

[('Perotto', 'Pier Giorgio')]

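re.M only changes how ^ and $ match, so it matters only for a multiline string with anchored patterns; this regex doesn't use either and works without it. re.X (the short form of re.VERBOSE) is required here because the pattern is spread over several lines inside ''' and contains whitespace and # comments that the regex engine must ignore.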
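
Note that findall returns plain tuples, so the group names are not available in the result. If looking groups up by name is useful, one option is finditer combined with groupdict() in a list comprehension. This is a sketch, not a replacement for the code above; the group names `surname` and `given` are chosen for the example:

```python
import re

# Same character classes as the pattern above, written on one line.
pattern = re.compile(r'(?P<surname>[-\w ]+),\s(?P<given>[-\w ]+)')
text = 'Perotto, Pier Giorgio'

# Each match object exposes its named groups as a dict.
by_name = [m.groupdict() for m in pattern.finditer(text)]
print(by_name)

# findall on the same pattern yields only positional tuples.
as_tuples = pattern.findall(text)
print(as_tuples)
```

The comprehension still iterates internally, but the calling code gets a ready-made list just as it does with findall.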