re.compile speeds up regexes a lot if you are searching for the same pattern over and over. But I just got a huge speedup by using "in" to cull out bad cases before I match. Anecdotal, I know. ~Ben
I've had the same problem. I used Jupyter's %timeit to check:
import re
sent = "a sentence for measuring a find function"
sent_list = sent.split()
print("x in sentence")
%timeit "function" in sent
print("x in token list")
%timeit "function" in sent_list
print("regex search")
%timeit bool(re.match(".*function.*", sent))
print("compiled regex search")
regex = re.compile(".*function.*")
%timeit bool(regex.match(sent))
x in sentence: 61.3 ns ± 3 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
x in token list: 93.3 ns ± 1.26 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
regex search: 772 ns ± 8.42 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
compiled regex search: 420 ns ± 7.68 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
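If you are not in Jupyter, the same comparison can be sketched with the standard-library timeit module. This is only a rough equivalent of the %timeit cells above: the candidates dict and the run_benchmark helper are names I made up for illustration, and wrapping each expression in a lambda adds a little call overhead to every measurement, so the absolute numbers will not match the ones posted.

```python
import re
import timeit

sent = "a sentence for measuring a find function"
sent_list = sent.split()
regex = re.compile(".*function.*")

# Each candidate is a zero-argument callable so timeit can invoke it directly.
# Note: the lambda wrapper itself costs some time on every call.
candidates = {
    "x in sentence": lambda: "function" in sent,
    "x in token list": lambda: "function" in sent_list,
    "regex search": lambda: bool(re.match(".*function.*", sent)),
    "compiled regex search": lambda: bool(regex.match(sent)),
}


def run_benchmark(number=100_000):
    """Return the mean time per call, in nanoseconds, for each candidate."""
    results = {}
    for label, func in candidates.items():
        total_seconds = timeit.timeit(func, number=number)
        results[label] = total_seconds / number * 1e9
    return results


if __name__ == "__main__":
    for label, ns in run_benchmark().items():
        print(f"{label}: {ns:.1f} ns per call")
```

Timings depend heavily on the machine and Python version, so treat the output as relative, not absolute.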
Maybe someone is still interested.
The given answers seem fine but only look at a very short string.
In fact if you take a long string and the pattern you are looking for is roughly at the end then the performance changes in favor of regex!
pattern at the end of string
find: 3.41 µs ± 74.2 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
regex: 1.93 µs ± 23.8 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
in: 3.32 µs ± 74.6 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
pattern in front of string
find: 748 ns ± 15.6 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
regex: 2.03 µs ± 21.1 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
in: 589 ns ± 6.75 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
Summary: the timings of find and in depend on the string length and on where the pattern sits in the string, while the regex timing was about the same in both cases (roughly 2 µs) and was the fastest option for a very long string with the pattern at the end.
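The test code for these numbers was not posted, so here is a minimal sketch of how such a comparison could be set up if you want to check it on your own machine. Everything in it is an assumption on my part: the PATTERN and FILLER values, the string length, and the helper names make_candidates and run_benchmark are hypothetical, and the results may well differ from the ones above depending on Python version and input.

```python
import re
import timeit

PATTERN = "needle"               # hypothetical search term
FILLER = "lorem ipsum " * 1000   # hypothetical filler text, does not contain PATTERN

texts = {
    "pattern at the end of string": FILLER + PATTERN,
    "pattern in front of string": PATTERN + FILLER,
}
compiled = re.compile(re.escape(PATTERN))


def make_candidates(text):
    """Build the three lookups under test for one input string."""
    return {
        "find": lambda: text.find(PATTERN) != -1,
        "regex": lambda: compiled.search(text) is not None,
        "in": lambda: PATTERN in text,
    }


def run_benchmark(number=10_000):
    """Return the mean time per call, in microseconds, keyed by (case, method)."""
    results = {}
    for case, text in texts.items():
        for method, func in make_candidates(text).items():
            total_seconds = timeit.timeit(func, number=number)
            results[(case, method)] = total_seconds / number * 1e6
    return results


if __name__ == "__main__":
    for (case, method), us in run_benchmark().items():
        print(f"{case} / {method}: {us:.2f} µs per call")
```

Changing the multiplier on FILLER is an easy way to see how each method scales with string length.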
And if your regex necessarily requires some word to match, it is better to guard the regex call with a plain "in" check first, so the regex only runs on candidates that can match. For example, the following is faster than the above two and gives the same result:
if some_keyword.lower() in some_sentence.lower():  # cheap substring pre-check
    if re.search(rf"\b{re.escape(some_keyword)}\b", some_sentence):  # whole-word match
        pass  # keyword found as a whole word
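To make the idea testable, here is a small self-contained sketch of the same pre-check wrapped in a function. The name contains_word is mine, not from any library. One deliberate difference: I dropped the .lower() calls in the pre-check, on the assumption that a case-sensitive match is wanted, since the regex itself is case-sensitive. If you need case-insensitive matching, you would have to change both the pre-check and the regex (for example with re.IGNORECASE).

```python
import re


def contains_word(keyword, sentence):
    """Return True if keyword occurs in sentence as a whole word (case-sensitive)."""
    # Cheap substring test first: if the keyword is not in the sentence at all,
    # the word-boundary regex cannot match, so we skip it entirely.
    if keyword not in sentence:
        return False
    # Only sentences that pass the pre-check pay for the regex.
    return re.search(rf"\b{re.escape(keyword)}\b", sentence) is not None


if __name__ == "__main__":
    sent = "a sentence for measuring a find function"
    print(contains_word("function", sent))
    print(contains_word("func", sent))
```

The pre-check only pays off when most inputs do not contain the keyword; if nearly every sentence contains it, you do the substring scan and the regex both.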