Remove substring only at the end of string

I have a bunch of strings, some of them have ' rec'. I want to remove that only if those are the last 4 characters.

So in other words I have

somestring = 'this is some string rec'

and I want it to become

somestring = 'this is some string'

What is the Python way to approach this?

def rchop(s, suffix):
    if suffix and s.endswith(suffix):
        return s[:-len(suffix)]
    return s


somestring = 'this is some string rec'
rchop(somestring, ' rec')  # returns 'this is some string'
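The `suffix and` check matters because of how slicing treats zero: `s[:-0]` is the same as `s[:0]`, which is the empty string. Here is a minimal sketch of the difference; `rchop_unguarded` is a hypothetical name used only for this comparison.

```python
def rchop(s, suffix):
    # Same logic as the answer: the truthiness check skips empty suffixes.
    if suffix and s.endswith(suffix):
        return s[:-len(suffix)]
    return s


def rchop_unguarded(s, suffix):
    # Hypothetical variant without the guard, for comparison only.
    if s.endswith(suffix):
        return s[:-len(suffix)]
    return s


print(repr(rchop('this is some string rec', ' rec')))  # 'this is some string'
print(repr(rchop('abc', '')))            # 'abc'
print(repr(rchop_unguarded('abc', '')))  # '' because 'abc'[:-0] == 'abc'[:0]
```

Every string ends with the empty string, so without the guard an empty suffix wipes out the whole input.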

You could use a regular expression as well:

from re import sub


text = r"this is some string rec"
regex = r"(.*)\srec$"
print(sub(regex, r"\1", text))

Since you have to get len(trailing) anyway (where trailing is the string you want to remove IF it's trailing), I'd recommend avoiding the slight duplication of work that .endswith would cause in this case. Of course, the proof of the code is in the timing, so, let's do some measurement (naming the functions after the respondents proposing them):

import re


astring = 'this is some string rec'
trailing = ' rec'


def andrew(astring=astring, trailing=trailing):
    regex = r'(.*)%s$' % re.escape(trailing)
    return re.sub(regex, r'\1', astring)


def jack0(astring=astring, trailing=trailing):
    if astring.endswith(trailing):
        return astring[:-len(trailing)]
    return astring


def jack1(astring=astring, trailing=trailing):
    regex = r'%s$' % re.escape(trailing)
    return re.sub(regex, '', astring)


def alex(astring=astring, trailing=trailing):
    thelen = len(trailing)
    if astring[-thelen:] == trailing:
        return astring[:-thelen]
    return astring

Say we've named this python file a.py and it's in the current directory; now, ...:

$ python2.6 -mtimeit -s'import a' 'a.andrew()'
100000 loops, best of 3: 19 usec per loop
$ python2.6 -mtimeit -s'import a' 'a.jack0()'
1000000 loops, best of 3: 0.564 usec per loop
$ python2.6 -mtimeit -s'import a' 'a.jack1()'
100000 loops, best of 3: 9.83 usec per loop
$ python2.6 -mtimeit -s'import a' 'a.alex()'
1000000 loops, best of 3: 0.479 usec per loop

As you see, the RE-based solutions are "hopelessly outclassed" (as often happens when one "overkills" a problem -- possibly one of the reasons REs have such a bad rep in the Python community!-), though the suggestion in @Jack's comment is way better than @Andrew's original. The string-based solutions, as expected, shine, with my endswith-avoiding one having a minuscule advantage over @Jack's (being just 15% faster). So, both pure-string ideas are good (as well as both being concise and clear) -- I prefer my variant a little bit only because I am, by character, a frugal (some might say, stingy;-) person... "waste not, want not"!-)
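To repeat the comparison of the two string-based functions on a current interpreter, something like the following sketch with the standard `timeit` module should work. It passes the arguments explicitly instead of importing `a.py`, and the lambda wrapper adds some call overhead, so the absolute numbers will not match the python2.6 figures above.

```python
import timeit


def jack0(astring, trailing):
    # endswith-based version from the measurement above
    if astring.endswith(trailing):
        return astring[:-len(trailing)]
    return astring


def alex(astring, trailing):
    # slice-and-compare version from the measurement above
    thelen = len(trailing)
    if astring[-thelen:] == trailing:
        return astring[:-thelen]
    return astring


astring = 'this is some string rec'
trailing = ' rec'
loops = 100000

timings = {}
for func in (jack0, alex):
    # repeat() returns the total seconds for each run; keep the best of 3
    best = min(timeit.repeat(lambda: func(astring, trailing),
                             number=loops, repeat=3))
    timings[func.__name__] = best / loops * 1e6
    print('%s: %.3f usec per loop' % (func.__name__, timings[func.__name__]))
```

The gap between the two is small enough that it may shrink, vanish, or reverse depending on the Python version and the machine.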

As a one-liner, using a generator expression joined with newlines:

test = """somestring='this is some string rec'
this is some string in the end word rec
This has not the word."""
match = 'rec'
print('\n'.join((line[:-len(match)] if line.endswith(match) else line)
                for line in test.splitlines()))
""" Output:
somestring='this is some string rec'
this is some string in the end word
This has not the word.
"""

If speed is not important, use regex:

import re


somestring='this is some string rec'


somestring = re.sub(' rec$', '', somestring)
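One detail to be aware of with the regex route: `$` also matches just before a newline at the very end of the string, and a suffix containing regex metacharacters needs escaping. A possible variant, assuming you want a strict end-of-string match, combines `re.escape` with `\Z`; `strip_suffix_re` is a hypothetical helper name.

```python
import re


def strip_suffix_re(text, suffix):
    # re.escape keeps characters such as '.' or '+' in the suffix literal;
    # \Z matches only at the very end of the string.
    return re.sub(re.escape(suffix) + r'\Z', '', text)


print(repr(re.sub(' rec$', '', 'this is some string rec\n')))
# 'this is some string\n'  ('$' also matches just before a final newline)
print(repr(strip_suffix_re('this is some string rec\n', ' rec')))
# 'this is some string rec\n'  (left alone: the string ends with '\n')
print(repr(strip_suffix_re('this is some string rec', ' rec')))
# 'this is some string'
```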

Using more_itertools, we can rstrip the trailing items of an iterable that satisfy a predicate.

Installation

> pip install more_itertools

Code

import more_itertools as mit




iterable = "this is some string rec".split()
" ".join(mit.rstrip(iterable, pred=lambda x: x in {"rec", " "}))
# 'this is some string'

Here we pass all trailing items we wish to strip from the end.

See also the more_itertools docs for details.

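A shorter option, with a caveat: rsplit splits at every occurrence of ' rec', so this cuts the string at the first occurrence, wherever it appears, not only at a trailing one: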

somestring.rsplit(' rec')[0]
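To see how this behaves when ' rec' also appears earlier in the string, here is a small check against an `endswith`-based version. The sample strings and the `chop` helper are made up for illustration.

```python
def chop(s, suffix):
    # Hypothetical helper: removes suffix only when s actually ends with it.
    return s[:-len(suffix)] if suffix and s.endswith(suffix) else s


s_end = 'this is some string rec'
s_mid = 'a rec in the middle rec'
s_only_mid = 'a rec in the middle'

print(s_end.rsplit(' rec')[0])       # this is some string
print(s_mid.rsplit(' rec')[0])       # a
print(s_only_mid.rsplit(' rec')[0])  # a

print(chop(s_mid, ' rec'))           # a rec in the middle
print(chop(s_only_mid, ' rec'))      # a rec in the middle
```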

Taking inspiration from @David Foster's answer, I would do

def _remove_suffix(text, suffix):
    # Skip None inputs and an empty suffix (text[:-0] would return '').
    if text is not None and suffix and text.endswith(suffix):
        return text[:-len(suffix)]
    return text

Reference: Python string slicing

Here is a one-liner version of Jack Kelly's answer along with its sibling:

def rchop(s, sub):
    # "sub and" guards against an empty sub, where s[:-0] would return ''
    return s[:-len(sub)] if sub and s.endswith(sub) else s


def lchop(s, sub):
    return s[len(sub):] if s.startswith(sub) else s

def remove_trailing_string(content, trailing):
    """
    Strip trailing component `trailing` from `content` if it exists.
    """
    # An empty `trailing` is skipped, since content[:-0] would return ''.
    if trailing and content.endswith(trailing) and content != trailing:
        return content[:-len(trailing)]
    return content

Starting in Python 3.9, you can use removesuffix:

'this is some string rec'.removesuffix(' rec')
# 'this is some string'
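If the same code also has to run on interpreters older than 3.9, one possible approach is a small wrapper that uses the built-in method when it is available and falls back to slicing otherwise. `remove_suffix` is a hypothetical name, not a standard library function.

```python
def remove_suffix(text, suffix):
    # Prefer the built-in method where it exists (Python 3.9+).
    if hasattr(text, 'removesuffix'):
        return text.removesuffix(suffix)
    # Fallback for older interpreters; the truthiness check avoids text[:-0].
    if suffix and text.endswith(suffix):
        return text[:-len(suffix)]
    return text


print(repr(remove_suffix('this is some string rec', ' rec')))  # 'this is some string'
print(repr(remove_suffix('record', ' rec')))                   # 'record'
print(repr(remove_suffix('abc', '')))                          # 'abc'
```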