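Remove the first folder from a path

I have a path that looks like

/First/Second/Third/Fourth/Fifth

and I would like to remove First from it, to get

Second/Third/Fourth/Fifth

The only idea I could come up with is to use os.path.split recursively, but that does not seem optimal. Is there a better solution?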

There really is nothing in the os.path module to do this. Every so often, someone suggests creating a splitall function that returns a list (or iterator) of all of the components, but it never gained enough traction.

Partly this is because every time anyone suggested adding new functionality to os.path, it re-ignited the long-standing dissatisfaction with the general design of the library, leading to someone proposing a new, more OO-like API for paths to deprecate the old, clunky API. In 3.4, that finally happened, with pathlib. And it's already got functionality that wasn't in os.path. So:

>>> import pathlib
>>> p = pathlib.Path('/First/Second/Third/Fourth/Fifth')
>>> p.parts[2:]
('Second', 'Third', 'Fourth', 'Fifth')
>>> pathlib.Path(*p.parts[2:])
PosixPath('Second/Third/Fourth/Fifth')

Or… are you sure you really want to remove the first component, rather than do this?

>>> p.relative_to(*p.parts[:2])
PosixPath('Second/Third/Fourth/Fifth')

If you need to do this in 2.6-2.7 or 3.2-3.3, there's a backport of pathlib.
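If you want this as a reusable function, here is a minimal sketch. The name drop_first_folder is made up for illustration, and I'm using PurePosixPath (rather than Path) only so the result is the same on every platform; swap in pathlib.Path if you want native path semantics.

```python
from pathlib import PurePosixPath

def drop_first_folder(path):
    """Return `path` without its first folder (hypothetical helper)."""
    p = PurePosixPath(path)
    # For an absolute path, parts[0] is the root '/', so the first
    # folder is parts[1]; for a relative path it is parts[0].
    skip = 2 if p.is_absolute() else 1
    return PurePosixPath(*p.parts[skip:])

print(drop_first_folder('/First/Second/Third/Fourth/Fifth'))  # Second/Third/Fourth/Fifth
print(drop_first_folder('First/Second/Third'))                # Second/Third
```

Note that when nothing is left after dropping the folder (for example '/First'), this returns the path '.'.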

Of course, you can use string manipulation, as long as you're careful to normalize the path and use os.path.sep, and to make sure you handle the fiddly details with non-absolute paths or with systems with drive letters, and…

Or you can just wrap up your recursive os.path.split. What exactly is "non-optimal" about it, once you wrap it up? It may be a bit slower, but we're talking nanoseconds here, many orders of magnitude faster than even calling stat on a file. It will have recursion-depth problems if you have a filesystem that's 1000 directories deep, but have you ever seen one? (If so, you can always turn it into a loop…) It takes a few minutes to wrap it up and write good unit tests, but that's something you just do once and never worry about again. So, honestly, if you don't want to use pathlib, that's what I'd do.
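For what it's worth, here is roughly what the wrapped-up version could look like, written as a loop so recursion depth is not a concern. This is only a sketch: splitall and drop_first_folder_split are names I picked, and I use posixpath (which is what os.path is on POSIX systems) so the behaviour does not depend on the platform.

```python
import posixpath

def splitall(path):
    """Split `path` into a list of its components (hypothetical helper)."""
    parts = []
    while True:
        head, tail = posixpath.split(path)
        if tail:
            parts.append(tail)
        if not head:
            break               # relative path fully consumed
        if head == path:
            parts.append(head)  # reached the root, e.g. '/'
            break
        path = head
    parts.reverse()
    return parts

def drop_first_folder_split(path):
    """Drop the root (if any) and the first folder, then rejoin the rest."""
    parts = splitall(path)
    if parts and parts[0].startswith('/'):
        parts = parts[1:]
    return posixpath.join(*parts[1:]) if len(parts) > 1 else ''

print(splitall('/First/Second/Third'))                              # ['/', 'First', 'Second', 'Third']
print(drop_first_folder_split('/First/Second/Third/Fourth/Fifth'))  # Second/Third/Fourth/Fifth
```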

A simple approach

a = '/First/Second/Third/Fourth/Fifth'
"/".join(a.strip("/").split('/')[1:])

output:

Second/Third/Fourth/Fifth

The code above strips the leading and trailing slashes, splits the string on "/", and then joins everything except the first element.

Using itertools.dropwhile:

>>> import itertools
>>> a = '/First/Second/Third/Fourth/Fifth'
>>> "".join(list(itertools.dropwhile(str.isalnum, a.strip("/")))[1:])
'Second/Third/Fourth/Fifth'

Similar to another answer, but taking advantage of os.path:

os.path.join(*(x.split(os.path.sep)[2:]))

... assuming your string starts with a separator.

I looked for a native way to do this, but there does not seem to be one.

I know this topic is old, but this is how I arrived at my solution. There were basically two approaches: using split() and using len(). Both rely on slicing.

1) Using split()

import time

start_time = time.time()

path = "/folder1/folder2/folder3/file.zip"
for i in xrange(500000):  # Python 2; use range() on Python 3
    new_path = "/" + "/".join(path.split("/")[2:])

print("--- %s seconds ---" % (time.time() - start_time))

Result: --- 0.420122861862 seconds ---

*Dropping the leading "/" + from the line new_path = "/" + "/".... did not improve the performance much.

2) Using len(). This method only works if you already know the folder you want to remove.

import time

start_time = time.time()

path = "/folder1/folder2/folder3/file.zip"
folder = "/folder1"
for i in xrange(500000):  # Python 2; use range() on Python 3
    if path.startswith(folder):
        a = path[len(folder):]

print("--- %s seconds ---" % (time.time() - start_time))

Result: --- 0.199596166611 seconds ---

*Even with the "if" that checks whether the path starts with the folder name, this was about twice as fast as the first method.

In summary: each method has pros and cons. If you are absolutely sure which folder you want to remove, use method 2; otherwise I recommend method 1, which others here have already suggested.
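If you want to repeat the comparison on Python 3, the two loops above can be rewritten with timeit, which is made for this kind of micro-benchmark. This is just a sketch: with_split and with_prefix are names I made up, and the numbers you get will depend on your machine and Python version.

```python
import timeit

path = "/folder1/folder2/folder3/file.zip"
folder = "/folder1"

def with_split(p):
    # Method 1: split on "/", drop the empty string and the first folder.
    return "/" + "/".join(p.split("/")[2:])

def with_prefix(p, prefix):
    # Method 2: slice off a known prefix, leaving the path unchanged otherwise.
    return p[len(prefix):] if p.startswith(prefix) else p

for label, func in (("split", lambda: with_split(path)),
                    ("prefix", lambda: with_prefix(path, folder))):
    seconds = timeit.timeit(func, number=500000)
    print("%s: %.3f seconds" % (label, seconds))
```

One thing to watch with the prefix method: "/folder1" is also a prefix of "/folder10/...", so in real code you would probably want to compare against folder + "/".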

You can try:

os.path.relpath(your_path, '/First')
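A quick illustration of what that gives you. I'm calling posixpath.relpath here (the POSIX implementation behind os.path.relpath) only so the output is the same on any platform; the paths are the ones from the question plus one made-up path that is not under /First.

```python
import posixpath

full = '/First/Second/Third/Fourth/Fifth'
print(posixpath.relpath(full, '/First'))           # Second/Third/Fourth/Fifth

# If the path is not under the start folder, relpath does not fail;
# it climbs out of it with '..' instead.
print(posixpath.relpath('/Other/File', '/First'))  # ../Other/File
```

So this only removes the first folder when you already know its name, and you may want to check the result for a leading '..'.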