How can I strip whitespace from pandas DataFrame headers?

小开

Best answer

You can give functions to the rename method. The str.strip() method should do what you want:

In [5]: df
Out[5]:
   Year  Month   Value
0     1       2      3


[1 rows x 3 columns]


In [6]: df.rename(columns=lambda x: x.strip())
Out[6]:
   Year  Month  Value
0     1      2      3


[1 rows x 3 columns]

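Note: this returns a DataFrame object, which is shown as output on screen, but the change is not actually applied to your columns. To keep the change, either use the call in a method chain or reassign the df variable: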

df = df.rename(columns=lambda x: x.strip())
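
To make the note above concrete, here is a minimal sketch. The frame and its padded header names are made up for illustration. It shows that rename leaves the original frame alone until the result is assigned back:

```python
import pandas as pd

# Illustrative frame; the padded header names are made up for this sketch.
df = pd.DataFrame({"Year": [1], " Month ": [2], "Value ": [3]})

# rename returns a new DataFrame; df itself is unchanged.
stripped = df.rename(columns=lambda x: x.strip())
original_headers = df.columns.tolist()
print(original_headers)           # headers of df still carry the spaces
print(stripped.columns.tolist())  # headers of the returned copy are stripped

# Reassign to keep the change.
df = df.rename(columns=lambda x: x.strip())
print(df.columns.tolist())
```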

小开

Since version 0.16.1 you can just call .str.strip on the columns:

df.columns = df.columns.str.strip()

Here is a small example:

In [5]:
df = pd.DataFrame(columns=['Year', 'Month ', 'Value'])
print(df.columns.tolist())
df.columns = df.columns.str.strip()
df.columns.tolist()


['Year', 'Month ', 'Value']
Out[5]:
['Year', 'Month', 'Value']

Timings

In[26]:
df = pd.DataFrame(columns=[' year', ' month ', ' day', ' asdas ', ' asdas', 'as ', '  sa', ' asdas '])
df
Out[26]:
Empty DataFrame
Columns: [ year,  month ,  day,  asdas ,  asdas, as ,   sa,  asdas ]




%timeit df.rename(columns=lambda x: x.strip())
%timeit df.columns.str.strip()
1000 loops, best of 3: 293 µs per loop
10000 loops, best of 3: 143 µs per loop

So str.strip is about 2x faster, and I expect it to scale better for larger dfs.
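
If you want to check the scaling yourself, a rough sketch along these lines should work outside IPython. The 1,000-column frame and the repeat count are arbitrary choices for illustration, and the numbers you get will depend on your machine and pandas version:

```python
import timeit

import pandas as pd

# Hypothetical wide frame: 1,000 padded column names, no rows.
wide = pd.DataFrame(columns=[f" col{i} " for i in range(1_000)])

# Time both approaches over the same number of runs.
t_rename = timeit.timeit(
    lambda: wide.rename(columns=lambda x: x.strip()), number=10
)
t_str = timeit.timeit(lambda: wide.columns.str.strip(), number=10)
print(f"rename: {t_rename:.4f}s  columns.str.strip: {t_str:.4f}s")

# Both approaches should produce the same stripped labels.
renamed_cols = wide.rename(columns=lambda x: x.strip()).columns.tolist()
str_cols = wide.columns.str.strip().tolist()
```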

小开

If you export from Excel as CSV and read the file into a pandas DataFrame, you can specify:

skipinitialspace=True

when calling pd.read_csv.

From the documentation:

skipinitialspace : bool, default False
Skip spaces after delimiter.
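
A small sketch of the effect, using made-up CSV text read from an in-memory buffer rather than a real Excel export. Going by the wording quoted above, the option only covers spaces that follow the delimiter, so trailing whitespace in a header would presumably still need one of the strip approaches:

```python
import io

import pandas as pd

# Made-up CSV text with a space after each comma.
csv_text = "Year, Month, Value\n1, 2, 3\n"

plain = pd.read_csv(io.StringIO(csv_text))
skipped = pd.read_csv(io.StringIO(csv_text), skipinitialspace=True)

print(plain.columns.tolist())    # leading spaces kept in the headers
print(skipped.columns.tolist())  # spaces after the delimiter are skipped
```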

小开

If you are looking for a bulletproof way to do this, I would suggest:

data_frame.rename(columns=lambda x: x.strip() if isinstance(x, str) else x, inplace=True)
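
The isinstance guard matters when some column labels are not strings, since an integer label has no strip method. A minimal sketch, with made-up labels mixing strings and an integer:

```python
import pandas as pd

# Illustrative frame mixing string and integer column labels.
data_frame = pd.DataFrame([[1, 2, 3]], columns=[" name ", 2020, "value "])

# String labels are stripped; non-string labels pass through untouched.
data_frame.rename(
    columns=lambda x: x.strip() if isinstance(x, str) else x,
    inplace=True,
)
print(data_frame.columns.tolist())
```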

小开

You can actually do this with:

df.rename(str.strip, axis = 'columns')

This is mentioned in the pandas documentation.
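
As a quick sketch (made-up frame, and assuming a pandas version whose rename accepts the axis argument), passing str.strip with axis='columns' gives the same labels as passing it through the columns keyword. As with the other rename calls, the result has to be assigned to be kept:

```python
import pandas as pd

# Illustrative frame with padded, all-string header names.
df = pd.DataFrame({" Year": [1], "Month ": [2], " Value ": [3]})

# Two spellings of the same rename; both return new DataFrames.
by_axis = df.rename(str.strip, axis="columns")
by_keyword = df.rename(columns=str.strip)

print(by_axis.columns.tolist())
print(by_keyword.columns.tolist())
```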