Deleting multiple columns based on column names in Pandas

I have some data and when I import it, I get the following unneeded columns. I'm looking for an easy way to delete all of these.

'Unnamed: 24', 'Unnamed: 25', 'Unnamed: 26', 'Unnamed: 27',
'Unnamed: 28', 'Unnamed: 29', 'Unnamed: 30', 'Unnamed: 31',
'Unnamed: 32', 'Unnamed: 33', 'Unnamed: 34', 'Unnamed: 35',
'Unnamed: 36', 'Unnamed: 37', 'Unnamed: 38', 'Unnamed: 39',
'Unnamed: 40', 'Unnamed: 41', 'Unnamed: 42', 'Unnamed: 43',
'Unnamed: 44', 'Unnamed: 45', 'Unnamed: 46', 'Unnamed: 47',
'Unnamed: 48', 'Unnamed: 49', 'Unnamed: 50', 'Unnamed: 51',
'Unnamed: 52', 'Unnamed: 53', 'Unnamed: 54', 'Unnamed: 55',
'Unnamed: 56', 'Unnamed: 57', 'Unnamed: 58', 'Unnamed: 59',
'Unnamed: 60'

The columns are 0-indexed, so I tried something like:

df.drop(df.columns[[22, 23, 24, 25,
26, 27, 28, 29, 30, 31, 32, 55]], axis=1, inplace=True)

But this isn't very efficient. I tried writing some for loops, but this struck me as bad Pandas behaviour. Hence I ask the question here.

I've seen some similar examples (Drop multiple columns in pandas), but they don't answer my question.


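I don't know what you mean by inefficient, but if you mean in terms of typing, it could be easier to just select the columns of interest and assign back to the df: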

df = df[cols_of_interest]

where cols_of_interest is a list of the columns you care about.
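As a minimal sketch of that selection, assuming a small made-up frame (the column names and values here are only for illustration):

```python
import pandas as pd

# Hypothetical frame: two real columns plus two unwanted "Unnamed" ones.
df = pd.DataFrame(
    [[1, 2, None, None]],
    columns=["a", "b", "Unnamed: 2", "Unnamed: 3"],
)

cols_of_interest = ["a", "b"]   # the columns to keep
kept = df[cols_of_interest]     # indexing with a list returns a new DataFrame

print(list(kept.columns))
```

This is handy when the list of columns to keep is shorter than the list of columns to drop.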

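Or you can slice the columns and pass the result to drop:

df.drop(df.loc[:, 'Unnamed: 24':'Unnamed: 60'].head(0).columns, axis=1)

The call to head just selects 0 rows, as we're only interested in the column names rather than the data.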
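Here is a small sketch of the label-slice idea using .loc (the .ix indexer was removed in pandas 1.0). The frame is made up for illustration. Note that a label slice includes both endpoints, and as far as I can tell taking .columns directly works without the head(0) call:

```python
import pandas as pd

# Hypothetical frame with a block of unwanted columns in the middle.
df = pd.DataFrame(
    [[1, 0, 0, 0, 2]],
    columns=["a", "Unnamed: 24", "Unnamed: 25", "Unnamed: 26", "b"],
)

# Label slices are inclusive on both ends, so this picks all three "Unnamed" columns.
to_drop = df.loc[:, "Unnamed: 24":"Unnamed: 26"].columns

result = df.drop(to_drop, axis=1)   # returns a new frame; df itself is unchanged
print(list(result.columns))
```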

Update

Another method: it would be simpler to use the boolean mask from str.contains and invert it to mask the columns:

In [2]:
df = pd.DataFrame(columns=['a','Unnamed: 1', 'Unnamed: 1','foo'])
df


Out[2]:
Empty DataFrame
Columns: [a, Unnamed: 1, Unnamed: 1, foo]
Index: []


In [4]:
~df.columns.str.contains('Unnamed:')


Out[4]:
array([ True, False, False,  True], dtype=bool)


In [5]:
df[df.columns[~df.columns.str.contains('Unnamed:')]]


Out[5]:
Empty DataFrame
Columns: [a, foo]
Index: []

This is probably a good way to do what you want. It will delete all columns that contain 'Unnamed' in their header.

for col in df.columns:
    if 'Unnamed' in col:
        del df[col]

The following worked for me:

for col in df:
    if 'Unnamed' in col:
        # del df[col]
        print(col)
        try:
            df.drop(col, axis=1, inplace=True)
        except Exception:
            pass

By far the simplest approach is:

yourdf.drop(['columnheading1', 'columnheading2'], axis=1, inplace=True)
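A hedged sketch of the same call on a made-up frame, using the columns= keyword instead of axis=1 and showing errors="ignore", which skips labels that are missing instead of raising a KeyError:

```python
import pandas as pd

# Hypothetical frame; the column names mirror the placeholders in the answer.
yourdf = pd.DataFrame({"a": [1], "columnheading1": [2], "columnheading2": [3]})

# Non-mutating variant: returns a new frame and leaves yourdf untouched.
trimmed = yourdf.drop(columns=["columnheading1", "columnheading2"])

# errors="ignore" silently skips labels that are not present.
safe = yourdf.drop(columns=["columnheading1", "not_there"], errors="ignore")

print(list(trimmed.columns), list(safe.columns))
```

Without inplace=True you need to assign the result back, which some people find easier to reason about.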

You can do this in one line:

df.drop([col for col in df.columns if "Unnamed" in col], axis=1, inplace=True)

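This involves less moving around and copying of the object than the solutions above.

My personal favorite, and simpler than the answers I have seen here (for multiple columns):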

df.drop(df.columns[22:56], axis=1, inplace=True)
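One thing to keep in mind: positional slices exclude the end, so df.columns[22:56] covers positions 22 through 55. A scaled-down sketch with six made-up columns:

```python
import pandas as pd

# Hypothetical frame with columns c0 .. c5.
df = pd.DataFrame([[0, 1, 2, 3, 4, 5]], columns=[f"c{i}" for i in range(6)])

# Drops positions 2, 3 and 4; position 5 is NOT included in the slice.
df.drop(df.columns[2:5], axis=1, inplace=True)

print(list(df.columns))
```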

Not sure if this solution has been mentioned anywhere yet, but one way to do it is with pandas.Index.difference:

>>> df = pd.DataFrame(columns=['A','B','C','D'])
>>> df
Empty DataFrame
Columns: [A, B, C, D]
Index: []
>>> to_remove = ['A','C']
>>> df = df[df.columns.difference(to_remove)]
>>> df
Empty DataFrame
Columns: [B, D]
Index: []
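One caveat worth checking before relying on this: with its default settings, Index.difference returns the remaining labels sorted, so the original column order can change. A small sketch with deliberately unsorted, made-up columns, next to a list comprehension that keeps the original order:

```python
import pandas as pd

# Hypothetical frame whose columns are not in alphabetical order.
df = pd.DataFrame(columns=["D", "B", "A", "C"])
to_remove = ["A", "C"]

# Default behaviour: the resulting labels come back sorted.
via_difference = list(df.columns.difference(to_remove))

# Order-preserving alternative.
via_comprehension = [c for c in df.columns if c not in to_remove]

print(via_difference, via_comprehension)
```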

df = df[[col for col in df.columns if 'Unnamed' not in col]]

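Simply pass the column names as a list and specify the axis as 0 or 1:

  • axis=1: the labels refer to columns
  • axis=0: the labels refer to rows (the index)
  • By default, axis=0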

    data.drop(["Colname1","Colname2","Colname3","Colname4"],axis=1)
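To make the axis argument concrete, here is a small sketch on a made-up frame (the row labels r0 and r1 are assumptions for illustration). With the default axis=0, a column name is looked up in the row index and raises a KeyError:

```python
import pandas as pd

# Hypothetical frame with named rows so both axes can be demonstrated.
data = pd.DataFrame(
    {"Colname1": [1, 2], "Colname2": [3, 4]},
    index=["r0", "r1"],
)

without_col = data.drop(["Colname1"], axis=1)   # drops a column
without_row = data.drop(["r0"], axis=0)         # drops a row

# Default axis=0: "Colname1" is not a row label, so this raises KeyError.
try:
    data.drop(["Colname1"])
    raised = False
except KeyError:
    raised = True

print(list(without_col.columns), list(without_row.index), raised)
```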

To delete all the columns after the 22nd:

df.drop(columns=df.columns[22:]) # love it

You can drop all columns that start with 'Unnamed':

df.loc[:, ~df.columns.str.startswith('Unnamed')]
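Note that this expression returns a new frame, so assign it back if you want to keep the result. A short sketch on made-up columns, which also shows how startswith differs from contains when 'Unnamed' appears in the middle of a name:

```python
import pandas as pd

# Hypothetical columns; "Not Unnamed" contains the word but does not start with it.
df = pd.DataFrame(columns=["a", "Unnamed: 1", "Not Unnamed", "Unnamed: 2"])

# Keep every column whose name does not start with "Unnamed".
by_prefix = df.loc[:, ~df.columns.str.startswith("Unnamed")]

# str.contains is broader and also removes "Not Unnamed".
by_substring = df.loc[:, ~df.columns.str.contains("Unnamed")]

print(list(by_prefix.columns), list(by_substring.columns))
```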