How to simply add a column level to a pandas dataframe

Let's say I have a DataFrame that looks like this:

df = pd.DataFrame(index=list('abcde'), data={'A': range(5), 'B': range(5)})
df
Out[92]:
   A  B
a  0  0
b  1  1
c  2  2
d  3  3
e  4  4

Assuming that this DataFrame already exists, how can I simply add a level 'C' to the column index so that I get this:

 df
Out[92]:
   A  B
   C  C
a  0  0
b  1  1
c  2  2
d  3  3
e  4  4

I saw SO answers like "python/pandas: how to combine two dataframes into one with hierarchical column index?", but that concatenates different DataFrames instead of adding a column level to an already existing DataFrame.


As suggested by @StevenG himself, a better answer:

df.columns = pd.MultiIndex.from_product([df.columns, ['C']])


print(df)
#    A  B
#    C  C
# a  0  0
# b  1  1
# c  2  2
# d  3  3
# e  4  4

option 1
set_index and T

df.T.set_index(np.repeat('C', df.shape[1]), append=True).T

option 2
pd.concat, keys, and swaplevel

pd.concat([df], axis=1, keys=['C']).swaplevel(0, 1, 1)
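As a quick sanity check, here is a minimal sketch that runs both options on the DataFrame from the question and compares them with the `from_product` columns. The variable names `expected`, `opt1` and `opt2` are only for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(index=list('abcde'), data={'A': range(5), 'B': range(5)})

# Reference: build the two-level columns explicitly.
expected = pd.MultiIndex.from_product([df.columns, ['C']])

# Option 1: transpose, append a constant index level, transpose back.
opt1 = df.T.set_index(np.repeat('C', df.shape[1]), append=True).T

# Option 2: concat puts 'C' on the outer level; swaplevel moves it below.
opt2 = pd.concat([df], axis=1, keys=['C']).swaplevel(0, 1, axis=1)

print(opt1.columns.tolist())
print(opt2.columns.tolist())
```

Both should end up with the same column tuples as `expected`. Neither option modifies `df` itself, so the result has to be assigned back if you want to keep it.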


Another way, for columns that are already a MultiIndex (inserting 'E' as a new middle level):

df.columns = pd.MultiIndex.from_tuples(map(lambda x: (x[0], 'E', x[1]), df.columns))


   A  B
   E  E
   C  D
a  0  0
b  1  1
c  2  2
d  3  3
e  4  4
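The snippet above does not show its starting DataFrame, so here is a self-contained sketch. It assumes the columns start as the two-level tuples `('A', 'C')` and `('B', 'D')`, which is what the printed output suggests, and it uses a list comprehension instead of `map`.

```python
import pandas as pd

# Assumed starting point: two-level columns ('A', 'C') and ('B', 'D').
df = pd.DataFrame(
    index=list('abcde'),
    data={('A', 'C'): range(5), ('B', 'D'): range(5)},
)

# Insert 'E' between the two existing levels of every column tuple.
df.columns = pd.MultiIndex.from_tuples(
    [(top, 'E', bottom) for top, bottom in df.columns]
)

print(df.columns.tolist())
```

The tuple unpacking only works for exactly two existing levels; for a different depth the tuple construction would have to be adapted.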

A solution which adds a name to the new level and is easier on the eyes than other answers already presented:

df['newlevel'] = 'C'
df = df.set_index('newlevel', append=True).unstack('newlevel')


print(df)
#           A  B
# newlevel  C  C
# a         0  0
# b         1  1
# c         2  2
# d         3  3
# e         4  4
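If the name is the main goal, a possible alternative is the `names` argument of `pd.MultiIndex.from_product`. The sketch below assumes the original top level should stay unnamed (`None`) and works on a copy called `named`, a name chosen here only for illustration.

```python
import pandas as pd

df = pd.DataFrame(index=list('abcde'), data={'A': range(5), 'B': range(5)})

# Work on a copy so the original single-level columns stay untouched.
named = df.copy()

# names has one entry per level: None for the old level, 'newlevel' for the new one.
named.columns = pd.MultiIndex.from_product(
    [df.columns, ['C']], names=[None, 'newlevel']
)

print(named.columns.names)
```

This avoids the temporary helper column, at the cost of spelling out the MultiIndex.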

You could just assign the columns like:

>>> df.columns = [df.columns, ['C', 'C']]
>>> df
   A  B
   C  C
a  0  0
b  1  1
c  2  2
d  3  3
e  4  4
>>>

Or for unknown length of columns:

>>> df.columns = [df.columns.get_level_values(0), np.repeat('C', df.shape[1])]
>>> df
   A  B
   C  C
a  0  0
b  1  1
c  2  2
d  3  3
e  4  4
>>>

I like it explicit (using MultiIndex) and chain-friendly (.set_axis):

df.set_axis(pd.MultiIndex.from_product([df.columns, ['C']]), axis=1)

This is particularly convenient when merging DataFrames with different column level numbers, where Pandas (1.4.2) raises a FutureWarning (FutureWarning: merging between different levels is deprecated and will be removed ... ):

import pandas as pd


df1 = pd.DataFrame(index=list('abcde'), data={'A': range(5), 'B': range(5)})
df2 = pd.DataFrame(index=list('abcde'), data=range(10, 15), columns=pd.MultiIndex.from_tuples([("C", "x")]))


# df1:
   A  B
a  0  0
b  1  1


# df2:
    C
    x
a  10
b  11


# merge while giving df1 another column level:
pd.merge(df1.set_axis(pd.MultiIndex.from_product([df1.columns, ['']]), axis=1),
         df2,
         left_index=True, right_index=True)


# result:
   A  B   C
          x
a  0  0  10
b  1  1  11
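To check the merge end to end, here is a runnable sketch of the same idea. It builds `df2` from a dict with a tuple key rather than from `columns=`, which is just a choice made for this sketch, and `padded` and `result` are illustrative names.

```python
import pandas as pd

df1 = pd.DataFrame(index=list('abcde'), data={'A': range(5), 'B': range(5)})
# A tuple key gives df2 two-level columns: ('C', 'x').
df2 = pd.DataFrame(index=list('abcde'), data={('C', 'x'): range(10, 15)})

# Pad df1 with an empty second level so both frames have two column levels.
padded = df1.set_axis(pd.MultiIndex.from_product([df1.columns, ['']]), axis=1)

result = pd.merge(padded, df2, left_index=True, right_index=True)

print(result.columns.tolist())
```

Because `set_axis` returns a new DataFrame, `df1` keeps its single-level columns and only the merged result carries the padded level.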




Another method, but using a list comprehension of tuples as the arg to pandas.MultiIndex.from_tuples():

df.columns = pd.MultiIndex.from_tuples([(col, 'C') for col in df.columns])


df
   A  B
   C  C
a  0  0
b  1  1
c  2  2
d  3  3
e  4  4