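Divide multiple columns by another column in pandas

I need to divide every column in a DataFrame except the first by the first column.

This is what I'm doing, but I don't know whether it's the "right" pandas way: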

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.rand(10,3), columns=list('ABC'))


df[['B', 'C']] = (df.T.iloc[1:] / df.T.iloc[0]).T

Is there a way to do something like df[['B','C']] / df['A']? (That just gives a 10x12 DataFrame of NaN.)

Also, after reading some similar questions on SO, I tried df['A'].div(df[['B', 'C']]), but it gives a broadcasting error.


I believe df[['B','C']].div(df.A, axis=0) and df.iloc[:,1:].div(df.A, axis=0) work.
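A minimal sketch to check that both spellings agree, assuming a small random frame like the one in the question (the seed, the +1.0 offset that keeps the divisor away from zero, and the variable names are only for illustration):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
# Values in [1, 2) so that no divisor is zero (illustrative data only).
df = pd.DataFrame(rng.random((10, 3)) + 1.0, columns=list("ABC"))

# axis=0 matches the Series df["A"] against the row index
# instead of against the column labels.
by_name = df[["B", "C"]].div(df["A"], axis=0)
by_position = df.iloc[:, 1:].div(df["A"], axis=0)

# Reference result computed with plain NumPy broadcasting.
expected = df[["B", "C"]].to_numpy() / df["A"].to_numpy()[:, None]

print(np.allclose(by_name.to_numpy(), expected))
print(by_name.equals(by_position))
```

Selecting by name is probably the safer choice when the column order might change; the iloc form is handy when there are many columns after the first.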

To do it in place: df.iloc[:,1:] = df.iloc[:,1:].div(df.A, axis=0)

This divides every column except the first by column 'A'.

The result is the first column unchanged, followed by each remaining column divided by the divisor column.
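As a rough check of the in-place version, here is a sketch that keeps a copy of the original frame for comparison (the data, the seed and the name `original` are assumptions made for the example):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
# Values in [1, 2) so that no divisor is zero (illustrative data only).
df = pd.DataFrame(rng.random((10, 3)) + 1.0, columns=list("ABC"))
original = df.copy()

# Overwrite every column after the first with its ratio to column A.
df.iloc[:, 1:] = df.iloc[:, 1:].div(df["A"], axis=0)

# Column A should be untouched; B and C should now hold B/A and C/A.
print(df["A"].equals(original["A"]))
print(np.allclose(df["B"], original["B"] / original["A"]))
print(np.allclose(df["C"], original["C"] / original["A"]))
```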

The "/" operator here is element-wise division, not matrix multiplication. When you divide a DataFrame by a Series, pandas by default aligns the Series index with the DataFrame's columns, so the labels (and therefore the shapes) along that axis need to match.

e.g.

df['A'].shape --> (10,)
df[['B','C']].shape --> (10,2)

Transposing turns the row labels into columns, so the two line up as (2,10) against (10,):
df[['B','C']].T.shape, df['A'].shape -->((2, 10), (10,))

But then your resulting matrix is: ( df[['B','C']].T / df['A'] ).shape --> (2,10)

Therefore:

( df[['B','C']].T / df['A'] ).T

Shape is (10,2). It gives you the results that you wanted!
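To see the difference side by side, here is a small sketch (seed, data and variable names are illustrative) that runs the plain division from the question, then the transposed version, and compares the latter with div(..., axis=0):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(2)
# Values in [1, 2) so that no divisor is zero (illustrative data only).
df = pd.DataFrame(rng.random((10, 3)) + 1.0, columns=list("ABC"))

# Plain division: the Series index (0..9) is matched against the column
# labels (B, C). Nothing matches, so the result carries the union of both
# label sets as columns and contains only NaN.
naive = df[["B", "C"]] / df["A"]
print(naive.shape)
print(naive.isna().all().all())

# After transposing, the columns are 0..9 and line up with the index of df["A"].
via_transpose = (df[["B", "C"]].T / df["A"]).T
via_div = df[["B", "C"]].div(df["A"], axis=0)
print(via_transpose.shape)
print(np.allclose(via_transpose.to_numpy(), via_div.to_numpy()))
```

The double transpose gives the same numbers, but div with axis=0 avoids the two extra copies and states the intent more directly. Depending on the pandas version, the plain division may also emit a warning about mixing string and integer labels.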