How to group a DataFrame in pandas and keep the columns


Best answer

You want the following:

In [20]:
df.groupby(['Name','Type','ID']).count().reset_index()


Out[20]:
    Name   Type  ID  Count
0  Book1  ebook   1      2
1  Book2  paper   2      2
2  Book3  paper   3      1

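In your case the 'Name', 'Type' and 'ID' columns match in values (each Name always appears with the same Type and ID), so we can groupby on all three, call count, and then reset_index.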
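As a self-contained sketch of the same idea: the sample frame below is an assumption reconstructed from the output shown above, not the asker's actual data. If the frame holds only these three columns, count() has no non-key column left to tally, so this variant uses size(), which counts rows per group, and names the resulting column through reset_index(name='Count').

```python
import pandas as pd

# Hypothetical sample data, reconstructed from the output shown above
df = pd.DataFrame({
    "Name": ["Book1", "Book2", "Book3", "Book1", "Book2"],
    "Type": ["ebook", "paper", "paper", "ebook", "paper"],
    "ID": [1, 2, 3, 1, 2],
})

# size() counts the rows in each group; reset_index(name=...) turns the
# group keys back into ordinary columns and labels the counts "Count"
counts = (
    df.groupby(["Name", "Type", "ID"])
      .size()
      .reset_index(name="Count")
)
print(counts)
```

If the real frame has an extra non-key column, count() on that column works as well; size() simply does not depend on one being there.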

Another approach is to add the 'Count' column using transform and then call drop_duplicates:

In [25]:
df['Count'] = df.groupby(['Name'])['ID'].transform('count')
df.drop_duplicates()


Out[25]:
    Name   Type  ID  Count
0  Book1  ebook   1      2
1  Book2  paper   2      2
2  Book3  paper   3      1
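For anyone who wants to run the transform variant end to end, here is a minimal sketch. The sample frame is again an assumption reconstructed from the output, not the original data. transform('count') returns one value per original row, which is why the duplicate rows still have to be dropped afterwards.

```python
import pandas as pd

# Hypothetical sample data, reconstructed from the output shown above
df = pd.DataFrame({
    "Name": ["Book1", "Book2", "Book3", "Book1", "Book2"],
    "Type": ["ebook", "paper", "paper", "ebook", "paper"],
    "ID": [1, 2, 3, 1, 2],
})

# transform keeps the original row count, broadcasting each group's
# count back onto every row that belongs to that group
df["Count"] = df.groupby("Name")["ID"].transform("count")

# Rows that are now fully identical collapse to one; drop_duplicates
# returns a new frame, so the result has to be assigned
result = df.drop_duplicates().reset_index(drop=True)
print(result)
```

Note that drop_duplicates does not modify df in place, so in a script (as opposed to an interactive session) the result needs to be stored.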


I think as_index=False should do the trick.

df.groupby(['Name','Type','ID'], as_index=False).count()
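A small sketch of what as_index=False changes, under some assumptions: the sample data is made up, and "Borrower" is a hypothetical extra column added only so that count() has something to count. With as_index=False the group keys stay as regular columns, so no reset_index call is needed.

```python
import pandas as pd

# Hypothetical sample data; "Borrower" is an invented non-key column
df = pd.DataFrame({
    "Name": ["Book1", "Book2", "Book3", "Book1", "Book2"],
    "Type": ["ebook", "paper", "paper", "ebook", "paper"],
    "ID": [1, 2, 3, 1, 2],
    "Borrower": ["ann", "bob", "cat", "dan", "eve"],
})
keys = ["Name", "Type", "ID"]

# Default behaviour: keys become the index, so reset_index is required
with_reset = df.groupby(keys).count().reset_index()

# as_index=False: keys are kept as ordinary columns from the start
flat = df.groupby(keys, as_index=False).count()

# count() reports non-null values per remaining column; rename for clarity
flat = flat.rename(columns={"Borrower": "Count"})
print(flat)
```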


If you have many columns in a DataFrame, it makes sense to use df.groupby(['foo']).agg(...). The .agg() function lets you choose what to do with the columns you don't want to apply operations to. If you just want to keep them, use .agg({'col1': 'first', 'col2': 'first', ...}). Instead of 'first', you can also apply 'sum', 'mean', and others.
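To make the .agg() suggestion concrete, here is a hedged sketch. The frame and the "Minutes" column are hypothetical, chosen only to show one column being summed while the others are carried along with 'first'.

```python
import pandas as pd

# Hypothetical sample data; "Minutes" is an invented numeric column
df = pd.DataFrame({
    "Name": ["Book1", "Book2", "Book3", "Book1", "Book2"],
    "Type": ["ebook", "paper", "paper", "ebook", "paper"],
    "ID": [1, 2, 3, 1, 2],
    "Minutes": [30, 45, 20, 15, 60],
})

# 'first' keeps the first value seen in each group, which is safe when
# the column is constant within a group; "Minutes" is summed instead
summary = df.groupby("Name", as_index=False).agg(
    {"Type": "first", "ID": "first", "Minutes": "sum"}
)
print(summary)
```

Keep in mind that 'first' silently picks one value per group, so it is only a faithful way to "keep" a column when that column does not vary within the group.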