Group a dataframe and get both sum and count?

I have a dataframe that looks like this:

              Company Name              Organisation Name  Amount
10118  Vifor Pharma UK Ltd  Welsh Assoc for Gastro & Endo 2700.00
10119  Vifor Pharma UK Ltd    Welsh IBD Specialist Group,  169.00
10120  Vifor Pharma UK Ltd             West Midlands AHSN 1200.00
10121  Vifor Pharma UK Ltd           Whittington Hospital   63.00
10122  Vifor Pharma UK Ltd                 Ysbyty Gwynedd   75.93

How do I sum the Amount and count the Organisation Name, to get a new dataframe that looks like this?

              Company Name             Organisation Count   Amount
10118  Vifor Pharma UK Ltd                              5 11000.00

I know how to sum or count:

df.groupby('Company Name').sum()
df.groupby('Company Name').count()

But not how to do both!


Try this:

In [110]: (df.groupby('Company Name')
   .....:    .agg({'Organisation Name':'count', 'Amount': 'sum'})
   .....:    .reset_index()
   .....:    .rename(columns={'Organisation Name':'Organisation Count'})
   .....: )
Out[110]:
          Company Name   Amount  Organisation Count
0  Vifor Pharma UK Ltd  4207.93                   5
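For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the five sample rows from the question and applies the same chain. The variable name `result` is my own choice, and the column order in the output may differ between pandas versions.

```python
import pandas as pd

# Sample rows copied from the question, keeping the original index labels.
df = pd.DataFrame(
    {
        "Company Name": ["Vifor Pharma UK Ltd"] * 5,
        "Organisation Name": [
            "Welsh Assoc for Gastro & Endo",
            "Welsh IBD Specialist Group,",
            "West Midlands AHSN",
            "Whittington Hospital",
            "Ysbyty Gwynedd",
        ],
        "Amount": [2700.00, 169.00, 1200.00, 63.00, 75.93],
    },
    index=[10118, 10119, 10120, 10121, 10122],
)

# Count the organisations and sum the amounts per company in one pass,
# then turn the group key back into a column and rename the count column.
result = (
    df.groupby("Company Name")
    .agg({"Organisation Name": "count", "Amount": "sum"})
    .reset_index()
    .rename(columns={"Organisation Name": "Organisation Count"})
)
print(result)
```

With the sample data this should give one row with a count of 5 and an amount of 4207.93, matching the session above.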

Or, if you don't want to reset the index:

df.groupby('Company Name')['Amount'].agg(['sum','count'])

or

df.groupby('Company Name').agg({'Amount': ['sum','count']})

Demo:

In [98]: df.groupby('Company Name')['Amount'].agg(['sum','count'])
Out[98]:
                         sum  count
Company Name
Vifor Pharma UK Ltd  4207.93      5


In [99]: df.groupby('Company Name').agg({'Amount': ['sum','count']})
Out[99]:
                      Amount
                         sum count
Company Name
Vifor Pharma UK Ltd  4207.93     5
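Note that the dict-of-lists form returns two-level column labels (`Amount` on top, `sum` and `count` below). If you would rather have flat column names, one possible approach is to join the two levels. This is only a sketch: the shortened sample frame and the `Amount_sum` / `Amount_count` names are my own illustration, not part of the original answer.

```python
import pandas as pd

# Shortened stand-in for the question's data (organisation names abbreviated).
df = pd.DataFrame(
    {
        "Company Name": ["Vifor Pharma UK Ltd"] * 5,
        "Organisation Name": ["A", "B", "C", "D", "E"],
        "Amount": [2700.00, 169.00, 1200.00, 63.00, 75.93],
    }
)

res = df.groupby("Company Name").agg({"Amount": ["sum", "count"]})

# res.columns is a MultiIndex of (column, function) pairs;
# join each pair into a single flat label.
res.columns = ["_".join(col) for col in res.columns]
print(res)
```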

If you have many columns and only one of them needs a different aggregation, you can do this:

In[1]: grouper = df.groupby('Company Name')
In[2]: res = grouper.count()
In[3]: res['Amount'] = grouper.Amount.sum()
In[4]: res
Out[4]:
                     Organisation Name   Amount
Company Name
Vifor Pharma UK Ltd                  5  4207.93

Note that you can then rename the Organisation Name column as you wish.
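A rough, self-contained version of this approach with the rename included might look like the following. The shortened sample frame and the target name `Organisation Count` (taken from the question's desired output) are assumptions for illustration.

```python
import pandas as pd

# Shortened stand-in for the question's data (organisation names abbreviated).
df = pd.DataFrame(
    {
        "Company Name": ["Vifor Pharma UK Ltd"] * 5,
        "Organisation Name": ["A", "B", "C", "D", "E"],
        "Amount": [2700.00, 169.00, 1200.00, 63.00, 75.93],
    }
)

grouper = df.groupby("Company Name")

# Count every non-key column first, then overwrite the one column
# that should be summed instead of counted.
res = grouper.count()
res["Amount"] = grouper["Amount"].sum()

# Give the counted column a name that reflects what it now holds.
res = res.rename(columns={"Organisation Name": "Organisation Count"})
print(res)
```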

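# Column labels must match the frame exactly ('Organisation Name', capital N).
# After agg the columns keep their original labels, so sort on those.
df.groupby('Company Name').agg({'Organisation Name':'count','Amount':'sum'})\
    .sort_values(['Organisation Name','Amount'], ascending=False)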

Just in case you were wondering how to rename columns during aggregation, here's how for

pandas >= 0.25: Named aggregation

df.groupby('Company Name')['Amount'].agg(MySum='sum', MyCount='count')

Or,

df.groupby('Company Name').agg(MySum=('Amount', 'sum'), MyCount=('Amount', 'count'))

                       MySum  MyCount
Company Name
Vifor Pharma UK Ltd  4207.93        5
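Named aggregation can also produce the exact column names the question asked for. Since `Organisation Count` contains a space and cannot be written as a keyword argument, the sketch below unpacks a dict instead. The shortened sample frame is a stand-in of my own, and this assumes pandas >= 0.25.

```python
import pandas as pd

# Shortened stand-in for the question's data (organisation names abbreviated).
df = pd.DataFrame(
    {
        "Company Name": ["Vifor Pharma UK Ltd"] * 5,
        "Organisation Name": ["A", "B", "C", "D", "E"],
        "Amount": [2700.00, 169.00, 1200.00, 63.00, 75.93],
    }
)

# Each entry maps an output column name to an (input column, function) pair.
# Unpacking a dict allows output names that are not valid Python identifiers.
res = (
    df.groupby("Company Name")
    .agg(
        **{
            "Organisation Count": ("Organisation Name", "count"),
            "Amount": ("Amount", "sum"),
        }
    )
    .reset_index()
)
print(res)
```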