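Given the following DataFrame and pivot table:

    import pandas as pd

    df = pd.DataFrame({'A': ['x', 'y', 'z', 'x', 'y', 'z'],
                       'B': ['one', 'one', 'one', 'two', 'two', 'two'],
                       'C': [2, 18, 2, 8, 2, 18]})
    df

       A    B   C
    0  x  one   2
    1  y  one  18
    2  z  one   2
    3  x  two   8
    4  y  two   2
    5  z  two  18

    table = pd.pivot_table(df, index=['A', 'B'], aggfunc='sum')

            C
    A B
    x one   2
      two   8
    y one  18
      two   2
    z one   2
      two  18

I would like to add 2 columns to this pivot table: one showing the percent of all values, and another showing the percent within each group of column A, like so:

            C  % of Total  % of B
    A B
    x one   2          4%     20%
      two   8         16%     80%
    y one  18         36%     90%
      two   2          4%     10%
    z one   2          4%     10%
      two  18         36%     90%

Extra credit: I would also like a summary row at the bottom containing the sum of column C (it is fine if it also shows 100% for the next 2 columns, but that is not required).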
You can use:

    table['% of Total'] = (table.C / table.C.sum() * 100).astype(str) + '%'
    table['% of B'] = (table.C / table.groupby(level=0).C.transform('sum') * 100).astype(str) + '%'
    print(table)

            C % of Total % of B
    A B
    x one   2       4.0%  20.0%
      two   8      16.0%  80.0%
    y one  18      36.0%  90.0%
      two   2       4.0%  10.0%
    z one   2       4.0%  10.0%
      two  18      36.0%  90.0%

But with real data, I think casting to int is not recommended; it is better to round the values.

Extra credit:

    table['% of Total'] = (table.C / table.C.sum() * 100)
    table['% of B'] = (table.C / table.groupby(level=0).C.transform('sum') * 100)
    table.loc['total', :] = table.sum().values
    print(table)

                  C  % of Total  % of B
    A     B
    x     one   2.0         4.0    20.0
          two   8.0        16.0    80.0
    y     one  18.0        36.0    90.0
          two   2.0         4.0    10.0
    z     one   2.0         4.0    10.0
          two  18.0        36.0    90.0
    total      50.0       100.0   300.0
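As a rough sketch of the rounding suggestion, the percentages can be kept numeric, rounded, and only turned into strings for display. The names `pct`, `report`, `display` and `format_pct` are illustrative and not part of the original answer. Here the summary row is appended with `pd.concat`, which is one possible alternative to the `loc` assignment; its `% of B` cell is left empty, since a sum of per-group percentages (300) is not meaningful.

```python
import pandas as pd

df = pd.DataFrame({
    'A': ['x', 'y', 'z', 'x', 'y', 'z'],
    'B': ['one', 'one', 'one', 'two', 'two', 'two'],
    'C': [2, 18, 2, 8, 2, 18],
})

table = pd.pivot_table(df, index=['A', 'B'], values='C', aggfunc='sum')

# Keep the percentages numeric and round them instead of casting to int.
pct = pd.DataFrame({
    '% of Total': (table['C'] / table['C'].sum() * 100).round(1),
    '% of B': (table['C']
               / table.groupby(level='A')['C'].transform('sum') * 100).round(1),
})
report = table.join(pct)

# Summary row: sum of C and 100 for '% of Total'; '% of B' becomes NaN.
total = pd.DataFrame(
    {'C': [table['C'].sum()], '% of Total': [100.0]},
    index=pd.MultiIndex.from_tuples([('total', '')], names=['A', 'B']),
)
report = pd.concat([report, total])


def format_pct(value):
    """Render a numeric percentage as text; missing values become ''."""
    return '' if pd.isna(value) else f'{value:.0f}%'


# Format a copy for display so `report` stays numeric.
display = report.copy()
for col in ['% of Total', '% of B']:
    display[col] = display[col].map(format_pct)

print(display)
```

Keeping `report` numeric means the values can still be sorted, filtered, or aggregated later, while `display` is only used for presentation.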