pandas 值的分布以固定百分比为基础,并将值垂直相加

v8wbuo2f  于 11个月前  发布在  其他
关注(0)|答案(1)|浏览(92)

我已经在Excel中创建了下面的表,现在我希望在Python中复制相同的内容。

  • 已结帐目按

    栏内其后各月的固定百分率作进一步分配

我希望在python中创建类似的框架,并开始使用以下方法:

data = {
    'Assigned Accounts': [1428, 1415, 1398, 1402, 1468, 1503, 1694],
    'Month': ['Jun-22', 'Jul-22', 'Aug-22', 'Sep-22', 'Oct-22', 'Nov-22', 'Dec-22']
}

df = pd.DataFrame(data)

df['Settled Accounts'] = int(data['Assigned Accounts']*0.17)

# Fixed percentage distribution
percentages = [0.46, 0.36, 0.06, 0.03, 0.02, 0.01, 0.01]

# Creating columns for subsequent months and distributing the settled debtors
months = ['Jun-22', 'Jul-22', 'Aug-22', 'Sep-22', 'Oct-22', 'Nov-22', 'Dec-22', 'Jan-23', 'Feb-23', 'Mar-23', 'Apr-23', 'May-23', 'Jun-23']

how it could be approached further.?

字符串
我正在寻找最后的框架,使垂直列的总和如下:

Month        Total
Jan - 23     350
Feb - 23     335
Mar - 23     343
Apr - 23     341 and so on

更新问题:

对于下面的图表,方法将发生什么变化:

data = {
    'Assigned Accounts': [674, 779, 812, 735, 753, 677, 628, 589, 570, 581, 550, 595, 596, 659, 754, 731, 739, 741, 766, 796, 752, 821, 818, 868, 850, 909, 927, 915],
    'Month': ['Jun-22', 'Jul-22', 'Aug-22', 'Sep-22', 'Oct-22', 'Nov-22', 'Dec-22', 'Jan-23', 'Feb-23', 'Mar-23', 'Apr-23', 'May-23', 'Jun-23', 'Jul-23', 'Aug-23', 'Sep-23', 'Oct-23', 'Nov-23', 'Dec-23', 'Jan-24', 'Feb-24', 'Mar-24', 'Apr-24', 'May-24', 'Jun-24', 'Jul-24', 'Aug-24', 'Sep-24']
}

df = pd.DataFrame(data)

months: ['Jun-22', 'Jul-22', 'Aug-22', 'Sep-22', 'Oct-22', 'Nov-22', 'Dec-22', 'Jan-23', 'Feb-23', 'Mar-23', 'Apr-23', 'May-23', 'Jun-23', 'Jul-23', 'Aug-23', 'Sep-23', 'Oct-23', 'Nov-23', 'Dec-23', 'Jan-24', 'Feb-24', 'Mar-24', 'Apr-24', 'May-24', 'Jun-24', 'Jul-24', 'Aug-24', 'Sep-24']


其余的问题陈述将保持不变。

ss2ws0br

ss2ws0br1#

如果我理解正确的话,您希望将每一行的值乘以百分比,并每月移动这些值。
您可以使用numpybroadcasting轻松完成此操作。

  • 注意:在你的图片中,看起来你使用了~23%的因子(而不是17%),我在下面使用它来更好地比较数字。
# 23% of initial value
df['Settled Accounts'] = df['Assigned Accounts'].mul(0.23).astype(int)

# create output array
tmp = np.full((len(df), len(months)), np.nan)
# indexer for rows
a = np.arange(len(df))
# indexer for shifted values
col = (a[:,None] + np.arange(len(percentages))).ravel()
m = col<len(a)
idx = np.repeat(a, len(percentages))[m]

# broadcast the multiplication by "percent"
tmp[idx, col[m]] =  (df['Settled Accounts'].to_numpy()[:,None]
                     * np.array(percentages)
                    ).ravel()[m].astype(int)

# convert to DataFrame
tmp = pd.DataFrame(tmp, index=df.index, columns=months)
# add sum as new row
tmp.loc['sum'] = tmp.sum()

out = df.join(tmp, how='outer')

字符串
输出量:

Assigned Accounts   Month  Settled Accounts  Jun-22  Jul-22  Aug-22  Sep-22  Oct-22  Nov-22  Dec-22  Jan-23  Feb-23  Mar-23  Apr-23  May-23  Jun-23
0               1428.0  Jun-22             328.0   150.0   118.0    19.0     9.0     6.0     3.0     3.0     NaN     NaN     NaN     NaN     NaN     NaN
1               1415.0  Jul-22             325.0     NaN   149.0   117.0    19.0     9.0     6.0     3.0     3.0     NaN     NaN     NaN     NaN     NaN
2               1398.0  Aug-22             321.0     NaN     NaN   147.0   115.0    19.0     9.0     6.0     3.0     3.0     NaN     NaN     NaN     NaN
3               1402.0  Sep-22             322.0     NaN     NaN     NaN   148.0   115.0    19.0     9.0     6.0     3.0     3.0     NaN     NaN     NaN
4               1468.0  Oct-22             337.0     NaN     NaN     NaN     NaN   155.0   121.0    20.0    10.0     6.0     3.0     3.0     NaN     NaN
5               1503.0  Nov-22             345.0     NaN     NaN     NaN     NaN     NaN   158.0   124.0    20.0    10.0     6.0     3.0     3.0     NaN
6               1694.0  Dec-22             389.0     NaN     NaN     NaN     NaN     NaN     NaN   178.0   140.0    23.0    11.0     7.0     3.0     3.0
sum                NaN     NaN               NaN   150.0   267.0   283.0   291.0   304.0   316.0   343.0   182.0    45.0    23.0    13.0     6.0     3.0

相关问题