Pandas DataFrame row operations with conditions

3okqufwl  posted on 2022-10-23 in Other

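I have a DataFrame with stock (inventory) information, as follows:
|Product ID|Initial stock|Initial unit cost|Reference|Quantity|Unit cost|Current stock|
| ------------ | ------------ | ------------ | ------------ | ------------ | ------------ | ------------ |
|a|5|22|Purch.|4|24|9|
|a|5|22|Purch.|8|21|17|
|a|5|22|Sale|-4|25|13|
|a|5|22|Purch.|10|20|23|
|a|5|22|Sale|-15|22|8|
|b|14|3.5|Sale|10|4|4|
|b|14|3.5|Purch.|20|3|24|
|b|14|3.5|Sale|5|4|19|
|b|14|3.5|Purch.|2|3.5|21|
|c|27|1|Purch.|100|0.95|127|
|c|27|1|Purch.|3|1.1|130|
Each row represents a purchase or sale of a particular product. Quantity is the number of units bought or sold at the given Unit cost. Current stock is the stock remaining after the purchase or sale. For each product, I want to compute the weighted average cost (WAC) after each sale or purchase. The procedure is as follows: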

  • For the first row of each product, WAC = (Initial stock * Initial unit cost + Quantity * Unit cost) / Current stock only if Reference == 'Purch.'. Otherwise, WAC = Initial unit cost.
  • For the following rows, WAC[i] = (Current stock[i-1] * WAC[i-1] + Quantity[i] * Unit cost[i]) / Current stock[i] only if Reference[i] == 'Purch.'. Otherwise, WAC[i] = WAC[i-1].
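
The two rules above can be read as a single recurrence if the initial stock and initial unit cost are treated as the state "before" the first row. Below is a minimal plain-Python sketch of that reading for one product. The function name `wac_sequence`, the tuple layout of `movements`, and the `'Sale'` label are assumptions made for illustration; only `'Purch.'` is tested, as in the rules.

```python
def wac_sequence(initial_stock, initial_unit_cost, movements):
    """Return the WAC after each movement of a single product.

    movements: (reference, quantity, unit_cost, current_stock) tuples in
    chronological order. Only 'Purch.' rows change the WAC.
    """
    wacs = []
    # State before the first row: the initial stock at the initial unit cost
    prev_stock, prev_wac = initial_stock, initial_unit_cost
    for reference, quantity, unit_cost, current_stock in movements:
        if reference == 'Purch.':
            wac = (prev_stock * prev_wac + quantity * unit_cost) / current_stock
        else:
            wac = prev_wac  # a sale leaves the WAC unchanged
        wacs.append(wac)
        prev_stock, prev_wac = current_stock, wac
    return wacs


# Product "a" from the table in the question
product_a = [
    ('Purch.', 4, 24, 9),
    ('Purch.', 8, 21, 17),
    ('Sale', -4, 25, 13),
    ('Purch.', 10, 20, 23),
    ('Sale', -15, 22, 8),
]
wac_a = wac_sequence(5, 22, product_a)
print([round(w, 2) for w in wac_a])  # [22.89, 22.0, 22.0, 21.13, 21.13]
```

Because each WAC depends on the previous one, a plain `cumsum` cannot express this directly, which is why an explicit loop per product is used here.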

The table below shows what I am looking for (the WAC column and how it is calculated):
|Product ID|Initial stock|Initial unit cost|Reference|Quantity|Unit cost|Current stock|(how to) WAC|WAC|
| ------------ | ------------ | ------------ | ------------ | ------------ | ------------ | ------------ | ------------ | ------------ |
|a|5|22|Purch.|4|24|9|(5×22 + 4×24)/9|22.89|
|a|5|22|Purch.|8|21|17|(9×22.89 + 8×21)/17|22|
|a|5|22|Sale|-4|25|13|-|22|
|a|5|22|Purch.|10|20|23|(13×22 + 10×20)/23|21.13|
|a|5|22|Sale|-15|22|8|-|21.13|
|b|14|3.5|Sale|10|4|4|-|3.5|
|b|14|3.5|Purch.|20|3|24|(4×3.5 + 20×3)/24|3.08|
|b|14|3.5|Sale|5|4|19|-|3.08|
|b|14|3.5|Purch.|2|3.5|21|(19×3.08 + 2×3.5)/21|3.12|
|c|27|1|Purch.|100|0.95|127|(27×1 + 100×0.95)/127|0.96|
|c|27|1|Purch.|3|1.1|130|(127×0.96 + 3×1.1)/130|0.96|
How would you do this with Pandas? I tried using groupby and cumsum, but I don't know how to introduce the "if" statements. After that, I would like to summarize the information and get only the Product ID with the final Current stock and WAC, like this:
|Product ID|Current stock|WAC|
| ------------ | ------------ | ------------ |
|a |8 |21.13|
|b | 21 | 3.12|
|c | 130 | 0.96|
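
Once a WAC column exists, the summary step is independent of how that column was computed. As a rough sketch, assuming the rows are already in chronological order within each product, taking the last row per group gives this table. The small frame below is hand-built from values in the question purely for illustration.

```python
import pandas as pd

# Illustrative frame: a few rows with an already computed WAC column,
# values copied from the tables in the question
df = pd.DataFrame({
    'Product ID': ['a', 'a', 'b', 'b', 'c'],
    'Current stock': [23, 8, 19, 21, 130],
    'WAC': [21.13, 21.13, 3.08, 3.12, 0.96],
})

# last() keeps the final non-null value of each column within each group,
# following the existing row order of the frame
summary = (
    df.groupby('Product ID')[['Current stock', 'WAC']]
      .last()
      .reset_index()
)
print(summary)
```

`df.groupby('Product ID').tail(1)` is a similar option that keeps the original index instead of building a new one.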
Thanks in advance!

9wbgstp7 #1

You can write a function and call it with apply on the grouped DataFrame.
I would try something like this:

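import pandas as pd

def calc_wac(df_):
  df_ = df_.copy()
  cs_wac = 0  # running total value of the stock on hand (units * WAC)
  prev_stock = None  # Current stock of the previous row in this group
  for counter, (idx, row) in enumerate(df_.iterrows()):
    if counter == 0:
      if row['Reference'] == 'Purch.':
        # First row is a purchase: initial value plus the purchased value
        cs_wac += row['Initial stock'] * row['Initial unit cost'] + row['Quantity'] * row['Unit cost']
      else:
        # First row is a sale: the remaining units stay at the initial unit cost
        cs_wac += row['Current stock'] * row['Initial unit cost']
    elif row['Reference'] == 'Purch.':
      cs_wac += row['Quantity'] * row['Unit cost']
    else:
      # A sale leaves the WAC unchanged, so the total value scales with the stock
      cs_wac *= row['Current stock'] / prev_stock
    df_.loc[idx, 'WAC'] = cs_wac / row['Current stock']
    prev_stock = row['Current stock']
  return pd.Series({'Current stock': row['Current stock'], 'WAC': cs_wac / row['Current stock']})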

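This returns the summary information when you call it like this:
df.groupby('Product ID').apply(calc_wac)
If you want the full DataFrame instead, just change the function to return the whole DataFrame: return df_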
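
For the full-DataFrame variant, one possible sketch is shown below. It iterates over the groups explicitly and concatenates the pieces, rather than relying on how `apply` reassembles per-group DataFrames. The helper name `add_wac` and the `'Sale'` label are assumptions for illustration; the sample rows are products b and c as they appear in the question.

```python
import pandas as pd

def add_wac(group):
    """Return a copy of one product's rows with a WAC column appended."""
    group = group.copy()
    first = group.iloc[0]
    # State before the first row: initial stock at the initial unit cost
    prev_stock, prev_wac = first['Initial stock'], first['Initial unit cost']
    wacs = []
    for ref, qty, cost, stock in zip(group['Reference'], group['Quantity'],
                                     group['Unit cost'], group['Current stock']):
        if ref == 'Purch.':
            prev_wac = (prev_stock * prev_wac + qty * cost) / stock
        # On a sale prev_wac is simply carried forward
        wacs.append(prev_wac)
        prev_stock = stock
    group['WAC'] = wacs
    return group


df = pd.DataFrame({
    'Product ID': ['b', 'b', 'b', 'b', 'c', 'c'],
    'Initial stock': [14, 14, 14, 14, 27, 27],
    'Initial unit cost': [3.5, 3.5, 3.5, 3.5, 1.0, 1.0],
    'Reference': ['Sale', 'Purch.', 'Sale', 'Purch.', 'Purch.', 'Purch.'],
    'Quantity': [10, 20, 5, 2, 100, 3],
    'Unit cost': [4, 3, 4, 3.5, 0.95, 1.1],
    'Current stock': [4, 24, 19, 21, 127, 130],
})

# sort=False keeps the products in their order of first appearance
pieces = [add_wac(g) for _, g in df.groupby('Product ID', sort=False)]
full = pd.concat(pieces)
print(full[['Product ID', 'Current stock', 'WAC']].round(2))
```

The result keeps the original index and row order, so the WAC column lines up with the input rows.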

5n0oy7gb #2

I hope I have understood your question correctly.
Code:


# Create a helper column with a lambda function: the WAC formula on purchase rows, None on sale rows.
# Note: this uses Initial stock and Initial unit cost on every purchase row, so it matches the
# asker's formula only for the first row of each product.

df['(how to) WAC'] = df.apply(lambda row: (row['Initial stock']*row['Initial unit cost']+row['Quantity']*row['Unit cost'])/row['Current stock'] if row['Reference']=='Purch.' else None, axis=1)

# Create the WAC column from '(how to) WAC': a None takes the value above it within the same product,
# and a None in a product's first row falls back to Initial unit cost.

df['WAC'] = df.groupby(['Product ID'])['(how to) WAC'].ffill().fillna(df['Initial unit cost'])

# Group by the ID and display the last row of each product

df.groupby('Product ID').tail(1)[['Product ID','Current stock', 'WAC']]
