How to group by a specific row in a pandas DataFrame and run code for each group before moving on to the next group

Posted by tzcvj98z on 2023-04-28 in Other

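Hi, I am trying to take an Excel worksheet, group its rows by record ID, run a specific piece of code for that record ID, and then move on to the next record ID and run the same code.
In the example image I attached, there is a column named RecordID, which groups the rows according to Quantity Difference.

I seem to be having trouble uploading the image, so I have written the example out as a table: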

RecordID | securityAmount | Quantity Difference
25       | 7595.255       | 84585.651
25       | 7620.256       | 84585.651
24       | 22.46          | 75.049
24       | 52.589         | 75.049

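The goal of this project is to run a piece of code that tells me whether any combination of values in the securityAmount column sums to the Quantity Difference.
The code I use for this is: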

import itertools

quantityDifference = 9751.511
securityAmount = [3584.091,
22.543,
17941.162,
16838.976,
9746.422,
21123.342
]
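# Try every combination size, largest first; `numbers` was undefined, so use len(securityAmount).
result = [seq for i in range(len(securityAmount), 0, -1) for seq in itertools.combinations(securityAmount, i) if round(sum(seq), 3) == quantityDifference]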
print(result)

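If I hard-code the numbers, this code works as intended, but I obviously want it to run dynamically, so I would like to replace the inputs with the contents of the Excel worksheet.
Here is the code I have so far, given my limited knowledge of pandas: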

import numpy as np
import pandas as pd

data = pd.read_excel(r'\\my\file\.xlsx')
df = pd.DataFrame(data, columns=['securityAmount', 'Quantity Difference', 'RecordID'])

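So far this sets up what I want. Now I need help with the following logic (see the sample above for reference as I walk through it).
I want to run the code listed above for each group, based on RecordID.
Also note that I only need one Quantity Difference per group, because the value is identical within each group; only the securityAmount values differ.
For example, the first sequence it should process is RecordID 25, and the goal is to see whether any combination of the numbers equals the Quantity Difference. (The number of securityAmount values can vary: it could be 10 numbers, two numbers, and so on, but the code above handles that.)
In RecordID 25 there is no possible combination, so the result would be []. For RecordID 24, the two numbers add up to the Quantity Difference, so I would expect the output to be [22.46, 52.589]. No larger combination appears here, but if there were 10 numbers and three of them formed a combination, it would return [x, y, z].
Can someone help guide me on how to make my code do this for each individual group and, once done, move on to the next group until it has gone through the whole worksheet?
Thanks, everyone!
Expected result: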

securityAmount | Quantity Difference | RecordID | Combinations
50             | 120                 | 1        | [50,30,40]
27             | 120                 | 1        | [50,30,40]
30             | 120                 | 1        | [50,30,40]
40             | 120                 | 1        | [50,30,40]
98             | 50                  | 2        | []
300            | 50                  | 2        | []
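
The expected result above can be checked by hand with a small helper. This is only a sketch: the function name `find_combinations` and its `ndigits` parameter are made up for illustration and do not appear in the original post. It wraps the same `itertools.combinations` search used earlier and rounds both sides before comparing.

```python
import itertools


def find_combinations(amounts, target, ndigits=3):
    """Return every combination of `amounts` whose rounded sum equals `target`.

    Larger combinations are listed first, mirroring the original one-liner.
    """
    return [
        seq
        for size in range(len(amounts), 0, -1)
        for seq in itertools.combinations(amounts, size)
        if round(sum(seq), ndigits) == round(target, ndigits)
    ]


# Values taken from the expected-result table.
group_1 = find_combinations([50, 27, 30, 40], 120)
group_2 = find_combinations([98, 300], 50)
print(group_1)  # [(50, 30, 40)]
print(group_2)  # []
```

Note that each match comes back as a tuple inside a list, rather than the flat list shown in the table. The search tries every non-empty subset, so it examines 2^n - 1 combinations for a group of n values and will become slow for large groups.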

pandas

Source: https://stackoverflow.com/questions/76095399/how-would-i-group-by-a-specific-row-in-a-pandas-dataframe-and-run-a-code-for-eac

1 Answer

juzqafwq1#

There may be a better way to do this, but I hope this helps:

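import itertools
import pandas as pd

data = pd.DataFrame({"RecordID": [1, 1, 1, 1, 2, 2], "securityAmount": [50, 27, 30, 40, 98, 300], "Quantity Difference": [120, 120, 120, 120, 50, 50]})
combinations = []
# Visit every row and select the group (RecordID) that the row belongs to.
for record_id in data.RecordID:
  temp = data.loc[data["RecordID"] == record_id]
  # The Quantity Difference is identical within a group, so take the first value.
  quantityDifference = list(temp["Quantity Difference"])[0]
  securityAmount = list(temp["securityAmount"])
  # Try every combination size, largest first, and keep those that sum to the target.
  result = [seq for size in range(len(securityAmount), 0, -1) for seq in itertools.combinations(securityAmount, size) if round(sum(seq), 3) == quantityDifference]
  combinations.append(result)
  print(result)
# One result per row, so the list lines up with the DataFrame index.
data["Combinations"] = combinations
print(data)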
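
The loop above visits every row, so the same search is repeated for each row of a group. A possible variation, sketched here under the assumption that `DataFrame.groupby` is acceptable for this task, runs the search once per RecordID and then copies each group's result back onto its rows. The helper name `find_combinations` and the `per_group` dictionary are illustrative and not part of the answer.

```python
import itertools
import pandas as pd

data = pd.DataFrame({
    "RecordID": [1, 1, 1, 1, 2, 2],
    "securityAmount": [50, 27, 30, 40, 98, 300],
    "Quantity Difference": [120, 120, 120, 120, 50, 50],
})


def find_combinations(group):
    """Return the securityAmount combinations in `group` that sum to its Quantity Difference."""
    # The target is the same on every row of the group, so the first row is enough.
    target = group["Quantity Difference"].iloc[0]
    amounts = group["securityAmount"].tolist()
    return [
        seq
        for size in range(len(amounts), 0, -1)
        for seq in itertools.combinations(amounts, size)
        if round(sum(seq), 3) == target
    ]


# One search per RecordID instead of one per row.
per_group = {
    record_id: find_combinations(group)
    for record_id, group in data.groupby("RecordID")
}

# Look up each row's group result so the new column has one entry per row.
data["Combinations"] = [per_group[record_id] for record_id in data["RecordID"]]
print(data)
```

The resulting Combinations column should hold the same values as the row-by-row loop; the difference is only that the combination search runs twice here (once per group) rather than six times.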

Using the column headers (looked up by position) in the code:

import itertools
import pandas as pd
data = pd.DataFrame({"RecordID": [1, 1, 1, 1, 2, 2], "securityAmount": [50, 27, 30, 40, 98, 300],"Quantity Difference": [120, 120, 120, 120, 50, 50]})
combinations = []
# Column names are looked up by position: 0 = RecordID, 1 = securityAmount, 2 = Quantity Difference.
columns = list(data.columns)
for record_id in data[columns[0]]:
  temp = data.loc[data[columns[0]] == record_id]
  quantityDifference = list(temp[columns[2]])[0]
  securityAmount = list(temp[columns[1]])
  # Try every combination size, largest first, and keep those that sum to the target.
  result = [seq for size in range(len(securityAmount), 0, -1) for seq in itertools.combinations(securityAmount, size) if round(sum(seq), 3) == quantityDifference]
  combinations.append(result)
  print(result)
data["Combinations"] = combinations
print(data)

Output:

[(50, 30, 40)]
[(50, 30, 40)]
[(50, 30, 40)]
[(50, 30, 40)]
[]
[]
   RecordID  securityAmount  Quantity Difference    Combinations
0         1              50                  120  [(50, 30, 40)]
1         1              27                  120  [(50, 30, 40)]
2         1              30                  120  [(50, 30, 40)]
3         1              40                  120  [(50, 30, 40)]
4         2              98                   50              []
5         2             300                   50              []
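
For the decimal values in the question's own sample table (RecordID 25 and 24), comparing floating-point sums exactly can be fragile. The sketch below swaps the `round(...) ==` check for `math.isclose` with an absolute tolerance; that substitution, the tolerance of 0.0005, and the `results` dictionary are assumptions made for illustration, not something taken from the question or the answer. `sort=False` keeps the groups in the order in which they appear in the sheet.

```python
import itertools
import math
import pandas as pd

# Sample rows copied from the table in the question.
df = pd.DataFrame({
    "RecordID": [25, 25, 24, 24],
    "securityAmount": [7595.255, 7620.256, 22.46, 52.589],
    "Quantity Difference": [84585.651, 84585.651, 75.049, 75.049],
})

results = {}
for record_id, group in df.groupby("RecordID", sort=False):
    # One target per group, since it is identical on every row of the group.
    target = float(group["Quantity Difference"].iloc[0])
    amounts = group["securityAmount"].tolist()
    results[int(record_id)] = [
        seq
        for size in range(len(amounts), 0, -1)
        for seq in itertools.combinations(amounts, size)
        # Assumed tolerance: half of the third decimal place.
        if math.isclose(sum(seq), target, abs_tol=0.0005)
    ]

for record_id, combos in results.items():
    print(record_id, combos)
```

For these sample rows, RecordID 25 yields an empty list and RecordID 24 yields the pair (22.46, 52.589), which matches what the question expects.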
