Join two Pandas DataFrames with a new column containing the combined match results

gfttwv5a  posted on 2022-12-09  in Other

Apologies if this has been answered already, but I wasn't able to find a similar post.
I've got two Pandas dataframes that I'd like to merge. Dataframe1 contains data which has failed validation. Dataframe2 contains the detail for each row where the errors have occurred (ErrorColumn).
As you can see in Dataframe2, there can be multiple errors for a single row. I need to consolidate the errors, then append them as a new column (ErrorColumn) in Dataframe1.
Example below:

Dataframe 1:

| ErrorRow | MaterialID | Description | UnitCost | Quantity | Critical | Location |
| ------------ | ------------ | ------------ | ------------ | ------------ | ------------ | ------------ |
| 3 | nan | Part 1 | nan | 100 | false | West |
| 4 | nan | Part 2 | 12 | nan | true | East |
| 7 | 56779 | Part 3 | 25 | nan | false | West |

Dataframe 2:

| ErrorRow | ErrorColumn |
| ------------ | ------------ |
| 3 | MaterialID |
| 3 | UnitCost |
| 4 | MaterialID |
| 4 | Quantity |
| 7 | Quantity |

Result:

| ErrorRow | MaterialID | Description | UnitCost | Quantity | Critical | Location | ErrorColumn |
| ------------ | ------------ | ------------ | ------------ | ------------ | ------------ | ------------ | ------------ |
| 3 | nan | Part 1 | nan | 100 | false | West | MaterialID, UnitCost |
| 4 | nan | Part 2 | 12 | nan | true | East | MaterialID, Quantity |
| 7 | 56779 | Part 3 | 25 | nan | false | West | Quantity |

Any assistance is appreciated. I'm new to Python, so there's likely a simple solution that I have yet to find or learn.

3htmauhk  #1

You can use pandas.DataFrame.merge together with GroupBy.agg:

out = df1.merge(df2.groupby("ErrorRow", as_index=False).agg(", ".join), on="ErrorRow")
# Alternatively, if a set of column names is needed, use GroupBy.agg(set)
print(out.to_string())
# Output:
   ErrorRow  MaterialID Description  UnitCost  Quantity  Critical Location           ErrorColumn
0         3         NaN      Part 1       NaN     100.0     False     West  MaterialID, UnitCost
1         4         NaN      Part 2      12.0       NaN      True     East  MaterialID, Quantity
2         7     56779.0      Part 3      25.0       NaN     False     West              Quantity
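
For anyone who wants to reproduce this end to end, here is a minimal sketch that rebuilds both frames from the tables in the question and applies the same groupby-then-merge approach. The variable name `errors` and the `how="left"` argument are my own additions, not part of the answer above: `how="left"` is only an assumption that you would want to keep rows of `df1` that have no entry in `df2` (they would get NaN in ErrorColumn). With the data shown in the question it makes no difference.

```python
import numpy as np
import pandas as pd

# Rows that failed validation (values copied from Dataframe 1 in the question).
df1 = pd.DataFrame({
    "ErrorRow": [3, 4, 7],
    "MaterialID": [np.nan, np.nan, 56779],
    "Description": ["Part 1", "Part 2", "Part 3"],
    "UnitCost": [np.nan, 12, 25],
    "Quantity": [100, np.nan, np.nan],
    "Critical": [False, True, False],
    "Location": ["West", "East", "West"],
})

# One row per (ErrorRow, failing column) pair (Dataframe 2 in the question).
df2 = pd.DataFrame({
    "ErrorRow": [3, 3, 4, 4, 7],
    "ErrorColumn": ["MaterialID", "UnitCost", "MaterialID", "Quantity", "Quantity"],
})

# Collapse the error columns into one comma-separated string per ErrorRow.
# "errors" is a hypothetical intermediate name used only for readability.
errors = df2.groupby("ErrorRow", as_index=False).agg({"ErrorColumn": ", ".join})

# Assumption: how="left" keeps every row of df1, even one with no match in df2.
out = df1.merge(errors, on="ErrorRow", how="left")
print(out.to_string())
```

Splitting the aggregation into its own step also makes it easy to inspect `errors` before merging, which can help when the real data has unexpected duplicates in ErrorRow.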
