I have the dataframe below, where START + TIME = END. I want to check whether the END of the current row equals the START of the next row and, if so, merge the two rows, provided the "ID" is the same. The output should be as shown below.
z9smfwbn1#
Sample DF:

       Start  Time    End  ID
    0  43500    60  43560  23
    1  43560    60  43620  23
    2  43620  1020  44640  24
    3  44640   260  44900  24
    4  44900  2100  47000  24
Code:

    import pandas as pd

    arr = []
    for i in sorted(set(df["ID"])):
        tempdf = df[df["ID"] == i]
        # Seed the merged record with the first row of this ID
        temp = tempdf.iloc[0].to_dict()
        j = 1
        while j < len(tempdf):
            row = tempdf.iloc[j]
            if temp["End"] == row["Start"]:
                # Contiguous with the current record: extend it
                temp["End"] = row["End"]
                temp["Time"] += row["Time"]
            else:
                # Gap: store the finished record and start a new one
                arr.append(temp)
                temp = row.to_dict()
            j += 1
        arr.append(temp)
    df = pd.DataFrame(arr)
Output DF:

       Start  Time    End  ID
    0  43500   120  43620  23
    1  43620  3380  47000  24
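
The same merge can also be written without an explicit loop. The following is only a sketch, assuming the rows are already ordered so that rows to be merged sit next to each other (for example sorted by ID and then Start). The helper name `merge_contiguous` and the intermediate `block` label are made up for illustration and do not come from the answer above.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame(
    {
        "Start": [43500, 43560, 43620, 44640, 44900],
        "Time": [60, 60, 1020, 260, 2100],
        "End": [43560, 43620, 44640, 44900, 47000],
        "ID": [23, 23, 24, 24, 24],
    }
)


def merge_contiguous(frame):
    # A row opens a new block when its ID differs from the previous row's
    # ID, or when its Start does not equal the previous row's End.
    new_block = (frame["ID"] != frame["ID"].shift()) | (
        frame["Start"] != frame["End"].shift()
    )
    # Running count of block openings gives every row a block number
    block = new_block.cumsum().rename("block")
    merged = frame.groupby(block).agg(
        Start=("Start", "first"),
        Time=("Time", "sum"),
        End=("End", "last"),
        ID=("ID", "first"),
    )
    return merged.reset_index(drop=True)


result = merge_contiguous(df)
print(result)
```

Because each block is identified by comparing a row only with the row directly before it, unsorted input would need a `sort_values(["ID", "Start"])` call first.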
gz5pxeao2#
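I don't know how your data is formatted, but you can do the replacement directly. I suggest using numpy and trying something along these lines:

    # Assumes data is a list of rows laid out as [Start, Time, End, ID]
    i = 0
    while i < len(data) - 1:
        same_id = data[i][3] == data[i + 1][3]
        if same_id and data[i][2] == data[i + 1][0]:
            data[i][2] = data[i + 1][2]    # take the next row's End
            data[i][1] += data[i + 1][1]   # add the next row's Time
            data.pop(i + 1)                # drop the merged row
        else:
            i += 1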
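
Since this answer recommends numpy, here is one way the idea might look with array operations instead of a row-by-row loop. It is a sketch under assumptions: the column layout `[Start, Time, End, ID]` is hypothetical (the answer does not know the real format), and rows are assumed to be ordered so that mergeable rows are adjacent.

```python
import numpy as np

# Hypothetical layout: one row per interval, columns [Start, Time, End, ID]
data = np.array(
    [
        [43500, 60, 43560, 23],
        [43560, 60, 43620, 23],
        [43620, 1020, 44640, 24],
        [44640, 260, 44900, 24],
        [44900, 2100, 47000, 24],
    ]
)

start, duration, end, ids = data.T

# True where a row begins a new block: the first row, a change of ID,
# or a Start that does not match the previous row's End.
new_block = np.ones(len(data), dtype=bool)
new_block[1:] = (ids[1:] != ids[:-1]) | (start[1:] != end[:-1])

first = np.flatnonzero(new_block)            # first row index of each block
last = np.append(first[1:], len(data)) - 1   # last row index of each block

merged = np.column_stack(
    [
        start[first],                     # Start of the block's first row
        np.add.reduceat(duration, first), # sum of Time within each block
        end[last],                        # End of the block's last row
        ids[first],
    ]
)
print(merged)
```

`np.add.reduceat` sums `duration` over the slices that begin at each index in `first`, which is what replaces the `+=` accumulation in the loop version.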