z9smfwbn

z9smfwbn1#

样本DF

Start    Time   End      ID
  0    43500    60     43560    23
  1    43560    60     43620    23 
  2    43620    1020   44640    24 
  3    44640    260    44900    24
  4    44900    2100   47000    24

代码:

a = df["ID"].tolist()
arr = [] 
t = True
for i in sorted(list(set(a))):
    j = 1
    k = 0
    temp = {}
    tempdf = df[df["ID"] == i]
    temp["Start"] = tempdf.iloc[k]["Start"]
    temp["Time"] = tempdf.iloc[k]["Time"]
    temp["End"] = tempdf.iloc[k]["End"]   
    temp["ID"] = tempdf.iloc[k]["ID"]

    while j < len(tempdf):
        if temp["End"] == tempdf.iloc[j]["Start"]:
            temp["End"] = tempdf.iloc[j]["End"]
            temp["Time"] += tempdf.iloc[j]["Time"] 
         j += 1  
        
    arr.append(temp)       

df = pd.DataFrame(arr)

输出DF:

Start   Time    End     ID
   0    43500   120     43620   23   
   1    43620   3380    47000   24
gz5pxeao

gz5pxeao2#

我不知道你的数据是如何格式化的,但你可以直接替换。我建议你使用numpy,并尝试沿着内容:

i=0
while i != len(data):
    if data[i][4] == data[i+1][2]:
        data[i][4] = data[i+1][2]
        data[i+1].pop
    else :
        i+=1

相关问题