pandas: split and flatten a DataFrame into multiple DataFrames

50pmv0ei  posted on 2023-08-01  in  Other

I am trying to derive multiple DataFrames from a single DataFrame, as shown below:
D_input:

import pandas as pd
from numpy import nan

data = {'ID': {0: 'id1', 1: 'id1', 2: 'id1', 3: 'id1', 4: 'id1', 5: 'id1'}, 'hr': {0: 55, 1: 56, 2: 57, 3: 75, 4: 65, 5: 55}, 'hrMax': {0: nan, 1: 60.0, 2: 59.0, 3: nan, 4: 70.0, 5: 79.0}, 'hrMin': {0: nan, 1: 45.0, 2: 45.0, 3: nan, 4: 45.0, 5: 35.0}}

df = pd.DataFrame(data)

D_output:[D1,D2]

ID    hr_a  hr_b  hrMax  hrMin
id1   55    56    60     45
id1   55    57    59     45

ID    hr_a  hr_b  hrMax  hrMin
id1   75    65    70     45
id1   75    55    79     35


I tried:

# Select the indexes where df is NaN using hrMax
index = df['hrMax'].index[df['hrMax'].apply(np.isnan)]
df_index = df.index.values.tolist()

# get each sub-dataframe using iloc
for i in range(0, len(index)) :
    df_single_observation = df.iloc[df_index.index(i):df_index.index(i+1)-1]


But it doesn't work. Could someone please help? Thanks in advance. Best regards.

iklwldmw  #1

Try:

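# Mask of "header" rows: both hrMax and hrMin are missing
m = df[['hrMax', 'hrMin']].isna().all(axis=1)

# Take hr from the header rows and carry it down to the rows below
df['hr_a'] = df.loc[m, 'hr']
df['hr_a'] = df['hr_a'].ffill()

# Drop the header rows, rename hr to hr_b, and reorder the columns
df = df[~m].rename(columns={'hr':'hr_b'})[['ID', 'hr_a', 'hr_b', 'hrMax', 'hrMin']]

# The key increments whenever the mask flips, so each run of data rows is one group
for _, g in df.groupby((m != m.shift()).cumsum()):
    print(g)
    print('-'*80)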

Prints:

    ID  hr_a  hr_b  hrMax  hrMin
1  id1  55.0    56   60.0   45.0
2  id1  55.0    57   59.0   45.0
--------------------------------------------------------------------------------
    ID  hr_a  hr_b  hrMax  hrMin
4  id1  75.0    65   70.0   45.0
5  id1  75.0    55   79.0   35.0
--------------------------------------------------------------------------------
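
The question asks for a list `[D1, D2]` rather than printed output, so here is a minimal sketch that collects the groups into a list. The function name `split_on_header_rows` is hypothetical, and the sketch assumes that a row with both `hrMax` and `hrMin` missing starts a new block, as in the sample data. It uses `m.cumsum()` as the group key instead of `(m != m.shift()).cumsum()`; the two produce the same blocks on this input. It also works on a copy, so the original `df` is left unchanged.

```python
import pandas as pd
from numpy import nan

data = {'ID': ['id1'] * 6,
        'hr': [55, 56, 57, 75, 65, 55],
        'hrMax': [nan, 60.0, 59.0, nan, 70.0, 79.0],
        'hrMin': [nan, 45.0, 45.0, nan, 45.0, 35.0]}
df = pd.DataFrame(data)


def split_on_header_rows(df):
    """Hypothetical helper: return one DataFrame per block of data rows."""
    # "Header" rows are those where both hrMax and hrMin are missing
    m = df[['hrMax', 'hrMin']].isna().all(axis=1)

    out = df.copy()
    # Keep hr only on header rows, then carry it down to the rows below
    out['hr_a'] = out['hr'].where(m).ffill()

    # Block id: number of header rows seen so far (assumption: each
    # header row opens a new block)
    block = m.cumsum()

    # Drop the header rows, rename hr to hr_b, and reorder the columns
    out = out[~m].rename(columns={'hr': 'hr_b'})
    out = out[['ID', 'hr_a', 'hr_b', 'hrMax', 'hrMin']]

    # One DataFrame per block, each with a fresh 0-based index
    return [g.reset_index(drop=True) for _, g in out.groupby(block[~m])]


parts = split_on_header_rows(df)
for part in parts:
    print(part)
```

`hr_a` comes out as a float column because it passes through `NaN` before the forward fill. If integers are needed, `astype(int)` can be applied afterwards, provided the first row of the input is a header row so that no `NaN` remains.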
