Move rows in a Pandas DataFrame based on a condition?

7lrncoxx  posted on 2023-08-01  in  Other

I have the following DataFrame:
import pandas as pd

data = [['Construction', '', '01/02/2022', '01/06/2022', '1', 'No'], ['Level Site', 'Construction', '01/02/2022', '01/02/2022', '2', 'No'], ['Foundation', '', '01/03/2023', '01/06/2023', '1', 'Yes'],['Lay Foundation', 'Construction>Foundation', '01/03/2022', '01/04/2022', '3', 'No'], ['Prepare land for foundation', 'Construction>Foundation', '01/05/2022', '01/06/2022', '3', 'No'],['Building Envelope', '', '01/07/2023', '01/16/2023', '1', 'No'], ['Install Footings', 'Building Envelope', '01/07/2022', '01/07/2022', '2', 'Yes'], ['Pouring', '', '01/08/202', '01/09/2023', '1', 'No'],['Pour Foundation', 'Building Envelope>Pouring', '01/08/2022', '01/09/2022', '3', 'No'], ['Installation', '', '01/09/2022', '01/14/2022', '1', 'No']]
df1 = pd.DataFrame(data, columns=['Activity', 'Parent', 'Start', 'Finish', 'WBS Level', 'Match'])

df1


Desired DataFrame output:

data = [['Construction', '', '01/02/2022', '01/06/2022', '1', 'No'],['Foundation', '', '01/03/2023', '01/06/2023', '1', 'Yes'], ['Level Site', 'Construction', '01/02/2022', '01/02/2022', '2', 'No'], ['Lay Foundation', 'Construction>Foundation', '01/03/2022', '01/04/2022', '3', 'No'], ['Prepare land for foundation', 'Construction>Foundation', '01/05/2022', '01/06/2022', '3', 'No'],['Install Footings', 'Building Envelope', '01/07/2022', '01/07/2022', '2', 'Yes'],['Building Envelope', '', '01/07/2023', '01/16/2023', '1', 'No'], ['Pouring', '', '01/08/202', '01/09/2023', '1', 'No'],['Pour Foundation', 'Building Envelope>Pouring', '01/08/2022', '01/09/2022', '3', 'No'], ['Installation', '', '01/09/2022', '01/14/2022', '1', 'No']]
df2 = pd.DataFrame(data, columns=['Activity', 'Parent', 'Start', 'Finish', 'WBS Level', 'Match'])

df2


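I am preparing this data for use in a scheduling software application and need to reorder rows based on certain conditions. I created the 'Match' column for this purpose (my conditions are already applied: any row with 'Yes' satisfies them).
For any row with 'Yes' in the 'Match' column, I want to move it up by one row. I have tried variations of the .shift method but am having trouble getting it right. I do not want to delete or overwrite any rows; I only need to move each 'Yes' row up by 1.
Thanks for your help.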

7xllpg7q1#

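IIUC, and provided your input DataFrame has a default RangeIndex, you can do this by subtracting 1.5 from the index of every 'Yes' row and then sorting the DataFrame by the new index. A 'Yes' row at position i gets the key i - 1.5, which sorts just above the row that was at position i - 1: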

import numpy as np
df1.set_index(df1.index - np.where(df1['Match'] == 'Yes', 1.5, 0)).sort_index().reset_index(drop=True)

Output:

                      Activity                     Parent       Start      Finish WBS Level Match
0                 Construction                             01/02/2022  01/06/2022         1    No
1                   Foundation                             01/03/2023  01/06/2023         1   Yes
2                   Level Site               Construction  01/02/2022  01/02/2022         2    No
3               Lay Foundation    Construction>Foundation  01/03/2022  01/04/2022         3    No
4  Prepare land for foundation    Construction>Foundation  01/05/2022  01/06/2022         3    No
5             Install Footings          Building Envelope  01/07/2022  01/07/2022         2   Yes
6            Building Envelope                             01/07/2023  01/16/2023         1    No
7                      Pouring                              01/08/202  01/09/2023         1    No
8              Pour Foundation  Building Envelope>Pouring  01/08/2022  01/09/2022         3    No
9                 Installation                             01/09/2022  01/14/2022         1    No
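The same idea can be wrapped in a small reusable function. This is only a sketch: the function name `move_matches_up` and its `column` and `value` parameters are made up for illustration, and the toy frame below stands in for the real data. It resets the index first, so it does not rely on the caller having a default RangeIndex.

```python
import numpy as np
import pandas as pd


def move_matches_up(df, column="Match", value="Yes"):
    """Return a copy of df with every matching row moved up one position.

    Hypothetical helper for illustration; not part of pandas.
    """
    out = df.reset_index(drop=True)
    # A matching row at position i gets the key i - 1.5, which sorts
    # between positions i - 2 and i - 1, i.e. just above its old neighbour.
    keys = out.index - np.where(out[column] == value, 1.5, 0)
    return out.set_index(keys).sort_index().reset_index(drop=True)


demo = pd.DataFrame(
    {"Activity": ["A", "B", "C", "D"], "Match": ["No", "No", "Yes", "No"]}
)
result = move_matches_up(demo)
print(result["Activity"].tolist())  # ['A', 'C', 'B', 'D']
```

Note that with two consecutive 'Yes' rows each one still moves up by exactly one position, so the non-matching row that was above the pair ends up between them.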

q5iwbnjs2#

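Here is a solution that swaps the required rows by index rather than with .shift() (it was not clear to me how to do this inside a groupby()). It may not scale well, but it should do the job on smaller datasets.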

df1 = df1.reset_index(drop=True)  # ensure index is unique

# Loop through only the indices of rows to be shifted, to avoid looping through every row
shift_indices = df1[df1['Match'] == 'Yes'].index
for shift_idx in shift_indices:
    # No need to shift if at the top
    if shift_idx == 0:
        continue
    above_idx = shift_idx - 1
    above_row = df1.loc[above_idx].copy()  # copy as otherwise this row will change during the shift
    # If the row above is also a match, then no need to swap it
    if above_row['Match'] != 'Yes':
        shift_row = df1.loc[shift_idx]
        df1.loc[above_idx] = shift_row
        df1.loc[shift_idx] = above_row
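If scaling becomes a concern, one possible variation is to apply the same swap rule to an array of row positions and reorder the DataFrame once at the end with `iloc`, instead of assigning rows one at a time. This is a sketch, not a benchmarked replacement: the function name `swap_matches_up` and the toy frame are assumptions for illustration.

```python
import numpy as np
import pandas as pd


def swap_matches_up(df, column="Match", value="Yes"):
    """Hypothetical helper: same swap rule as the loop above, applied to positions."""
    is_match = (df[column] == value).to_numpy()
    # order[k] holds the original position of the row that currently sits at k
    order = np.arange(len(df))
    for pos in np.flatnonzero(is_match):
        if pos == 0:
            continue  # nothing above the first row
        # Swap only when the row currently above is not itself a match
        if not is_match[order[pos - 1]]:
            order[[pos - 1, pos]] = order[[pos, pos - 1]]
    # Reorder the whole frame in a single step
    return df.iloc[order].reset_index(drop=True)


demo = pd.DataFrame(
    {
        "Activity": ["A", "B", "C", "D", "E"],
        "Match": ["No", "Yes", "Yes", "No", "Yes"],
    }
)
swapped = swap_matches_up(demo)
print(swapped["Activity"].tolist())  # ['B', 'C', 'A', 'E', 'D']
```

As in the loop above, a block of consecutive 'Yes' rows moves up together, because each swap is decided against the row that is above at that moment.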


34gzjxbg3#

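mask = df1['Match'] == 'Yes'
above = mask.shift(-1, fill_value=False)  # True for the row directly above each 'Yes' row
# Swap each 'Yes' row with the row above it; the right-hand side is evaluated before either assignment
df1.loc[mask, 'Activity':'Match'], df1.loc[above, 'Activity':'Match'] = df1.loc[above, 'Activity':'Match'].values, df1.loc[mask, 'Activity':'Match'].values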
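To see what the two masks select, it may help to run the same swap on a toy frame. The frame and the variable names `toy` and `above` are made up for illustration; the sketch assumes the first row is not a 'Yes' row and that no two 'Yes' rows are adjacent, since the two selections would otherwise overlap or differ in length.

```python
import pandas as pd

toy = pd.DataFrame(
    {"Activity": ["A", "B", "C", "D"], "Match": ["No", "No", "Yes", "No"]}
)

mask = toy["Match"] == "Yes"
# Shifting the mask up by one marks the row directly above each 'Yes' row
above = mask.shift(-1, fill_value=False)
print(mask.tolist())   # [False, False, True, False]
print(above.tolist())  # [False, True, False, False]

# Both right-hand arrays are built before either assignment runs,
# so the two groups of rows trade places
toy.loc[mask], toy.loc[above] = toy.loc[above].values, toy.loc[mask].values
print(toy["Activity"].tolist())  # ['A', 'C', 'B', 'D']
```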

Here is another approach; it is not very fast, but it still gets the job done:

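# Assumes a default RangeIndex, so the label i is also the row position
for i, row in df1.iterrows():
    # Skip the first row: iloc[i-1] would wrap around to the last row
    if row['Match'] == 'Yes' and i > 0:
        df1.iloc[i], df1.iloc[i-1] = df1.iloc[i-1].copy(), df1.iloc[i].copy()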
