Pandas DataFrame: in each row, find the max of selected columns and look up another column's value based on it

Posted by jpfvwuh4 on 2022-12-28 in Other


I have a DataFrame like this:

>>> import pandas as pd
>>> df = pd.DataFrame({'x1':[20,25],'y1':[5,8],'x2':[22,27],'y2':[10,2]})
>>> df
   x1  y1  x2  y2
0  20   5  22  10
1  25   8  27   2

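The x and y columns are paired. I need to compare y1 and y2 and get the maximum for each row, then find the corresponding x. So for row [0] the maximum is y2 (= 10) and the corresponding x is x2 (= 22). For the second row it is y1 (= 8) and x1 (= 25). Expected result, with new columns x and y: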

   x1  y1  x2  y2   x   y
0  20   5  22  10  22  10
1  25   8  27   2  25   8

This is a simple DataFrame I made to illustrate the question. In my case there can be as many as 30 x/y pairs.

pandas

Source: https://stackoverflow.com/questions/74918325/pandas-dataframe-in-a-row-to-find-the-max-in-selected-column-and-find-value-o

6 Answers


flmtquvp1#

import numpy as np

# select the "y*" columns
y_cols = df.filter(like="y")

# take the numeric suffix of each row's maximal y column, then prepend "x" to it
max_x_vals = y_cols.idxmax(axis=1).str.extract(r"(\d+)$", expand=False).radd("x")
# find the positions of those x* columns
max_x_ids = df.columns.get_indexer(max_x_vals)

# with the column positions of the x*'s in hand, NumPy's integer indexing
# picks one value per row
df["max_xs"] = df.to_numpy()[np.arange(len(df)), max_x_ids]

# for the y*'s, it is simply the row-wise maximum
df["max_ys"] = y_cols.max(axis=1)

which gives

>>> df

   x1  y1  x2  y2  max_xs  max_ys
0  20   5  22  10      22      10
1  25   8  27   2      25       8
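The question mentions as many as 30 pairs. Below is a minimal sketch of the same idea for any number of pairs, using NumPy's `take_along_axis` in place of the column-name lookup above. The helper name `pick_x_at_max_y` and its `n_pairs` parameter are invented for illustration, and the sketch assumes the columns are named x1..xN and y1..yN.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"x1": [20, 25], "y1": [5, 8], "x2": [22, 27], "y2": [10, 2]})

def pick_x_at_max_y(frame, n_pairs):
    """Hypothetical helper: per row, return the largest y and its paired x."""
    suffixes = [str(i) for i in range(1, n_pairs + 1)]
    # Build two arrays whose columns line up pair by pair (assumed naming)
    x = frame[["x" + s for s in suffixes]].to_numpy()
    y = frame[["y" + s for s in suffixes]].to_numpy()
    # Position of the largest y in each row, kept 2-D for take_along_axis
    pos = y.argmax(axis=1)[:, None]
    return pd.DataFrame(
        {
            "x": np.take_along_axis(x, pos, axis=1).ravel(),
            "y": np.take_along_axis(y, pos, axis=1).ravel(),
        },
        index=frame.index,
    )

result = df.join(pick_x_at_max_y(df, n_pairs=2))
```

For 30 pairs the call would be `pick_x_at_max_y(df, n_pairs=30)`. Ties go to the first pair, since `argmax` returns the first maximal position.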

Posted 2022-12-28

olqngx592#

You can do this with the help of the .apply function.

import pandas as pd
import numpy as np

df = pd.DataFrame({'x1':[20,25],'y1':[5,8],'x2':[22,27],'y2':[10,2]})
# the x and y columns are assumed to appear in the same pair order
y_cols = [col for col in df.columns if col.startswith('y')]
x_cols = [col for col in df.columns if col.startswith('x')]

def find_corresponding_x(row):
    # position of the largest y value within this row
    max_y_index = np.argmax(row[y_cols].to_numpy())
    return row[x_cols[max_y_index]]

df['corresponding_x'] = df.apply(find_corresponding_x, axis=1)
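If the matching y value is wanted as well, the row-wise function could return both values at once. This is a sketch rather than the answerer's code: `max_pair` is a hypothetical name, and it assumes every `yK` column has a partner named `xK`.

```python
import pandas as pd

df = pd.DataFrame({"x1": [20, 25], "y1": [5, 8], "x2": [22, 27], "y2": [10, 2]})
y_cols = [col for col in df.columns if col.startswith("y")]

def max_pair(row):
    # Label of the largest y in this row, e.g. "y2"
    y_name = row[y_cols].idxmax()
    # Assumed naming rule: swap the leading "y" for "x" to get the partner
    x_name = "x" + y_name[1:]
    return pd.Series({"x": row[x_name], "y": row[y_name]})

# apply() with a Series-returning function yields one output column per key
result = df.join(df.apply(max_pair, axis=1))
```

Row-wise `apply` is usually slower than the vectorized approaches, which may matter on large frames.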

Posted 2022-12-28

eit6fx6z3#

You can use the function below. Remember to import pandas and numpy as I do in this code. Load your dataset and call the Max_number function.

import pandas as pd
import numpy as np
df = pd.DataFrame({'x1':[20,25],'y1':[5,8],'x2':[22,27],'y2':[10,2]})

def Max_number(df):
    columns = list(df.columns)
    rows = df.shape[0]
    max_value = []
    column_name = []

    for i in range(rows):
        # values of row i as a plain list
        row_array = list(np.array(df[i:i+1])[0])
        maximum = max(row_array)
        max_value.append(maximum)
        # name of the first column holding that maximum
        index = row_array.index(maximum)
        column_name.append(columns[index])

    return pd.DataFrame({"column": column_name, "max_value": max_value})

For the sample data it returns the following, the maximum over all columns of each row:
| Row index | column | max_value |
| --- | --- | --- |
| 0 | x2 | 22 |
| 1 | x2 | 27 |
Posted 2022-12-28

ghhaqwfi4#

If the columns appear in the order x1, y1, x2, y2, and so on, you can try:

import numpy as np
y_cols = df.filter(like="y")  # the "y*" columns
a = df.columns.get_indexer(y_cols.idxmax(axis=1))
df[['y', 'x']] = df.to_numpy()[np.arange(len(df)), [a, a - 1]].T
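The `a - 1` step only holds when each y column sits immediately to the right of its x partner. A quick guard could verify that before relying on positions. This is a sketch only: `is_interleaved` is a hypothetical helper, and it assumes the xK/yK naming used in the question.

```python
import pandas as pd

def is_interleaved(columns):
    """Hypothetical check: True when columns run x1, y1, x2, y2, ..."""
    cols = list(columns)
    if len(cols) % 2 != 0:
        return False
    # Every even position must hold an x column followed by its y partner
    return all(
        cols[i].startswith("x") and cols[i + 1] == "y" + cols[i][1:]
        for i in range(0, len(cols), 2)
    )

df = pd.DataFrame({"x1": [20, 25], "y1": [5, 8], "x2": [22, 27], "y2": [10, 2]})
ok = is_interleaved(df.columns)

# The same data with the x columns grouped first fails the check
grouped = df[["x1", "x2", "y1", "y2"]]
not_ok = is_interleaved(grouped.columns)
```

The check should run on the original columns, before any result columns are appended to the frame.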

Posted 2022-12-28

qncylg1j5#

Here is one solution:

a = df[df['y1'] < df['y2']].drop(columns=['y1','x1']).rename(columns={'y2':'y', 'x2':'x'})
b = df[df['y1'] >= df['y2']].drop(columns=['y2','x2']).rename(columns={'y1':'y', 'x1':'x'})

result = pd.concat([a,b])

If you need to preserve the row order, you can add another column holding the original index and sort by it after concatenating.
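Since boolean filtering keeps the original index labels, one way to restore the order without an extra column may be to call `sort_index()` on the concatenated result. A minimal sketch, assuming the original index is sortable and unique:

```python
import pandas as pd

df = pd.DataFrame({"x1": [20, 25], "y1": [5, 8], "x2": [22, 27], "y2": [10, 2]})

# Rows where the second pair wins keep x2/y2, renamed to x/y
a = df[df["y1"] < df["y2"]].drop(columns=["y1", "x1"]).rename(columns={"y2": "y", "x2": "x"})
# Rows where the first pair wins (or ties) keep x1/y1, renamed to x/y
b = df[df["y1"] >= df["y2"]].drop(columns=["y2", "x2"]).rename(columns={"y1": "y", "x1": "x"})

# Each piece still carries its original row labels, so sorting restores the order
result = pd.concat([a, b]).sort_index()
```

This approach is written for exactly two pairs. With many pairs, the number of filter-and-rename steps grows with the number of pairs.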

Posted 2022-12-28

a8jjtwal6#

Hope this solution works for you:

import pandas as pd
df = pd.DataFrame({'x1':[20,25],'y1':[5,8],'x2':[22,27],'y2':[10,2]})
df['x_max'] = df[['x1', 'x2']].max(axis=1)
df['y_max'] = df[['y1', 'y2']].max(axis=1)
df

Posted 2022-12-28
