Python Pandas: How can I generate a DataFrame out of matching positions between a different DataFrame and a NumPy vector?

Posted by u5rb5r59 on 2023-05-27 in Python


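y_batch is a DataFrame that contains lists of numeric genericarticleid values. label_vector is a 3D NumPy vector filled with genericarticleid, verschleissbehavtette, and verschleissteil. I need to generate a DataFrame that has the same index as the y_batch DataFrame and label_vector.shape[0] columns. Every row should be zero except at the positions where the genericarticleids in y_batch and label_vector match.
I managed to get the desired output with the following code: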

import numpy as np
import pandas as pd

# Example data
y_batch = pd.DataFrame([[1, 2], [3, 4, 5], [6]])
label_vector = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
label_ids = label_vector[:, 0].astype(int)

# Initialize the result DataFrame with zeros and matching indices
result_vector = pd.DataFrame(0, index=y_batch.index, columns=range(label_vector.shape[0]))

# Iterate over each row in y_batch
for i, row in y_batch.iterrows():
    # Find the matching indices in label_ids
    indices = np.isin(label_ids, row)
    # Update the result DataFrame at the matching indices
    result_vector.loc[i, indices] = 1

print(result_vector)

What I can't figure out is how to eliminate the for loop in this code. My dataset is large and I want to save time.
Edit: Suppose my data has this form.

genericarticleid_list = [[1, 2], [3, 4, 5], [6, 7, 8, 9]]

y_batch = pd.DataFrame({'genericarticleid': genericarticleid_list})
label_vector = np.array([[[1, 2, 3], [4, 5, 6], [7, 8, 9]]])
label_ids = label_vector[0, :, 0].astype(int)  # first column of the single 3x3 block: [1, 4, 7]

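Basically, I want a DataFrame with one column per row of label_vector. In this case there should be 3 columns (for genericarticleids 1, 4, 7). The code should check each list in y_batch for matches against (1, 4, 7) from label_vector; if there is a match, as with the first list (1, 2), that row should have the values 1 0 0. The same logic applies to the remaining lists, which fill the remaining rows of the new DataFrame. Does that make sense?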

pandas

Source: https://stackoverflow.com/questions/76314424/python-pandas-how-can-i-generate-a-dataframe-out-of-matching-positions-between

1 Answer


ar7v8xwq1#

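The logic is unclear, in particular the way the columns of the output DataFrame are initialized from the label_vector index.
However, the logic of the loop can be replaced with: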

out = y_batch.apply(lambda c: c.isin(label_ids)).astype(int)
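To make the one-liner concrete, here is a self-contained run on the first example data from the question. It is only a sketch: the data is copied from the question, and the variable names are the question's own. Note that the result has the same index and columns as y_batch, so each cell says whether that y_batch value is one of the label ids, rather than giving one column per label id.

```python
import numpy as np
import pandas as pd

# First example from the question; pandas pads the shorter rows with NaN
y_batch = pd.DataFrame([[1, 2], [3, 4, 5], [6]])
label_vector = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
label_ids = label_vector[:, 0].astype(int)  # [1, 4, 7]

# Column-wise membership test, then booleans converted to 0/1
out = y_batch.apply(lambda c: c.isin(label_ids)).astype(int)

print(out)
```

NaN padding never matches a label id, so padded cells come out as 0.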

With NumPy:

out = pd.DataFrame(np.isin(y_batch, label_ids).astype(int),
                   index=y_batch.index, columns=y_batch.columns)
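Both one-liners above return a frame shaped like y_batch. If the goal is instead the layout the question describes (one column per label id, as the original loop produces), one loop-free option is NumPy broadcasting. The following is a sketch under assumptions, not the answerer's method: label_matches is a hypothetical helper name, and the list-column data from the edit is assumed to be expanded into a NaN-padded frame first.

```python
import numpy as np
import pandas as pd


def label_matches(y_batch: pd.DataFrame, label_ids: np.ndarray) -> pd.DataFrame:
    """Hypothetical helper: 0/1 frame with one row per y_batch row, one column per label id."""
    values = y_batch.to_numpy(dtype=float)  # (n_rows, max_len), NaN where rows were padded
    # Compare every cell with every label id -> (n_rows, max_len, n_labels)
    hits = values[:, :, None] == label_ids[None, None, :]
    # A label id matches a row if any cell in that row equals it
    return pd.DataFrame(hits.any(axis=1).astype(int),
                        index=y_batch.index,
                        columns=range(len(label_ids)))


label_vector = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
label_ids = label_vector[:, 0].astype(int)  # [1, 4, 7]

# Padded layout from the first example in the question
y_padded = pd.DataFrame([[1, 2], [3, 4, 5], [6]])
result_padded = label_matches(y_padded, label_ids)

# List-column layout from the edit, expanded to the padded layout first
y_lists = pd.DataFrame({'genericarticleid': [[1, 2], [3, 4, 5], [6, 7, 8, 9]]})
expanded = pd.DataFrame(y_lists['genericarticleid'].tolist(), index=y_lists.index)
result_lists = label_matches(expanded, label_ids)

print(result_padded)
print(result_lists)
```

The intermediate boolean array has n_rows × max_len × n_labels entries, so for very large inputs it may be worth processing y_batch in chunks.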


Answered 2023-05-27
