Pandas: drop rows at the beginning of groups if a column equals a certain string value ('Sell' or 'Buy')

Posted by oalqel3c on 2023-05-21 in Other


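To clarify, the "groups" in the title are not the result of pd.groupby. Rather, I mean rows that share the same values in certain columns. In my case, those columns are account and symbol.
I am trying to compute profit and loss per account and position from trade data on a first-in, first-out (FIFO) basis. So whenever the cumulative share quantity drops below zero, that is, when the latest sell quantity exceeds the sum of all previous buys, I need to reset it to 0. The same applies when the trade data begins with a sell record.
I am trying to design a cumulative sum that resets to 0 to help with this. I have: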

def cumsum_with_reset(group):
    # Running total of signed quantities that is never allowed to go below 0.
    cumulative_sum = 0
    group['reset_cumsum'] = 0
    for index, row in group.iterrows():
        cumulative_sum += row['Modified_Quantity']
        if cumulative_sum < 0:
            cumulative_sum = 0
        group.loc[index, 'reset_cumsum'] = cumulative_sum
    return group
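For comparison, the same reset rule can be written without a row loop. The sketch below is only an illustration under stated assumptions: it assumes Modified_Quantity is the signed quantity (positive for buys, negative for sells), the helper name reset_cumsum_fast and the small frame are hypothetical, and it relies on the identity that a running sum floored at 0 equals the plain cumulative sum minus its running minimum capped at 0.

```python
import pandas as pd

# Hypothetical illustration data: signed quantities, buys > 0, sells < 0.
df = pd.DataFrame({
    'account': ['foo', 'foo', 'cat', 'cat', 'cat'],
    'symbol': ['AMZN', 'AMZN', 'FB', 'FB', 'FB'],
    'Modified_Quantity': [10, -15, -5, 17, -15],
})

def reset_cumsum_fast(qty: pd.Series) -> pd.Series:
    """Cumulative sum that is floored at 0 after every step."""
    running = qty.cumsum()
    # Lowest value the plain cumsum has reached so far, capped at 0.
    floor = running.cummin().clip(upper=0)
    return running - floor

# Apply per (account, symbol) group; transform keeps the original row order.
df['reset_cumsum'] = (
    df.groupby(['account', 'symbol'], sort=False)['Modified_Quantity']
      .transform(reset_cumsum_fast)
)
print(df)
```

On this small frame the result matches what the loop version would produce row by row; whether it is fast enough on real data would need to be measured.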

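cumsum_with_reset returns 0 when a group (that is, the rows with the same account and symbol) starts with a sell record. The problem is that iterrows is too inefficient and takes a long time on large data, so I want to write a new function, but I am stuck at the first step: how do I drop the sell rows in each group that come before the first buy row?
With some sample data: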

import pandas as pd

df = pd.DataFrame(
    data=[['2022-01-01', 'foo', 'AMZN', 'buy', 10, 22],
          ['2022-01-02', 'foo', 'AMZN', 'sell', 15, 24],
          ['2022-01-03', 'cat', 'FB', 'sell', 5, 12],
          ['2022-01-04', 'cat', 'FB', 'buy', 17, 15],
          ['2022-01-05', 'cat', 'FB', 'sell', 15, 13],
          ['2022-01-06', 'bar', 'AAPL', 'buy', 10, 10],
          ['2022-01-07', 'bar', 'AAPL', 'buy', 5, 12],
          ['2022-01-08', 'bar', 'AAPL', 'sell', 8, 12],
          ['2022-01-09', 'bar', 'AAPL', 'sell', 12, 14],
          ['2022-01-10', 'dog', 'GOOG', 'sell', 20, 13],
          ['2022-01-11', 'dog', 'GOOG', 'buy', 15, 13],
          ['2022-01-12', 'dog', 'GOOG', 'buy', 5, 13],
          ['2022-01-13', 'dog', 'GOOG', 'sell', 7, 14]],
    columns=['Date', 'account', 'symbol', 'Action', 'Quantity', 'Price'])

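This dataset contains 4 groups, one per (account, symbol) pair: foo/AMZN, cat/FB, bar/AAPL and dog/GOOG.

The 2nd and 4th groups (cat/FB and dog/GOOG) start with sell records, at rows 2 and 9. How can I use Pandas to drop such records until every group starts with a buy record?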
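As a quick check of which groups begin with a sell, here is a minimal sketch. It rebuilds the sample data so that it runs on its own; the variable names df, first_action and starts_with_sell are assumptions made for the illustration.

```python
import pandas as pd

df = pd.DataFrame(
    data=[['2022-01-01', 'foo', 'AMZN', 'buy', 10, 22],
          ['2022-01-02', 'foo', 'AMZN', 'sell', 15, 24],
          ['2022-01-03', 'cat', 'FB', 'sell', 5, 12],
          ['2022-01-04', 'cat', 'FB', 'buy', 17, 15],
          ['2022-01-05', 'cat', 'FB', 'sell', 15, 13],
          ['2022-01-06', 'bar', 'AAPL', 'buy', 10, 10],
          ['2022-01-07', 'bar', 'AAPL', 'buy', 5, 12],
          ['2022-01-08', 'bar', 'AAPL', 'sell', 8, 12],
          ['2022-01-09', 'bar', 'AAPL', 'sell', 12, 14],
          ['2022-01-10', 'dog', 'GOOG', 'sell', 20, 13],
          ['2022-01-11', 'dog', 'GOOG', 'buy', 15, 13],
          ['2022-01-12', 'dog', 'GOOG', 'buy', 5, 13],
          ['2022-01-13', 'dog', 'GOOG', 'sell', 7, 14]],
    columns=['Date', 'account', 'symbol', 'Action', 'Quantity', 'Price'])

# First Action of every (account, symbol) group, in order of appearance.
first_action = df.groupby(['account', 'symbol'], sort=False)['Action'].first()

# Keep only the groups whose first record is a sell.
starts_with_sell = first_action[first_action == 'sell']
print(starts_with_sell)
```

This only identifies the offending groups; it does not remove any rows.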

pandas

Source: https://stackoverflow.com/questions/76285582/pandas-drop-rows-at-beginning-of-groups-if-a-column-equals-certain-string-value

1 Answer


zf9nrax11#

If a group never starts with more than one sell, this is fairly trivial:

# assuming df is sorted by symbol + date
df.loc[(df['symbol'] == df['symbol'].shift()) | (df['Action'] != 'sell')]

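If several consecutive sells at the start of a group need to be dropped, we have to track whether the previous row was dropped: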

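# Requires Python 3.8+ (walrus operator); still assumes df is sorted by symbol + date.
# `last` is True while we are inside a run of leading sells that is being dropped.
last = False
df.loc[[not (last := (action == 'sell' and (last or current != prev)))
        for action, current, prev
        in zip(df['Action'], df['symbol'], df['symbol'].shift())]]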
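An alternative that avoids the Python-level loop is to count, within each group, how many buys have been seen so far and keep only the rows where that count is positive. This is a sketch rather than the answerer's method: it groups on both account and symbol, as the question defines a group, and the names keys, buys_so_far and trimmed are assumptions. It should not need the frame to be sorted by group, only by date within each group.

```python
import pandas as pd

df = pd.DataFrame(
    data=[['2022-01-01', 'foo', 'AMZN', 'buy', 10, 22],
          ['2022-01-02', 'foo', 'AMZN', 'sell', 15, 24],
          ['2022-01-03', 'cat', 'FB', 'sell', 5, 12],
          ['2022-01-04', 'cat', 'FB', 'buy', 17, 15],
          ['2022-01-05', 'cat', 'FB', 'sell', 15, 13],
          ['2022-01-06', 'bar', 'AAPL', 'buy', 10, 10],
          ['2022-01-07', 'bar', 'AAPL', 'buy', 5, 12],
          ['2022-01-08', 'bar', 'AAPL', 'sell', 8, 12],
          ['2022-01-09', 'bar', 'AAPL', 'sell', 12, 14],
          ['2022-01-10', 'dog', 'GOOG', 'sell', 20, 13],
          ['2022-01-11', 'dog', 'GOOG', 'buy', 15, 13],
          ['2022-01-12', 'dog', 'GOOG', 'buy', 5, 13],
          ['2022-01-13', 'dog', 'GOOG', 'sell', 7, 14]],
    columns=['Date', 'account', 'symbol', 'Action', 'Quantity', 'Price'])

# Group by the two columns that define a group in the question.
keys = [df['account'], df['symbol']]

# Running count of buy rows within each group (1 for buy, 0 otherwise).
buys_so_far = df['Action'].eq('buy').astype(int).groupby(keys).cumsum()

# A row is kept once its group has seen at least one buy.
trimmed = df[buys_so_far > 0]
print(trimmed)
```

On the sample data this drops the leading sells at rows 2 and 9 and keeps every other row, including sells that follow a buy.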

2023-05-21
