csv 每次从给定的dataframe进行合并时,我都希望添加一个表示合并的列

watbbzwu  于 2023-06-19  发布在  其他
关注(0)|答案(2)|浏览(96)

我是pandas的新手,我正试图找到一种方法将我的交易历史添加到我的比特币数据框中。我使用了3个dataframes:bitcoin_df(每分钟数据),buy(可能是任何时间),sell(可能是任何时间)。我这样做是为了创建一个比特币的图表,上面也有我的交易历史。
最初,我必须通过四舍五入到最近的秒来转换买入和卖出 Dataframe 的格式。正如您将在输出中看到的那样,bitcoin_df dataframe从每分钟的第0秒获取信息。
我已经连接了数据,还创建了两列。我甚至写了计算机将字符串添加到右边列的路径。现在,我只需要将列设置为“buy”,每当它与buys Dataframe 合并时,“sell”用于sells Dataframe ,而bitcoin_df Dataframe 则没有。
下面是我用来创建数据框的csv文件中的一些示例数据:bitcoin_df.csv:

Open time,Open,High,Low,Close,Volume,Date
YYYY-MM-DD 00:00:00,39145.21,39226.08,39145.21,39207.17,130.24191,2021-01-15
2021-01-15 00:01:00,39213.18,39213.18,39023.24,39031.95,82.159228,2021-01-15
2021-01-15 00:02:00,39031.96,39189.57,39023.24,39184.01,89.64397,2021-01-15
2021-01-15 00:03:00,39185.62,39366.44,39160.92,39360.01,71.681691,2021-01-15
#,orderId,exchange,datetime,date,ticker,side,type,notional,size,price,fee,feeCurrency
0,############,EXC,YYYY-MM-DD HH:MM:SS,M/DD/YYYY,BTC-PERP,buy,maker,#####.####,#.####,#####,##.########,USD
1,############,EXC,YYYY-MM-DD HH:MM:SS,M/DD/YYYY,BTC-PERP,buy,maker,#####.####,#.####,#####,##.########,USD
2,############,EXC,YYYY-MM-DD HH:MM:SS,M/DD/YYYY, BTC-PERP,sell,taker,#####.####,#.####,#####,##.########,USD
3,############,EXC,YYYY-MM-DD HH:MM:SS,M/DD/YYYY,BTC-PERP,sell, maker,#####.####,#.####,#####,##.########,USD

以下是我的代码。

merged_df = pd.DataFrame()
merged_df['combined_datetime'] = pd.concat([bitcoin_df['Opentime'], buys['datetime'], sells['datetime']]).reset_index(drop=True)

merged_df = merged_df.sort_values(by='combined_datetime').reset_index(drop=True)

merged_df['transaction_type'] = ''

** #The problem lies beyond here somewhere **
# Get the number of rows in the 'bitcoin_df' and 'buys' dataframes
bitcoin_rows = bitcoin_df.shape[0]
sell_rows = sells.shape[0]
buys_rows = buys.shape[0]

# Assign 'buy' to the 'transaction_type' column for 'buys' rows
merged_df.loc[:buys_rows - 1, 'transaction_type'] = 'buy'

# Assign 'sell' to the 'transaction_type' column for 'sells' rows
merged_df.loc[sell_rows:, 'transaction_type'] = 'sell'

merged_df.loc[bitcoin_rows:, 'transaction_type'] = ''
print(merged_df)

此为当前输出

combined_datetime transaction_type
0      2022-08-23 00:00:00              buy
1      2022-08-23 00:01:00              buy
2      2022-08-23 00:02:00              buy
3      2022-08-23 00:03:00              buy
4      2022-08-23 00:04:00              buy
...                    ...              ...
334190 2023-04-11 23:58:00                 
334191 2023-04-11 23:59:00                 
334192 2023-04-12 00:00:00                 
334193 2023-04-12 00:01:00                 
334194 2023-04-12 00:02:00                 

[334195 rows x 2 columns]

输出应该是这样的

combined_datetime transaction_type
0      2022-08-23 00:00:00              sell
1      2022-08-23 00:01:33              buy
2      2022-08-23 00:02:00              
3      2022-08-23 00:03:00              
4      2022-08-23 00:04:00              
...                    ...              ...
334190 2023-04-11 23:58:41              buy
334191 2023-04-11 23:59:00                 
334192 2023-04-12 00:00:10              sell
334193 2023-04-12 00:01:00                 
334194 2023-04-12 00:02:00              buy

[334195 rows x 2 columns]
4szc88ey

4szc88ey1#

您可以在concat期间使用assign来执行此操作。然后,您可以删除要在其中创建列的.loc[]

merged_df['combined_datetime'] = pd.concat([bitcoin_df['Opentime'],
   buys[['datetime']].assign(transacation_type='buy'),
   sells[['datetime']].assign(transacation_type='sell')]).reset_index(drop=True)

这将在concat之前为每个买入和卖出 Dataframe 创建正确的交易类型的列。
OP编辑:

merged_df['time'] = merged_df['Opentime'].combine_first(merged_df['datetime'])
    merged_df = merged_df.drop(['Opentime', 'datetime'], axis=1)
    merged_df = merged_df[['time', 'transaction_type']]
    print(merged_df)
qlzsbp2j

qlzsbp2j2#

你必须在买入/卖出的指定值之后进行排序。和pandas concat dataframe按顺序,如在输入中首先bitcoin_df,然后买入,然后卖出。所以修改后的代码检查它是否工作

merged_df = pd.DataFrame()
merged_df['combined_datetime'] = pd.concat([bitcoin_df['Opentime'], 
buys['datetime'], sells['datetime']]).reset_index(drop=True)

merged_df['transaction_type'] = ''

** #The problem lies beyond here somewhere **
# Get the number of rows in the 'bitcoin_df' and 'buys' dataframes
bitcoin_rows = bitcoin_df.shape[0]
sell_rows = sells.shape[0]
buys_rows = buys.shape[0]

merged_df.loc[:bitcoin_rows-1, 'transaction_type'] = ''

 # Assign 'buy' to the 'transaction_type' column for 'buys' rows
merged_df.loc[bitcoin_row:bitcoin_row+buys_rows-1, 'transaction_type'] = 'buy'

# Assign 'sell' to the 'transaction_type' column for 'sells' rows
merged_df.loc[bitcoin_row+buys_rows:, 'transaction_type'] = 'sell'

merged_df = 
merged_df.sort_values(by='combined_datetime').reset_index(drop=True)
print(merged_df)

相关问题