我有一堆分钟数据,如下所示:
index timestamp open high low ... trade_count vwap symbol cum_volume cum_volume_25k
0 2021-11-02 13:30:00+00:00 2021-11-02 13:30:00+00:00 12.000 12.055 12.00 ... 62 12.003693 AMCR 47208 0
1 2021-11-02 13:31:00+00:00 2021-11-02 13:31:00+00:00 12.040 12.070 12.04 ... 117 12.055675 AMCR 67656 0
2 2021-11-02 13:32:00+00:00 2021-11-02 13:32:00+00:00 12.050 12.050 12.03 ... 64 12.040067 AMCR 74411 0
3 2021-11-02 13:33:00+00:00 2021-11-02 13:33:00+00:00 12.035 12.035 12.01 ... 95 12.021537 AMCR 83597 0
4 2021-11-02 13:34:00+00:00 2021-11-02 13:34:00+00:00 12.035 12.035 12.02 ... 16 12.028696 AMCR 84855 0
... ... ... ... ... ... ... ... ... ... ... ...
429000 2022-10-21 19:59:00+00:00 2022-10-21 19:59:00+00:00 29.850 29.940 29.85 ... 298 29.894083 YETI 688656 0
429001 2022-10-21 20:00:00+00:00 2022-10-21 20:00:00+00:00 29.890 29.890 29.89 ... 15 29.890000 YETI 776882 0
429002 2022-10-21 20:08:00+00:00 2022-10-21 20:08:00+00:00 29.890 29.890 29.89 ... 1 29.890000 YETI 781148 0
429003 2022-10-21 22:06:00+00:00 2022-10-21 22:06:00+00:00 30.060 30.060 30.06 ... 3 30.060000 YETI 782348 0
429004 2022-10-21 22:44:00+00:00 2022-10-21 22:44:00+00:00 30.060 30.060 30.06 ... 1 30.060000 YETI 783148 0
[429005 rows x 12 columns]
然后我摸索着一天又一天的时间:
df['timestamp'] = pd.to_datetime(df['timestamp'])
days = df.groupby(df['timestamp'].dt.normalize().unique())
然后我一天一天地循环:
list_of_df = []
for index, name in days:
list_of_df.append(name)
final = pd.concat(list_of_df)
print(final[['symbol', 'timestamp']])
它返回:
symbol timestamp
95 AMCR 2021-11-02 15:05:00+00:00
186 AMCR 2021-11-02 16:37:00+00:00
187 AMCR 2021-11-02 16:38:00+00:00
0 AMCR 2021-11-02 13:30:00+00:00
212 AMCR 2021-11-02 17:03:00+00:00
.. ... ...
210 AMCR 2021-11-02 17:01:00+00:00
211 AMCR 2021-11-02 17:02:00+00:00
138 AMCR 2021-11-02 15:49:00+00:00
192 AMCR 2021-11-02 16:43:00+00:00
58 AMCR 2021-11-02 14:28:00+00:00
[230 rows x 2 columns]
我怎么才能让它只返回唯一的日子呢?E.g
symbol timestamp
95 AMCR 2021-11-02 00:00:00+00:00
97 APPL 2021-11-02 00:00:00+00:00
100 APPL 2021-11-03 00:00:00+00:00
102 APPL 2021-11-06 00:00:00+00:00
[230 rows x 2 columns]
1条答案
按热度按时间tv6aics11#
给定(我更改了时间戳以获得更多变化):
假设您希望每个符号都有日期,请执行以下操作:
产出: