I have a pandas DataFrame with the following structure:
TimeStamp
2022-12-30 10:31:58.483700+00:00    1  FixType                  4   4.000000e+00
2022-12-30 10:31:58.483700+00:00    1  Satellites              11   1.100000e+01
2022-12-30 10:31:58.484150+00:00    2  TimeConfirmed            0   0.000000e+00
2022-12-30 10:31:58.484150+00:00    2  Epoch          63797521999   1.641638e+09
2022-12-30 10:31:58.484150+00:00    2  TimeValid                1   1.000000e+00
...                               ...  ...                    ...            ...
2022-12-30 10:54:32.714050+00:00    9  AngularRateZ          1020  -1.000000e+00
2022-12-30 10:54:32.714050+00:00    9  AccelerationY          513   1.250000e-01
2022-12-30 10:54:32.714050+00:00    9  AccelerationZ          594   1.025000e+01
2022-12-30 10:54:32.714050+00:00    9  AngularRateX          1025   2.500000e-01
2022-12-30 10:54:32.714050+00:00    9  ImuValid                 1   1.000000e+00
[973528 rows x 4 columns]
I need to reshape it into the following structure while resampling it to a specific frequency (e.g. 1S):
                           FixType  Satellites  ...  AngularRateZ  ImuValid
TimeStamp                                       ...
2022-12-30 10:31:59+00:00      4.0        11.0  ...           NaN       NaN
2022-12-30 10:32:00+00:00      4.0        11.0  ...         -1.00       1.0
2022-12-30 10:32:01+00:00      4.0        12.0  ...         -1.00       1.0
2022-12-30 10:32:02+00:00      4.0        12.0  ...         -1.00       1.0
2022-12-30 10:32:03+00:00      4.0        12.0  ...         -1.00       1.0
...                            ...         ...  ...           ...       ...
2022-12-30 10:54:28+00:00      4.0        13.0  ...         -1.00       1.0
2022-12-30 10:54:29+00:00      4.0        14.0  ...         -1.00       1.0
2022-12-30 10:54:30+00:00      4.0        14.0  ...         -0.75       1.0
2022-12-30 10:54:31+00:00      4.0        14.0  ...         -1.00       1.0
2022-12-30 10:54:32+00:00      4.0        14.0  ...         -1.00       1.0
[1354 rows x 39 columns]
Currently I achieve this with the following code:
def restructure_data(df_phys, res):
    import pandas as pd
    df_phys_join = pd.DataFrame({"TimeStamp": []})
    if not df_phys.empty:
        for message, df_phys_message in df_phys.groupby("CAN ID"):
            for signal, data in df_phys_message.groupby("Signal"):
                col_name = signal
                df_phys_join = pd.merge_ordered(
                    df_phys_join,
                    data["Physical Value"].rename(col_name).resample(res).ffill().dropna(),
                    on="TimeStamp",
                    fill_method="none",
                ).set_index("TimeStamp")
    return df_phys_join
This works, but it seems inefficient. Is there a smarter, perhaps more Pythonic, way to get a similar result?
1 Answer
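Without more data or information (for example, are there duplicate combinations of timestamp and new column that need handling?), it is hard to say whether this will work, but you could use a one-liner:
Fill in the correct column names from your original df for values= and columns=.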
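One way such a pivot-then-resample one-liner might look, as a minimal sketch rather than a drop-in replacement. It assumes TimeStamp is the (named) DatetimeIndex and that the column names "Signal" and "Physical Value" from the question's code are the ones to use for columns= and values=. The function name restructure_wide and the demo frame are hypothetical. Using pivot_table with aggfunc="last" is my own assumption for resolving duplicate timestamp/signal pairs, and "1s" is used instead of "1S" because recent pandas versions prefer the lowercase alias.

```python
import pandas as pd


def restructure_wide(df, res="1s"):
    """Pivot long-format signals to one column per signal, then resample.

    Hypothetical helper: assumes a DatetimeIndex named "TimeStamp" and the
    columns "Signal" and "Physical Value" taken from the question's code.
    """
    wide = (
        df.reset_index()
        # One column per signal; on duplicate (TimeStamp, Signal) pairs keep
        # the last value (an assumption, pick another aggfunc if needed).
        .pivot_table(
            index="TimeStamp",
            columns="Signal",
            values="Physical Value",
            aggfunc="last",
        )
        # Carry each signal's last known value forward across the rows where
        # only other signals were observed.
        .ffill()
        # Sample the filled frame at the requested frequency: each label gets
        # the most recent row at or before it.
        .resample(res)
        .ffill()
        # Drop leading rows that precede the first observation.
        .dropna(how="all")
    )
    wide.columns.name = None
    return wide


# Tiny made-up frame in the same long format as the question.
idx = pd.to_datetime(
    [
        "2022-12-30 10:31:58.4837+00:00",
        "2022-12-30 10:31:58.4837+00:00",
        "2022-12-30 10:31:59.2000+00:00",
        "2022-12-30 10:32:00.7000+00:00",
        "2022-12-30 10:32:01.1000+00:00",
    ],
    utc=True,
)
df_demo = pd.DataFrame(
    {
        "CAN ID": [1, 1, 9, 1, 9],
        "Signal": ["FixType", "Satellites", "ImuValid", "Satellites", "ImuValid"],
        "Physical Value": [4.0, 11.0, 1.0, 12.0, 1.0],
    },
    index=pd.DatetimeIndex(idx, name="TimeStamp"),
)

wide_demo = restructure_wide(df_demo, "1s")
print(wide_demo)
```

This will not necessarily match the loop-based version exactly: here a signal that stops early keeps being forward-filled to the end of the frame, whereas the per-signal resample in the question ends at each signal's own last sample. Whether that matters depends on your data.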