pandas: How to efficiently select some data from a large trajectory set

Posted by czq61nw1 on 2023-11-15 in Other


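The situation
I have a DataFrame that contains one set (actually three sets, but let's start with one) of coordinates (X, Y), i.e. a trajectory stored in two columns.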

import numpy as np
import pandas as pd
import plotly.graph_objects as go

# Create data for trajectories: three concentric circles sampled at the same angles
n_points = 500
theta = np.linspace(0, 2 * np.pi, n_points)
data = pd.DataFrame({
    'X': 10 * np.cos(theta),
    'Y': 10 * np.sin(theta),
    'XL': 11 * np.cos(theta),
    'YL': 11 * np.sin(theta),
    'XR': 9 * np.cos(theta),
    'YR': 9 * np.sin(theta)
})

fig = go.Figure()

# Add traces for trajectories
fig.add_trace(go.Scatter(x=data['X'], y=data['Y'], mode='lines+markers', name='Circle 1', line=dict(color='blue')))
fig.add_trace(go.Scatter(x=data['XL'], y=data['YL'], mode='lines+markers', name='Circle 2', line=dict(color='red')))
fig.add_trace(go.Scatter(x=data['XR'], y=data['YR'], mode='lines+markers', name='Circle 3', line=dict(color='green')))

# Define the rectangle parameters
XC, YC = 7.07, 7.07  # Center of the rectangle
XD, YD = 1, -1  # Direction of the rectangle
width = 3
height = 4

# Calculate the corner points of the rectangle using trigonometric functions
theta = np.arctan2(YD, XD)
cos_theta = np.cos(theta)
sin_theta = np.sin(theta)

rectangle_x = [XC + 0.5 * width * cos_theta - 0.5 * height * sin_theta,
               XC - 0.5 * width * cos_theta - 0.5 * height * sin_theta,
               XC - 0.5 * width * cos_theta + 0.5 * height * sin_theta,
               XC + 0.5 * width * cos_theta + 0.5 * height * sin_theta,
               XC + 0.5 * width * cos_theta - 0.5 * height * sin_theta]

rectangle_y = [YC + 0.5 * width * sin_theta + 0.5 * height * cos_theta,
               YC - 0.5 * width * sin_theta + 0.5 * height * cos_theta,
               YC - 0.5 * width * sin_theta - 0.5 * height * cos_theta,
               YC + 0.5 * width * sin_theta - 0.5 * height * cos_theta,
               YC + 0.5 * width * sin_theta + 0.5 * height * cos_theta]

# Add a trace for the rectangle
fig.add_trace(go.Scatter(x=rectangle_x, y=rectangle_y, fill='toself',  name='Rectangle'))

# Customize layout
fig.update_layout(
    title='Trajectories and Rectangle Plot',
    xaxis_title='X-axis',
    yaxis_title='Y-axis',
    showlegend=True
)
# Set axis aspect ratio to ensure circles appear as circles
fig.update_xaxes(scaleanchor="y", scaleratio=1)
fig.update_yaxes(scaleanchor="x", scaleratio=1)

fig.show()

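This gives us a plot of the three circular trajectories with the rotated rectangle drawn over them (image not preserved).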
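Note that in this example I used a perfectly circular trajectory, but it could be anything, as long as it is continuous.
What I want
As the plot shows, there is a rectangle. Its center (XC, YC) lies on a point of the X, Y trajectory.
What I want is to get only the data contained in the rectangle (I am not sure in what format, maybe a sub-DataFrame?), and then plot it to confirm.
I realize I could compare every point to check whether it lies inside the rectangle, but I have 500 points here and I expect the real trajectories to have many more (50,000 points). I suspect checking all of them might take too long.
Is there a way to efficiently select a given geographic region (the rectangle) from the original data?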

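(Note for later) After selecting the data, I would like to apply a coordinate transformation so I can plot it in a coordinate system where (XC, YC) is the origin and the rectangle's sides are parallel to the X and Y axes. But first I need the data.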

pandas

Source: https://stackoverflow.com/questions/77364235/how-to-efficiently-select-some-data-from-a-large-trajectory-set

2 Answers


6pp0gazn1#

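One possible approach is to change the frame of reference with a translation to (XC, YC) and then rotate all points by theta. All operations are vectorized: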

def isin(X, Y):
    X = X - XC
    Y = Y - YC
    X_ = X*cos_theta + Y*sin_theta
    Y_ = -X*sin_theta + Y*cos_theta
    return (-width/2 <= X_) & (X_ <= width/2) & (-height/2 <= Y_) & (Y_ <= height/2)

m1 = isin(data['X'], data['Y'])
m2 = isin(data['XL'], data['YL'])
m3 = isin(data['XR'], data['YR'])

Output:

>>> m1.sum(), m2.sum(), m3.sum()
(24, 22, 26)

import matplotlib.pyplot as plt

plt.scatter(data.loc[m1, 'X'], data.loc[m1, 'Y'], c='b')
plt.scatter(data.loc[m2, 'XL'], data.loc[m2, 'YL'], c='r')
plt.scatter(data.loc[m3, 'XR'], data.loc[m3, 'YR'], c='g')
plt.plot(rectangle_x, rectangle_y, c='k')
plt.axis('equal')
plt.show()

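(Image not preserved: scatter plot of the selected points from each trajectory inside the rectangle outline.)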
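The same translation and rotation also yields the sub-DataFrame and the rectangle-frame coordinates the question asks about. A minimal sketch, assuming the question's rectangle parameters; the helper name `select_in_rectangle` and the output columns `u` and `v` are made up for illustration:

```python
import numpy as np
import pandas as pd

# Rectangle parameters copied from the question
XC, YC = 7.07, 7.07          # center of the rectangle
XD, YD = 1, -1               # direction of the rectangle
width, height = 3, 4
theta = np.arctan2(YD, XD)
cos_theta, sin_theta = np.cos(theta), np.sin(theta)

def select_in_rectangle(df, xcol, ycol):
    """Hypothetical helper: rows of df inside the rectangle, with added
    columns u, v giving each point in the rectangle's own frame."""
    dx = df[xcol] - XC
    dy = df[ycol] - YC
    # u runs along (XD, YD), v is perpendicular to it; origin is (XC, YC)
    u = dx * cos_theta + dy * sin_theta
    v = -dx * sin_theta + dy * cos_theta
    inside = (u.abs() <= width / 2) & (v.abs() <= height / 2)
    out = df.loc[inside, [xcol, ycol]].copy()
    out['u'] = u[inside]
    out['v'] = v[inside]
    return out

# Rebuild the first trajectory from the question
n_points = 500
t = np.linspace(0, 2 * np.pi, n_points)
data = pd.DataFrame({'X': 10 * np.cos(t), 'Y': 10 * np.sin(t)})

sub = select_in_rectangle(data, 'X', 'Y')
print(sub.head())
```

Plotting `sub['u']` against `sub['v']` would show the selected points with the rectangle's sides parallel to the axes. The original index is kept, so `sub` can still be matched back to rows of `data`.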

2023-11-15

jm81lzqq2#

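I would start by defining a bounding box aligned with the original coordinate axes, [x_left, x_right, y_top, y_bottom], and filter out all data points outside it (pandas index slicing should be very efficient for this).
After that you have no choice but to check whether the remaining points really lie inside the actual rectangle. The fastest way is probably to rotate the coordinates so that the rectangle is aligned with the new axes, then use index slicing again to get the final set, which should give you what you want.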
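The two stages described above might look like the following sketch. The function name `select_two_stage` and its parameter names are assumptions for illustration, and boolean masks with `Series.between` stand in for the "index slicing" the answer mentions. Whether the pre-filter actually saves time over a single vectorized test is not measured here.

```python
import numpy as np
import pandas as pd

def select_two_stage(df, xcol, ycol, xc, yc, xd, yd, width, height):
    """Hypothetical helper: axis-aligned pre-filter, then exact rotated test."""
    theta = np.arctan2(yd, xd)
    c, s = np.cos(theta), np.sin(theta)

    # Half-extents of the axis-aligned box enclosing the rotated rectangle
    half_x = 0.5 * (abs(width * c) + abs(height * s))
    half_y = 0.5 * (abs(width * s) + abs(height * c))

    # Stage 1: cheap filter against the axis-aligned bounding box
    coarse = df[df[xcol].between(xc - half_x, xc + half_x)
                & df[ycol].between(yc - half_y, yc + half_y)]

    # Stage 2: exact test on the survivors, in the rectangle's own frame
    dx = coarse[xcol] - xc
    dy = coarse[ycol] - yc
    u = dx * c + dy * s
    v = -dx * s + dy * c
    return coarse[(u.abs() <= width / 2) & (v.abs() <= height / 2)]

# First trajectory and rectangle parameters from the question
n_points = 500
t = np.linspace(0, 2 * np.pi, n_points)
data = pd.DataFrame({'X': 10 * np.cos(t), 'Y': 10 * np.sin(t)})

result = select_two_stage(data, 'X', 'Y', 7.07, 7.07, 1, -1, 3, 4)
print(len(result))
```

The bounding box is a superset of the rectangle, so stage 1 can only discard points that stage 2 would reject anyway, and the result matches a single-pass rotated test.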

2023-11-15
