numpy Python - Distance matrix between geographic coordinates

monwx1rj  posted on 2023-02-08  in  Python
Follows (0) | Answers (3) | Views (202)

I have a Pandas DataFrame with more than 600 geographic coordinate points. Here is an excerpt:

import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
from math import sin, cos, sqrt, atan2, radians

lat_long = pd.DataFrame({'LATITUDE':[-22.98, -22.97, -22.92, -22.87, -22.89], 'LONGITUDE': [-43.19, -43.39, -43.24, -43.28, -43.67]})
lat_long

To compute the distance between two points manually, I use the following code:

lat1 = radians(lat_long['LATITUDE'][0])
lon1 = radians(lat_long['LONGITUDE'][0])
lat2 = radians(lat_long['LATITUDE'][1])
lon2 = radians(lat_long['LONGITUDE'][1])

R = 6373.0

dlon = lon2 - lon1
dlat = lat2 - lat1

a = sin(dlat / 2)**2 + cos(lat1) * cos(lat2) * sin(dlon / 2)**2
c = 2 * atan2(sqrt(a), sqrt(1 - a))

distance = R * c

print("Result:", round(distance,4))

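What I need is a function that uses the formula above to compute the distance from every point to every other point, as in a matrix. However, I am having a hard time working out what function to use to store the distances between points. Any help is welcome. Example output (for illustration only, in case I have not been clear):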

|       | point0 | point1 | point2 |
|point0 |    0   |    2   |   3    |
|point1 |    2   |    0   |   4    |
|point2 |    3   |    4   |   0    |
        |distance|distance|distance|

ac1kyiln1#

You can use pdist to compute the pairwise distances:

import pandas as pd

import numpy as np
from math import sin, cos, sqrt, atan2, radians

from scipy.spatial.distance import pdist, squareform

lat_long = pd.DataFrame({'LATITUDE': [-22.98, -22.97, -22.92, -22.87, -22.89], 'LONGITUDE': [-43.19, -43.39, -43.24, -43.28, -43.67]})

def dist(x, y):
    """Function to compute the distance between two points x, y"""

    lat1 = radians(x[0])
    lon1 = radians(x[1])
    lat2 = radians(y[0])
    lon2 = radians(y[1])

    R = 6373.0

    dlon = lon2 - lon1
    dlat = lat2 - lat1

    a = sin(dlat / 2) ** 2 + cos(lat1) * cos(lat2) * sin(dlon / 2) ** 2
    c = 2 * atan2(sqrt(a), sqrt(1 - a))

    distance = R * c

    return round(distance, 4)

distances = pdist(lat_long.values, metric=dist)

points = [f'point_{i}' for i in range(1, len(lat_long) + 1)]

result = pd.DataFrame(squareform(distances), columns=points, index=points)

print(result)
Output:
point_1  point_2  point_3  point_4  point_5
point_1   0.0000  20.5115   8.4123  15.3203  50.1784
point_2  20.5115   0.0000  16.3400  15.8341  30.0319
point_3   8.4123  16.3400   0.0000   6.9086  44.1838
point_4  15.3203  15.8341   6.9086   0.0000  40.0284
point_5  50.1784  30.0319  44.1838  40.0284   0.0000

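Note that pdist returns the distances in condensed form (a 1-D vector with one entry per pair of points); squareform converts that vector into a full, dense square matrix, so the result is stored in a NumPy array.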
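With 600+ points, pdist calls the Python function dist once per pair (roughly 180,000 calls). If that turns out to be slow, the same haversine formula can be evaluated for all pairs at once with NumPy broadcasting. This is only a sketch and not part of the answer above: haversine_matrix and its radius_km parameter are names made up for illustration, and R = 6373.0 is taken from the question.

```python
import numpy as np
import pandas as pd

def haversine_matrix(lat_deg, lon_deg, radius_km=6373.0):
    """Return the full pairwise haversine distance matrix in km (hypothetical helper)."""
    lat = np.radians(np.asarray(lat_deg, dtype=float))
    lon = np.radians(np.asarray(lon_deg, dtype=float))

    # A column vector minus a row vector broadcasts to an (n, n) array of differences.
    dlat = lat[:, None] - lat[None, :]
    dlon = lon[:, None] - lon[None, :]

    a = np.sin(dlat / 2) ** 2 + np.cos(lat[:, None]) * np.cos(lat[None, :]) * np.sin(dlon / 2) ** 2
    # Guard against rounding that could push a slightly outside [0, 1].
    a = np.clip(a, 0.0, 1.0)
    c = 2 * np.arctan2(np.sqrt(a), np.sqrt(1 - a))
    return radius_km * c

lat_long = pd.DataFrame({'LATITUDE': [-22.98, -22.97, -22.92, -22.87, -22.89],
                         'LONGITUDE': [-43.19, -43.39, -43.24, -43.28, -43.67]})

points = [f'point_{i}' for i in range(1, len(lat_long) + 1)]
result = pd.DataFrame(haversine_matrix(lat_long['LATITUDE'], lat_long['LONGITUDE']),
                      index=points, columns=points)
print(result.round(4))
```

This builds the square matrix directly, so no squareform step is needed. The trade-off is memory: several temporary n x n arrays are allocated, which is fine for a few hundred or a few thousand points.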

v8wbuo2f2#

Another possible solution:

import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
from math import sin, cos, sqrt, atan2, radians

lat_long = pd.DataFrame({'LATITUDE':[-22.98, -22.97, -22.92, -22.87, -22.89], 'LONGITUDE': [-43.19, -43.39, -43.24, -43.28, -43.67]})
lat_long

def distance(city1, city2):
    lat1 = radians(city1['LATITUDE'])
    lon1 = radians(city1['LONGITUDE'])
    lat2 = radians(city2['LATITUDE'])
    lon2 = radians(city2['LONGITUDE'])

    R = 6373.0

    dlon = lon2 - lon1
    dlat = lat2 - lat1

    a = sin(dlat / 2)**2 + cos(lat1) * cos(lat2) * sin(dlon / 2)**2
    c = 2 * atan2(sqrt(a), sqrt(1 - a))

    distance = R * c

    return distance

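# Fill only the upper triangle. This relies on the default RangeIndex,
# so the index labels i1 and i2 are also the row positions.
dist = np.zeros([lat_long.shape[0], lat_long.shape[0]])
for i1, city1 in lat_long.iterrows():
    for i2, city2 in lat_long.iloc[i1 + 1:, :].iterrows():
        dist[i1, i2] = distance(city1, city2)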

print(dist)
Output:
[[ 0.         20.51149047  8.41230771 15.32026132 50.17836849]
 [ 0.          0.         16.33997119 15.83407186 30.03192954]
 [ 0.          0.          0.          6.90864606 44.18376436]
 [ 0.          0.          0.          0.         40.02842872]
 [ 0.          0.          0.          0.          0.        ]]

The lower triangle of the distance matrix is left as zeros because the matrix is symmetric (dist[i1, i2] == dist[i2, i1]).
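If the full symmetric matrix is needed (as in the question's example), the upper triangle can be mirrored by adding the transpose. A minimal sketch, using the values printed above; the names upper and full are only illustrative:

```python
import numpy as np

# Upper-triangular distances copied from the output above (km).
upper = np.array([
    [0.0, 20.51149047,  8.41230771, 15.32026132, 50.17836849],
    [0.0,  0.0,        16.33997119, 15.83407186, 30.03192954],
    [0.0,  0.0,         0.0,         6.90864606, 44.18376436],
    [0.0,  0.0,         0.0,         0.0,        40.02842872],
    [0.0,  0.0,         0.0,         0.0,         0.0],
])

# The diagonal and lower triangle are zero, so adding the transpose
# mirrors the upper triangle without double counting anything.
full = upper + upper.T
print(full)
```

With the loop above, the same step would be dist + dist.T.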

4urapxun3#

Using GeoPandas:

import pandas as pd
import geopandas as gpd

lat_long = pd.DataFrame({'LATITUDE':[-22.98, -22.97, -22.92, -22.87, -22.89], 'LONGITUDE': [-43.19, -43.39, -43.24, -43.28, -43.67]})

# Convert Pandas dataframe to GeoPandas dataframe
gdf = gpd.GeoDataFrame(
    lat_long,
    geometry=gpd.points_from_xy(lat_long['LONGITUDE'], lat_long['LATITUDE']),
    crs='EPSG:4326' # Or change to what's appropriate for you.
)

# Calculate distances between points
distances = []
for _, row in gdf.iterrows():
    distances.append(gdf['geometry'].distance(row['geometry'])*100)

# Create data frame of distances
distances_df = pd.DataFrame.from_records(distances)
print(distances_df)

Output:
| | 0 | 1 | 2 | 3 | 4 |
| ------ | ------ | ------ | ------ | ------ | ------ |
| 0 | 0.000000 | 20.024984 | 7.810250 | 14.212670 | 48.836462 |
| 1 | 20.024984 | 0.000000 | 15.811388 | 14.866069 | 29.120440 |
| 2 | 7.810250 | 15.811388 | 0.000000 | 6.403124 | 43.104524 |
| 3 | 14.212670 | 14.866069 | 6.403124 | 0.000000 | 39.051248 |
| 4 | 48.836462 | 29.120440 | 43.104524 | 39.051248 | 0.000000 |
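Note that because of the coordinate reference system (CRS), this output may differ from the other answers: with EPSG:4326 the coordinates are in degrees, so distance returns planar distances in degrees, which the code then multiplies by 100. Find a CRS that suits your data here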
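To make the difference concrete, here is a rough sketch in plain NumPy (not GeoPandas) for the first two points. It reproduces the degrees-times-100 figure and compares it with an equirectangular approximation in kilometres. The radius R = 6373.0 is taken from the question, and the variable names are only illustrative.

```python
import numpy as np

lat = np.array([-22.98, -22.97])
lon = np.array([-43.19, -43.39])

# Planar distance in degrees, scaled by 100 as in the answer above.
deg_dist = np.hypot(lon[1] - lon[0], lat[1] - lat[0]) * 100

# Equirectangular approximation in km: one degree of latitude is about
# 111 km, and a degree of longitude shrinks by cos(latitude).
R = 6373.0
km_per_deg = np.pi * R / 180
mean_lat = np.radians(lat.mean())
dx = (lon[1] - lon[0]) * np.cos(mean_lat) * km_per_deg
dy = (lat[1] - lat[0]) * km_per_deg
km_dist = np.hypot(dx, dy)

print(f"degrees * 100: {deg_dist:.4f}")
print(f"approx. km:    {km_dist:.4f}")
```

The two numbers are close at this latitude but not equal, which is consistent with the small gaps between this table and the haversine results in the other answers. Reprojecting to a projected CRS in metres would avoid the ad hoc factor of 100.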
