Apache Spark: plotting geospatial (latitude/longitude) data in Databricks

fruv7luv · posted 2022-12-23 in Apache

Is it possible to visualize (latitude, longitude) data on a world map in an Azure Databricks notebook? A scatter plot is the closest I've gotten. The documentation mentions a Map (Markers) visualization, but I couldn't find any such plot type. Other approaches are also welcome.
Here is an example from the Databricks docs that loads some geospatial data:

%sql
CREATE TABLE IF NOT EXISTS default.sf_fire_incidents
USING com.databricks.spark.csv
OPTIONS (PATH 'dbfs:/databricks-datasets/learning-spark-v2/sf-fire/sf-fire-incidents.csv',
         HEADER "true",
         INFERSCHEMA "true");

SELECT location,
   `Estimated Property Loss`,
   coordinates[0]  AS LAT,
   coordinates[1] AS LONG
 FROM (SELECT location,
        `Estimated Property Loss`,
         split(btrim(location, '() '), ', ') AS coordinates
       FROM default.sf_fire_incidents
       ORDER BY `Estimated Property Loss` DESC,
                location)
 LIMIT 2000;

A scatter plot of (LAT, LONG) is the closest I've gotten, but I can't display the locations on a world map. Thanks!


zujrkrfu1#

I think your best option is the Python library folium.
Link to the documentation: https://python-visualization.github.io/folium/quickstart.html#Markers
You start by initializing a Map object.

import folium
# location refers to where the map will be centered at the beginning
m = folium.Map(location=[45.372, -121.6972], zoom_start=12, tiles="Stamen Terrain")

Then convert your query results to a Pandas DataFrame; you can do the following:

YOUR_QUERY = """
SELECT location,
   `Estimated Property Loss`,
   coordinates[0]  AS LAT,
   coordinates[1] AS LONG
 FROM (SELECT location,
        `Estimated Property Loss`,
         split(btrim(location, '() '), ', ') AS coordinates
       FROM default.sf_fire_incidents
       ORDER BY `Estimated Property Loss` DESC,
                location)
 LIMIT 2000
"""
pdf = spark.sql(YOUR_QUERY).toPandas()

Finally, iterate over the rows and add a marker to the map object for each one, like so:

for index, row in pdf.iterrows():
    folium.Marker([row["LAT"], row["LONG"]], popup=f"{index}").add_to(m)
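
One caveat: `split` in the SQL above yields string columns, so LAT and LONG arrive as strings rather than floats, and it may be safer to cast them before building markers. A minimal sketch, using hypothetical sample rows standing in for the query output:

```python
import pandas as pd

# Hypothetical rows mimicking the query result: coordinates arrive as strings
pdf = pd.DataFrame({"LAT": ["37.7749", "37.8044"],
                    "LONG": ["-122.4194", "-122.2712"]})

# Cast the coordinate columns to floats before passing them to folium
pdf["LAT"] = pdf["LAT"].astype(float)
pdf["LONG"] = pdf["LONG"].astype(float)
```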

Don't forget to display the map at the end by evaluating the object as the last expression in the cell:

m
