R: troubleshooting an API coordinate request loop

Posted by dbf7pr2w on 2023-01-18 in Other

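I want to find the coordinates for a list of addresses.
The dataset I'm using can be found here: https://www.data.gv.at/katalog/dataset/kaufpreissammlung-liegenschaften-wien
I read it in with read_csv as "data", and I'm using the tidyverse and jsonlite libraries. The only relevant columns are "Straße", the street name, and "ON", the street number. All addresses are in Vienna, Austria.
I'm using OpenStreetMap and have formatted the address data the way the API expects: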

data$formatted_address <- paste(ifelse(is.na(data$ON), "", data$ON), "+", tolower(data$Straße), ",+vienna", sep = "")

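This formats the addresses in that column as, for example, 1+milanweg,+vienna or 12+granergasse,+vienna. When I enter one of these into the API URL by hand, everything works and I get coordinates: https://nominatim.openstreetmap.org/search?q=1+milanweg,+vienna&format=json&polygon=1&addressdetails=1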
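For reference, the manual URL above is just a fixed prefix, the formatted address, and a fixed suffix. A minimal sketch in base R, where build_search_url is a hypothetical helper name used only for illustration:

```r
# build_search_url() is a hypothetical helper, not part of any package.
# It wraps a formatted address in the Nominatim search URL shown above.
build_search_url <- function(address) {
  paste0(
    "https://nominatim.openstreetmap.org/search?q=",
    URLencode(address),  # with the default reserved = FALSE, "+" and "," are kept
    "&format=json&polygon=1&addressdetails=1"
  )
}

url <- build_search_url("1+milanweg,+vienna")
print(url)
```

For this address the result should be the same URL that works when entered by hand.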
Since I now want to do this for every row, I build the requests in R and parse the responses with jsonlite.

data$coordinates <- data.frame(lat = NA, lon = NA)
for (i in 1:nrow(data)) {
  result <- try(readLines(paste0("https://nominatim.openstreetmap.org/search?q=", 
                                 URLencode(data$formatted_address[i]), "&format=json&polygon=1&addressdetails=1")), 
                silent = TRUE)
  if (!inherits(result, "try-error")) {
    if (length(result) > 0) {
      result <- fromJSON(result)
      if (length(result) > 0 && is.list(result[[1]])) {
        data$coordinates[i, ] <- c(result[[1]]$lat, result[[1]]$lon)
      }
    }
  }
}

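In theory this should produce exactly the same API requests, but the lat and lon columns are always empty.
How can I fix this script so that it produces the coordinates for every address in the dataset?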

Answer 1, by xu3bshqb

Data setup

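library(tidyverse)
library(httr2)

# Assumes df already holds the dataset with lower-case column names
# (strasse, on), as in the printed tibble.
df <- df %>%
  mutate(
    # house number (empty if missing) + lower-cased street name + city
    formatted_address = str_c(
      if_else(is.na(on), "", on), "+", str_to_lower(strasse), "+vienna"
    ) %>% str_remove_all(" ")  # drop spaces inside street names and numbers
  )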

# A tibble: 57,912 × 7
   kg_code katastralgemeinde      ez   plz strasse                   on      formatted_address                
     <dbl> <chr>               <dbl> <dbl> <chr>                     <chr>   <chr>                            
 1    1617 Strebersdorf         1417  1210 Mühlweg                   13      13+mühlweg+vienna                
 2    1607 Groß Jedlersdorf II   193  1210 Bahnsteggasse             4       4+bahnsteggasse+vienna           
 3    1209 Ober St.Veit         3570  1130 Jennerplatz               34/20   34/20+jennerplatz+vienna         
 4    1207 Lainz                 405  1130 Sebastian-Brunner-Gasse   6       6+sebastian-brunner-gasse+vienna 
 5    1101 Favoriten            3831  1100 Laxenburger Straße        2C -2 D 2C-2D+laxenburgerstraße+vienna   
 6    1101 Favoriten            3827  1100 Laxenburger Straße        2 C     2C+laxenburgerstraße+vienna      
 7    1101 Favoriten            3836  1100 hinter Laxenburger Straße 2 C     2C+hinterlaxenburgerstraße+vienna
 8    1201 Auhof                 932  1130 Keplingergasse            10      10+keplingergasse+vienna         
 9    1213 Speising              135  1130 Speisinger Straße         29      29+speisingerstraße+vienna       
10    1107 Simmering            2357  1100 BATTIGGASSE               44      44+battiggasse+vienna            
# … with 57,902 more rows
# ℹ Use `print(n = ...)` to see more rows
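To see what the formatting step does without loading the full dataset, here is a minimal sketch on three rows copied from the outputs in this answer. The names sample_df and formatted are hypothetical and exist only for this illustration.

```r
library(dplyr)
library(stringr)
library(tibble)

# Three rows taken from the printed tibbles; sample_df is a hypothetical name.
sample_df <- tibble(
  strasse = c("Sebastian-Brunner-Gasse", "MITTLERE STRASSE", "Jennerplatz"),
  on      = c("6", NA, "34/20")
)

formatted <- sample_df %>%
  mutate(
    formatted_address = str_c(
      if_else(is.na(on), "", on), "+", str_to_lower(strasse), "+vienna"
    ) %>% str_remove_all(" ")
  )

print(formatted$formatted_address)
# "6+sebastian-brunner-gasse+vienna" "+mittlerestrasse+vienna" "34/20+jennerplatz+vienna"
```

Note that removing spaces also joins multi-word street names ("mittlerestrasse"), and a missing house number leaves a leading "+".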

API call and coordinate retrieval. I collect the display name of the API's match along with the latitude and longitude.

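# Query Nominatim for one formatted address and return its first match.
get_coords <- function(address) {
  cat("Getting coordinates", address, "\n")  # progress message
  str_c(
    "https://nominatim.openstreetmap.org/search?q=",
    address,
    "&format=json&polygon=1&addressdetails=1"
  ) %>%
    request() %>%
    req_perform() %>%
    # simplifyVector = TRUE turns the JSON array into a data frame
    resp_body_json(simplifyVector = TRUE) %>%
    as_tibble() %>%
    select(api_name = display_name,
           lat, lon) %>%
    slice(1)  # keep only the first match
}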
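The function above relies on the JSON array being simplified into a data frame with one row per match. That same simplification is a likely reason the loop in the question never filled lat and lon: jsonlite's fromJSON() simplifies by default, so result[[1]] is the first column of a data frame rather than the first match, and the is.list() check fails. A minimal offline sketch, using a made-up, shortened stand-in for a Nominatim response (sample_json is hypothetical data, not real API output):

```r
library(jsonlite)

# Hypothetical, shortened stand-in for a Nominatim response with two matches.
sample_json <- '[
  {"place_id": 1, "lat": "48.2", "lon": "16.4", "display_name": "first match"},
  {"place_id": 2, "lat": "48.3", "lon": "16.5", "display_name": "second match"}
]'

parsed <- fromJSON(sample_json)  # default simplification yields a data frame

print(is.data.frame(parsed))  # TRUE: one row per match, one column per field
print(is.list(parsed[[1]]))   # FALSE: [[1]] is the place_id column, not a match

# Index by row and column name to get the first match's coordinates.
first_match <- parsed[1, c("lat", "lon")]
print(first_match)
```

Under this reading, replacing result[[1]]$lat with result$lat[1] (and likewise for lon) and dropping the is.list() check would be the smallest change to the original loop.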

# Try 10 random rows first; a failed lookup yields a row of NAs instead of an error.
df %>%
  slice_sample(n = 10) %>%
  mutate(coordinates = map(
    formatted_address, possibly(get_coords, tibble(
      api_name = NA_character_,
      lat = NA_character_,
      lon = NA_character_
    ))
  )) %>%
  unnest(coordinates)

# A tibble: 10 × 10
   kg_code katastralgemeinde      ez   plz strasse               on    formatted_…¹ api_n…² lat   lon  
     <dbl> <chr>               <dbl> <dbl> <chr>                 <chr> <chr>        <chr>   <chr> <chr>
 1    1651 Aspern               3374  1220 ERLENWEG              8     8+erlenweg+… 8, Erl… 48.2… 16.4…
 2    1613 Leopoldau            6617  1210 Oswald-Redlich-Straße 31    31+oswald-r… 31, Os… 48.2… 16.4…
 3    1006 Landstraße           2425  1030 HAGENMÜLLERGASSE      45018 45018+hagen… Hagenm… 48.1… 16.4…
 4    1101 Favoriten             541  1100 HERNDLGASSE           7     7+herndlgas… 7, Her… 48.1… 16.3…
 5    1607 Groß Jedlersdorf II   221  1210 Prager Straße         70    70+pragerst… Prager… 48.2… 16.3…
 6    1006 Landstraße           1184  1030 PAULUSGASSE           2     2+paulusgas… 2, Pau… 48.1… 16.3…
 7    1654 Eßling               2712  1220 KAUDERSSTRASSE        61    61+kauderss… 61, Ka… 48.2… 16.5…
 8    1401 Dornbach             2476  1170 Alszeile              NA    +alszeile+v… Alszei… 48.2… 16.2…
 9    1654 Eßling                745  1220 Kirschenallee         19    19+kirschen… 19, Ki… 48.2… 16.5…
10    1204 Hadersdorf           3139  1140 MITTLERE STRASSE      NA    +mittlerest… Mittle… 48.2… 16.1…
# … with abbreviated variable names ¹​formatted_address, ²​api_name
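Before running this on all rows instead of a sample of 10, it may help to pause between requests, since Nominatim's public usage policy asks for at most one request per second. The sketch below shows one way to combine a pause with the possibly() fallback used above. geocode_all and fake_geocode are hypothetical names, and fake_geocode is an offline stand-in for get_coords so the example runs without network access.

```r
library(purrr)
library(dplyr)
library(tibble)

# Fallback row with the same columns get_coords() returns.
empty_row <- tibble(
  api_name = NA_character_,
  lat = NA_character_,
  lon = NA_character_
)

# geocode_all() is a hypothetical helper: it calls `geocode_one` for each
# address, waits `pause` seconds before each call, and substitutes
# `empty_row` whenever a call throws an error.
geocode_all <- function(addresses, geocode_one, pause = 1) {
  safe_geocode <- possibly(geocode_one, otherwise = empty_row)
  results <- map(addresses, function(address) {
    Sys.sleep(pause)
    safe_geocode(address)
  })
  bind_rows(results)
}

# Offline stand-in for get_coords(), used only to show the behaviour.
fake_geocode <- function(address) {
  if (address == "bad") stop("no match")
  tibble(api_name = address, lat = "48.2", lon = "16.4")
}

demo <- geocode_all(c("4+bahnsteggasse+vienna", "bad"), fake_geocode, pause = 0)
print(demo)
```

With the real function this would be called as geocode_all(df$formatted_address, get_coords). At one request per second, the 57,912 rows would take at least 16 hours.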
