How to extract longitude and latitude from a list in R

czq61nw1, posted 2023-07-31 in Other

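I need help extracting longitude and latitude from a list. I have a set of street addresses, and I use this service to get the coordinates of each one: https://geocoding.geo.census.gov/geocoder/geographies/onelineaddress. Here is my code: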

library(httr)  # provides GET(), status_code(), content()

fetch_geocodes <- function(address) {
  # Specify the API endpoint
  base_url <- "https://geocoding.geo.census.gov/geocoder/geographies/onelineaddress"
  
  # Specify the parameters to pass to the API
  params <- list(
    address = address,
    benchmark = "Public_AR_Current",  
    vintage = "Current_Current",
    format = "json"
  )
  
  # Send a GET request to the API
  response <- GET(url = base_url, query = params)
  
  # Check if the request was successful
  if (status_code(response) == 200) {
    # Parse the response to JSON
    data <- content(response, "parsed")
    
    # Print the entire JSON response
    print(data)
    
    # Extract the longitude and latitude
    longitude <- data$result$addressMatches$coordinates$x
    latitude <- data$result$addressMatches$coordinates$y
    
    return(c(longitude, latitude))
  } else {
    stop("Request failed with status ", status_code(response))
  }
}
addresses <- c("Riverside Dr, Apple Valley, CA, 92307",
               "11 Wall Street, New York, NY 10005")
geocodes <- lapply(addresses, fetch_geocodes)

Here is part of my output (the full output is very long):

$result
$result$input
$result$input$address
$result$input$address$address
[1] "Riverside Dr, Apple Valley, CA, 92307"

$result$input$vintage
$result$input$vintage$isDefault
[1] TRUE

$result$input$vintage$id
[1] "4"

$result$input$vintage$vintageName
[1] "Current_Current"

$result$input$vintage$vintageDescription
[1] "Current Vintage - Current Benchmark"

$result$input$benchmark
$result$input$benchmark$isDefault
[1] TRUE

$result$input$benchmark$benchmarkDescription
[1] "Public Address Ranges - Current Benchmark"

$result$input$benchmark$id
[1] "4"

$result$input$benchmark$benchmarkName
[1] "Public_AR_Current"


$result$addressMatches
list()

$result
$result$input
$result$input$address
$result$input$address$address
[1] "11 Wall Street, New York, NY 10005"

$result$addressMatches[[1]]$coordinates
$result$addressMatches[[1]]$coordinates$x
[1] -74.01073

$result$addressMatches[[1]]$coordinates$y
[1] 40.70714


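For the first address, Riverside Dr, Apple Valley, CA, 92307, the service returns no longitude or latitude, so I need to assign NA to the "longitude" and "latitude" columns. For the second address, $result$addressMatches[[1]]$coordinates holds the longitude and latitude. However, I don't know how to extract that information from geocodes, because it returns NULL.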

print(geocodes)
[[1]]
NULL

[[2]]
NULL


I don't know what to do. Thank you very much for your help. My goal is a data frame with three columns: full_address, longitude, and latitude.

Answer 1, by 5m1hhzi4

Up front: data$result$addressMatches is a list, and each of its elements may have coordinates, so you need something like data$result$addressMatches[[1]]$coordinates$x.
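To see why the original function returns NULL, here is a minimal sketch on a mock of the parsed response. The mock `data` object is an assumption that copies only the structure and values printed in the question, not a real API response. Because addressMatches is an unnamed list of matches, `$coordinates` finds no element with that name and yields NULL, and `c(NULL, NULL)` is also NULL.

```r
# Mock of the parsed response (structure and values taken from the question's output)
data <- list(result = list(addressMatches = list(
  list(coordinates = list(x = -74.01073, y = 40.70714))
)))

# The original extraction: `$coordinates` is applied to the unnamed list of
# matches, which has no element called "coordinates", so both pieces are NULL
wrong <- c(data$result$addressMatches$coordinates$x,
           data$result$addressMatches$coordinates$y)
is.null(wrong)

# Indexing into the first match reaches the coordinates
right <- unlist(data$result$addressMatches[[1]]$coordinates)
right
```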
If the response is guaranteed to contain exactly one x/y pair, you can do this:

unlist(data$result$addressMatches[[1]]$coordinates)
#         x         y 
# -74.01073  40.70714

However, if you can get two or more matches, you need to return a list or data.frame, which takes a bit more work:

L <- lapply(data$result$addressMatches, function(z) {
  if ("coordinates" %in% names(z)) unlist(z$coordinates) else c(x=NA_real_,y=NA_real_)
})
list(x=sapply(L, `[[`, 1), y=sapply(L, `[[`, 2))
# $x
# [1] -74.01073
# $y
# [1] 40.70714
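If a data.frame is preferred over a list, the same per-match vectors can be row-bound. This is only a sketch: the `matches` list below is a hypothetical stand-in for data$result$addressMatches, and its second element (a match without coordinates) is invented purely to exercise the NA branch.

```r
# Hypothetical stand-in for data$result$addressMatches
matches <- list(
  list(coordinates = list(x = -74.01073, y = 40.70714)),
  list(matchedAddress = "hypothetical match without coordinates")
)

# One named numeric vector (x, y) per match, NA where coordinates are missing
L <- lapply(matches, function(z) {
  if ("coordinates" %in% names(z)) unlist(z$coordinates) else c(x = NA_real_, y = NA_real_)
})

# Stack the vectors into a matrix, then convert to a data.frame with columns x and y
coords_df <- as.data.frame(do.call(rbind, L))
coords_df
```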


Going with the first assumption, then:

library(httr)  # provides GET(), status_code(), content()

fetch_geocodes <- function(address) {
  # Specify the API endpoint
  base_url <- "https://geocoding.geo.census.gov/geocoder/geographies/onelineaddress"
  
  # Specify the parameters to pass to the API
  params <- list(
    address = address,
    benchmark = "Public_AR_Current",  
    vintage = "Current_Current",
    format = "json"
  )
  
  # Send a GET request to the API
  response <- GET(url = base_url, query = params)
  
  # Check if the request was successful
  if (status_code(response) == 200) {
    # Parse the response to JSON
    data <- content(response, "parsed")
    
    ### Print the entire JSON response
    # print(data)
    
    # Extract the longitude and latitude
    if (length(data$result$addressMatches) > 0) {
      longitude <- data$result$addressMatches[[1]]$coordinates$x
      if (is.null(longitude)) longitude <- NA_real_
      latitude <- data$result$addressMatches[[1]]$coordinates$y
      if (is.null(latitude)) latitude <- NA_real_
    } else {
      longitude <- latitude <- NA_real_
    }
    
    return(c(longitude, latitude))
  } else {
    stop("Request failed with status ", status_code(response))
  }
}
lapply(addresses, fetch_geocodes)
# [[1]]
# [1] NA NA
# [[2]]
# [1] -74.01073  40.70714
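To get from that list to the three-column data frame the question asks for, one option is to row-bind the vectors and attach the addresses. A minimal sketch, assuming every element is a length-2 vector of c(longitude, latitude); the `geocodes` object here is hard-coded from the output above so the example runs without calling the API.

```r
addresses <- c("Riverside Dr, Apple Valley, CA, 92307",
               "11 Wall Street, New York, NY 10005")

# Stand-in for lapply(addresses, fetch_geocodes), using the values printed above
geocodes <- list(c(NA_real_, NA_real_), c(-74.01073, 40.70714))

# Each element is c(longitude, latitude), so rbind gives one row per address
coords <- do.call(rbind, geocodes)

result <- data.frame(
  full_address = addresses,
  longitude    = coords[, 1],
  latitude     = coords[, 2]
)
result
```

This relies on fetch_geocodes always returning two values, which the NA handling in the function above is there to ensure.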

Answer 2, by tpgth1q7

The tidygeocoder package is a great fit for this. It supports several geocoding services, including the US Census service you are using.

library(tidygeocoder)
addresses <- c("Riverside Dr, Apple Valley, CA, 92307",
               "11 Wall Street, New York, NY 10005")

adr_df <- data.frame(address = addresses)

By default, tidygeocoder uses the Nominatim geocoder from OSM. It finds coordinates for both of your example addresses.

adr_df |>
  geocode(address = address)
#> Passing 2 addresses to the Nominatim single address geocoder
#> Query completed in: 2.8 seconds
#> # A tibble: 2 × 3
#>   address                                 lat   long
#>   <chr>                                 <dbl>  <dbl>
#> 1 Riverside Dr, Apple Valley, CA, 92307  34.5 -117. 
#> 2 11 Wall Street, New York, NY 10005     40.7  -74.0


Trying the Census geocoder, we see that the first address yields no coordinates here either.

adr_df |>
  geocode(address = address,
          method = "census")
#> Passing 2 addresses to the US Census batch geocoder
#> Query completed in: 0.6 seconds
#> # A tibble: 2 × 3
#>   address                                 lat  long
#>   <chr>                                 <dbl> <dbl>
#> 1 Riverside Dr, Apple Valley, CA, 92307  NA    NA  
#> 2 11 Wall Street, New York, NY 10005     40.7 -74.0
