R: Web scraping Pro Football Reference

tktrz96b  posted on 2022-12-25 in Other

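I'm interested in web scraping Pro Football Reference. I need to set up a function that lets me scrape multiple pages. So far I have code that looks functional, but I keep getting an error: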

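library(XML)  # provides readHTMLTable()

# Scrape one table per year and stack the results into a single data frame
scrapeData = function(urlprefix, urlend, startyr, endyr) {
  master = data.frame()
  for (i in startyr:endyr) {
    cat('Loading Year', i, '\n')
    # Build the page URL, e.g. <urlprefix>2010<urlend>
    URL = paste(urlprefix, as.character(i), urlend, sep = "")
    # Read the first HTML table on the page
    table = readHTMLTable(URL, stringsAsFactors = FALSE)[[1]]
    table$Year = i
    master = rbind(table, master)
  }
  return(master)
}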
drafts = scrapeData('http://www.pro-football-reference.com/years/', '/draft.htm', 2010, 2010)

When I run it, I get:

Error: failed to load external entity "http://www.pro-football-reference.com/years/2010/draft.htm"

Any suggestions would be helpful. Thanks.

kh212irz  #1
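
The "failed to load external entity" error most likely arises because `readHTMLTable()` is handed an `http://` URL that the site redirects to `https://`, and the XML package's built-in downloader does not handle HTTPS. The sketch below keeps the question's loop-and-bind structure but downloads each page with httr first and parses the returned text. It is an illustration under that assumption, not a tested fix: `draft_url()` and `scrape_drafts()` are hypothetical helper names, and the httr and XML packages are assumed to be installed.

```r
# Hypothetical helper: build the draft-page URL for one season.
# Note the https scheme, unlike the http URL in the question.
draft_url <- function(year) {
  paste0("https://www.pro-football-reference.com/years/", year, "/draft.htm")
}

# Hypothetical helper: download each season over HTTPS, then parse the
# first table from the downloaded text instead of from the URL.
scrape_drafts <- function(startyr, endyr) {
  pages <- lapply(startyr:endyr, function(yr) {
    cat("Loading Year", yr, "\n")
    resp <- httr::GET(draft_url(yr))
    httr::stop_for_status(resp)  # stop early on HTTP errors
    html <- httr::content(resp, as = "text", encoding = "UTF-8")
    doc <- XML::htmlParse(html, asText = TRUE)
    tbl <- XML::readHTMLTable(doc, stringsAsFactors = FALSE)[[1]]
    tbl$Year <- yr
    tbl
  })
  do.call(rbind, pages)
}

# Example call (needs network access):
# drafts <- scrape_drafts(2010, 2010)
```

The rvest-based code that follows sidesteps the same problem by requesting the `https://` URL through `read_html()`.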

library(tidyverse)
library(rvest)

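# Scrape the draft table for a single season
get_football <- function(year) {
  str_c("https://www.pro-football-reference.com/years/",  # https, not http
        year,
        "/draft.htm") %>%
    read_html() %>%                 # download and parse the page
    html_table() %>%                # extract every <table> as a tibble
    pluck(1) %>%                    # keep the first table on the page
    janitor::row_to_names(1) %>%    # promote the first data row to column names
    janitor::clean_names() %>%      # snake_case, unique column names
    mutate(year = year)             # tag every row with its season
}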

map_dfr(2010:2015, get_football)

# A tibble: 1,564 × 30
   rnd   pick  tm    player pos   age   to    ap1   pb    st    w_av  dr_av g     cmp   att  
   <chr> <chr> <chr> <chr>  <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
 1 1     1     STL   Sam B… QB    22    2018  0     0     5     44    25    83    1855  2967 
 2 1     2     DET   Ndamu… DT    23    2022  3     5     12    100   59    196   0     0    
 3 1     3     TAM   Geral… DT    22    2021  1     6     10    69    65    140   0     0    
 4 1     4     WAS   Trent… T     22    2022  1     9     11    78    58    160   0     0    
 5 1     5     KAN   Eric … DB    21    2018  3     5     5     50    50    89    0     0    
 6 1     6     SEA   Russe… T     21    2020  0     2     9     56    31    131   0     0    
 7 1     7     CLE   Joe H… DB    21    2021  0     3     10    62    39    158   0     0    
 8 1     8     OAK   Rolan… LB    21    2015  0     0     5     25    15    65    0     0    
 9 1     9     BUF   C.J. … RB    23    2017  0     1     3     34    32    90    0     0    
10 1     10    JAX   Tyson… DT    23    2022  0     0     7     44    33    188   0     0    
# … with 1,554 more rows, and 15 more variables
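
Every column in the output above is `<chr>`, so numeric work needs a conversion step first. A minimal base-R sketch of that cleanup follows. `clean_drafts()` is a hypothetical helper, the demo data frame is made up for illustration, and the filter assumes (not verified here) that the scraped table repeats its header row inside the body, with the literal text "Rnd" in the `rnd` column.

```r
# Hypothetical helper: drop repeated header rows, then let R infer
# a sensible type for each column.
clean_drafts <- function(df) {
  # Assumption: repeated header rows carry "Rnd" in the rnd column
  df <- df[!(df$rnd %in% "Rnd"), , drop = FALSE]
  # Convert each all-character column to integer/double where possible
  df[] <- lapply(df, function(col) utils::type.convert(col, as.is = TRUE))
  df
}

# Made-up stand-in for a few scraped rows (all character, one header repeat)
demo <- data.frame(
  rnd    = c("1", "1", "Rnd", "2"),
  pick   = c("1", "2", "Pick", "33"),
  player = c("Player A", "Player B", "Player", "Player C"),
  stringsAsFactors = FALSE
)

cleaned <- clean_drafts(demo)
str(cleaned)
```

With the tidyverse already loaded, `filter(rnd != "Rnd")` followed by `readr::type_convert()` should give an equivalent result in pipe style.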
