R download.file returning 403 Forbidden error

Posted by 7cwmlq89 on 2023-01-03 in Other


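The code below used to work, but the website I'm trying to download files from has added a user verification step. I've tried a few things, including making the code sleep inside the loop, but nothing has worked so far. Any suggestions?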

library(tidyverse)
library(rvest)

page <-
  "https://burnsville.civicweb.net/filepro/documents/25657/" %>%
  read_html

df <- tibble(
  names1 = page %>%
    html_nodes(".document-link") %>%
    html_text2() %>%
    str_remove_all("\r") %>%
    str_squish(),
  links = page %>%
    html_nodes(".document-link") %>%
    html_attr("href") %>%
    paste0("https://burnsville.civicweb.net", .)
)

destfile<-("destination.pdf")

df %>% 
  map(~ download.file(df$links, destfile = paste0(df$names1, ".pdf")))

# Loop through the links and download the PDFs
for (i in df$links) {
  tryCatch({
    # Use the loop variable i; url on its own refers to base R's url() function
    download.file(i,
                  basename(i),
                  mode = "wb",
                  quiet = TRUE)
  }, error = function(e) {})
}

Thanks in advance!

Source: https://stackoverflow.com/questions/74976969/r-download-file-returning-403-forbidden-error

1 Answer


nnt7mjpx #1

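library(tidyverse)
library(rvest)

# Parse the page that lists the documents
page <-
  "https://burnsville.civicweb.net/filepro/documents/25657/" %>%
  read_html()

# One row per document: display name, absolute link, and file name
docs <- tibble(
  names = page %>%
    html_nodes(".document-link") %>%
    html_text2() %>%
    str_remove_all("\r") %>%
    str_squish(),
  links = page %>%
    html_nodes(".document-link") %>%
    html_attr("href") %>%
    paste0("https://burnsville.civicweb.net", .),
  # Everything after the final "/" in the link becomes the file name
  file = str_extract(links, "[^/]*$")
)

# Download each link to "<file>.pdf"; mode = "wb" writes the PDF as binary
map2(docs$links, docs$file, ~ download.file(url = .x,
                                            destfile = str_c(.y, ".pdf"),
                                            mode = "wb"))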
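
The file names in the code above come from `str_extract(links, "[^/]*$")`, which keeps whatever follows the last slash of each link. Here is a minimal sketch of that one step in isolation. The example.org links are made up for illustration and are not taken from the site.

```r
library(stringr)

# Hypothetical links, for illustration only (not real documents)
links <- c(
  "https://example.org/document/12345",
  "https://example.org/filepro/documents/678"
)

# "[^/]*$" matches the run of non-slash characters at the end of each string
file <- str_extract(links, "[^/]*$")

# Build destination file names the same way the download call does
destfile <- str_c(file, ".pdf")
print(destfile)
```

Using the last path segment rather than the link text avoids characters in document titles that may not be valid in file names.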

Answered 2023-01-03
