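I am trying to open many .dta files (tax_1, tax_2, tax_3, ..., tax_800), run a task on each file (the task itself does not affect how the loop works), and save the result as a .csv file named after the .dta file:
First, I open the file: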
library(readstata13)  # provides read.dta13()
setwd("")
list.tax <- read.dta13("tax_1.dta")
taxcode <- list.tax$ma_thue
Then, between opening and saving the file, I run some tasks like the following:
library(tidyverse)  # stringr, purrr, readr
library(rvest)      # read_html(), html_element(), html_elements(), html_text2()
library(janitor)    # make_clean_names()

url <- "https://example/example1"
link <- c(taxcode) %>% str_c(url, ., "/")
# scraper() is not shown here; its result has a `link` column
x <- map_dfr(link, scraper)
x.link <- x$link
info <- map(x.link,
  \(url) {
    html <- read_html(url)
    info <- html_element(html, ".company-info .description") %>% html_text2()
    # each cell in the 2-column table
    html_elements(html, ".mt-20 .responsive-table-cell") %>%
      html_text2() %>%
      # as a 2-column matrix, same shape and structure as the table
      matrix(ncol = 2, byrow = TRUE) %>%
      # add the info row to the matrix
      rbind(c("info", info))
  }) %>%
  # m[,1] - description column
  # m[,2] - value column
  # make safe column names from the 1st column, apply those to the values vector
  # map_dfr turns each such named vector into a tibble row and returns a single tibble
  map_dfr(\(m) set_names(m[,2], make_clean_names(m[,1])))
Finally, I save the result of this task under the name of the Stata file:
write_excel_csv(data.frame(info), "tax_1.csv")
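In a Stata do-file, I could loop over the sequence {1, 2, ..., 800} (forvalues i = 1/800 { ... }) and open the files with something like list.tax <- read.dta13("tax_`i'.dta"). I am not very familiar with R, so I would really appreciate any advice.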
1 Answer
h79rfbju #1
R works differently from Stata.
In R, that would be:
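R does not expand a loop macro inside a string literal the way Stata expands `i', so the file name has to be pasted together explicitly. A minimal sketch with base R's paste0() and sprintf(), reusing the tax_ prefix from the question (the variable names in_file, out_file and all_in are only illustrative):

```r
# Build the input and output names for one index
i <- 1
in_file  <- paste0("tax_", i, ".dta")   # "tax_1.dta"
out_file <- sprintf("tax_%d.csv", i)    # "tax_1.csv"

# paste0() is vectorised, so all 800 input names can be built in one call
all_in <- paste0("tax_", 1:800, ".dta")
head(all_in, 3)
```

Either function works here; sprintf() is handy if the numbers ever need zero padding, for example "%03d".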
For example:
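One way to arrange the whole job, as a rough sketch rather than a definitive solution: put everything that happens to a single file into a function and call it once per index. The helper name process_tax_file and its dir argument are hypothetical, the scraping step is replaced by a placeholder, and the readstata13 and readr packages are assumed to be installed:

```r
# Hypothetical helper: everything done for one file goes in here.
process_tax_file <- function(i, dir = ".") {
  in_file  <- file.path(dir, paste0("tax_", i, ".dta"))
  out_file <- file.path(dir, paste0("tax_", i, ".csv"))

  # Skip indices whose .dta file is missing instead of stopping the loop
  if (!file.exists(in_file)) {
    return(NA_character_)
  }

  list.tax <- readstata13::read.dta13(in_file)
  taxcode  <- list.tax$ma_thue

  # ... the scraping steps from the question go here and produce `info` ...
  info <- data.frame(ma_thue = taxcode)  # placeholder so the sketch runs

  readr::write_excel_csv(info, out_file)
  out_file
}

# Rough equivalent of Stata's forvalues i = 1/800
for (i in 1:800) {
  process_tax_file(i)
}
```

The for loop could also be written as lapply(1:800, process_tax_file). Keeping the per-file work in a function makes it easy to test on a single index, such as process_tax_file(1), before running all 800.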