Reading text as a dataframe into R

bpzcxfmw, posted on 2023-11-14 in Other


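In the DATA below, read.table() requires that the elements of the study column contain no whitespace. For example, if an element is "Hayati & Jalilifar", read.table() throws an error until the user removes the whitespace, as in "Hayati&Jalilifar".
Is there a way to make read.table() read the DATA below without removing the whitespace inside any of the data elements?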

DATA = read.table(header=TRUE, text = 
 "study               year  g     v_g    assign_type  n_class Nt    Nc
  Hayati & Jalilifar  2009  0.213 0.101  student      NA      20    20
  Hayati & Jalilifar  2009  0.785 0.108  student      NA      20    20
  Hale & Courtney     1994 -0.894 0.0154 class        4       286   286
  Hale & Courtney     1994  0.946 0.0156 class        4       286   286
  Hale & Courtney     1994 -0.237 0.0146 class        4       277   277
  Hale & Courtney     1994 -0.179 0.0146 class        4       277   277")
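To see why the call above fails, it can help to count the whitespace-separated fields on each line. The following is a minimal sketch using base R's count.fields() on a shortened copy of the data (the variable names txt, con and n_fields are only illustrative):

```r
# Shortened copy of the data: the header plus two rows.
txt <- "study               year  g     v_g    assign_type  n_class Nt    Nc
  Hayati & Jalilifar  2009  0.213 0.101  student      NA      20    20
  Hale & Courtney     1994 -0.894 0.0154 class        4       286   286"

# count.fields() splits on whitespace by default, just like read.table().
con <- textConnection(txt)
n_fields <- count.fields(con)
close(con)

# The header has 8 fields, but each data row has 10, because
# "Hayati", "&" and "Jalilifar" are counted as three separate fields.
print(n_fields)
```

The mismatch between the header and the data rows is what read.table() complains about, so any fix has to make each study name count as a single field.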


Source: https://stackoverflow.com/questions/77462272/reading-text-as-a-dataframe-into-r

3 Answers


jdzmm42g #1

Save the text in a variable, then wrap each "name & name" pair in single quotes so that read.table() treats it as one field:

txt <- "study               year  g     v_g    assign_type  n_class Nt    Nc
  Hayati & Jalilifar  2009  0.213 0.101  student      NA      20    20
  Hayati & Jalilifar  2009  0.785 0.108  student      NA      20    20
  Hale & Courtney     1994 -0.894 0.0154 class        4       286   286
  Hale & Courtney     1994  0.946 0.0156 class        4       286   286
  Hale & Courtney     1994 -0.237 0.0146 class        4       277   277
  Hale & Courtney     1994 -0.179 0.0146 class        4       277   277"

read.table(text=gsub("(\\S+\\s+[&]\\s+\\S+)", "'\\1'", txt), header = TRUE)
               study year      g    v_g assign_type n_class  Nt  Nc
1 Hayati & Jalilifar 2009  0.213 0.1010     student      NA  20  20
2 Hayati & Jalilifar 2009  0.785 0.1080     student      NA  20  20
3    Hale & Courtney 1994 -0.894 0.0154       class       4 286 286
4    Hale & Courtney 1994  0.946 0.0156       class       4 286 286
5    Hale & Courtney 1994 -0.237 0.0146       class       4 277 277
6    Hale & Courtney 1994 -0.179 0.0146       class       4 277 277
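To make the regex step in this answer easier to follow, here is a small sketch that applies the same pattern to a single line and then to a made-up name without an ampersand. The line "Smith et al." is hypothetical and not part of the original data; it is only there to show a limitation of the pattern.

```r
pattern <- "(\\S+\\s+[&]\\s+\\S+)"

# One row from the question's data.
line <- "  Hayati & Jalilifar  2009  0.213 0.101  student      NA      20    20"

# The pattern matches "word & word" and the replacement wraps it in single quotes.
quoted <- gsub(pattern, "'\\1'", line)
print(quoted)

# read.table() treats single quotes as quote characters by default,
# so the quoted name is read as one field and the row has 8 columns.
row <- read.table(text = quoted, header = FALSE)
print(ncol(row))

# Hypothetical name without "&": the pattern does not match, so nothing is quoted.
other <- "  Smith et al.  2001  0.100 0.050  student      NA      30    30"
unchanged <- gsub(pattern, "'\\1'", other)
print(identical(unchanged, other))
```

The approach therefore depends on every multi-word study name having the form "A & B"; names of another shape would need a different pattern.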

Answered 2023-11-14

polhcujo #2

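It looks like your data is in a fixed-width format. Based on your example data, here is an approach using readr::read_fwf that works almost perfectly, except that it splits the first column in two. It also needs a second step to get the column names: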

library(readr)
library(dplyr, warn = FALSE)

tmp <- tempfile()

writeLines(
  text = "study               year  g     v_g    assign_type  n_class Nt    Nc
  Hayati & Jalilifar  2009  0.213 0.101  student      NA      20    20
  Hayati & Jalilifar  2009  0.785 0.108  student      NA      20    20
  Hale & Courtney     1994 -0.894 0.0154 class        4       286   286
  Hale & Courtney     1994  0.946 0.0156 class        4       286   286
  Hale & Courtney     1994 -0.237 0.0146 class        4       277   277
  Hale & Courtney     1994 -0.179 0.0146 class        4       277   277",
  tmp
)

dat <- readr::read_fwf(
  file = tmp, skip = 1
) |>
  mutate(X1 = paste(X1, X2), .keep = "unused")

names(dat) <- readr::read_table(tmp, n_max = 0) |> names()

dat
#> # A tibble: 6 × 8
#>   study               year      g    v_g assign_type n_class    Nt    Nc
#>   <chr>              <dbl>  <dbl>  <dbl> <chr>         <dbl> <dbl> <dbl>
#> 1 Hayati & Jalilifar  2009  0.213 0.101  student          NA    20    20
#> 2 Hayati & Jalilifar  2009  0.785 0.108  student          NA    20    20
#> 3 Hale & Courtney     1994 -0.894 0.0154 class             4   286   286
#> 4 Hale & Courtney     1994  0.946 0.0156 class             4   286   286
#> 5 Hale & Courtney     1994 -0.237 0.0146 class             4   277   277
#> 6 Hale & Courtney     1994 -0.179 0.0146 class             4   277   277
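The first column ends up split in two because readr guesses fixed-width columns from character positions that are empty in every row. The base R sketch below is not readr's implementation; it only imitates that idea on the first 26 characters of two rows, to show where the extra split comes from. All variable names are illustrative.

```r
# First 26 characters of one row per study (name plus year).
rows <- c(
  "  Hayati & Jalilifar  2009",
  "  Hale & Courtney     1994"
)

# Split each row into single characters: a 2 x 26 character matrix.
chars <- do.call(rbind, strsplit(rows, ""))

# A position counts as a gap only if it is a space in every row.
blank <- colSums(chars != " ") == 0
gaps <- which(blank)
print(gaps)

# A field starts at a non-blank position that follows a blank one (or the line start).
starts <- which(!blank & c(TRUE, head(blank, -1)))
ends <- which(!blank & c(tail(blank, -1), TRUE))
print(starts)

# Position 9 is a space in both rows, so the name is cut into two pieces there.
first <- trimws(substr(rows, starts[1], ends[1]))
second <- trimws(substr(rows, starts[2], ends[2]))
print(paste(first, second))
```

Pasting the two pieces back together restores the full names, which is what the mutate() step in the answer does with X1 and X2.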


Answered 2023-11-14

rkue9o1l #3

In base R, I sometimes use:

txt <- "study               year  g     v_g    assign_type  n_class Nt    Nc
  Hayati & Jalilifar  2009  0.213 0.101  student      NA      20    20
  Hayati & Jalilifar  2009  0.785 0.108  student      NA      20    20
  Hale & Courtney     1994 -0.894 0.0154 class        4       286   286
  Hale & Courtney     1994  0.946 0.0156 class        4       286   286
  Hale & Courtney     1994 -0.237 0.0146 class        4       277   277
  Hale & Courtney     1994 -0.179 0.0146 class        4       277   277"

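# Read the data rows only; the three name tokens land in V1, V2 and V3.
data <- read.table(text = txt, header = FALSE, skip = 1L)
# Glue the tokens back into one study name and drop the extra columns.
data$V1 <- with(data, paste(V1, V2, V3))
data[, c("V2", "V3")] <- list(NULL)
# Take the column names from the header line.
colnames(data) <- unlist(read.table(text = txt, nrows = 1L), use.names = FALSE)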

This gives:

> head(data)
               study year      g    v_g assign_type n_class  Nt  Nc
1 Hayati & Jalilifar 2009  0.213 0.1010     student      NA  20  20
2 Hayati & Jalilifar 2009  0.785 0.1080     student      NA  20  20
3    Hale & Courtney 1994 -0.894 0.0154       class       4 286 286
4    Hale & Courtney 1994  0.946 0.0156       class       4 286 286
5    Hale & Courtney 1994 -0.237 0.0146       class       4 277 277
6    Hale & Courtney 1994 -0.179 0.0146       class       4 277 277
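The approach above assumes that every study name consists of exactly three tokens. As a rough sketch of a variation, one could instead count columns from the right: the header says how many columns follow study, so everything before those trailing fields can be pasted into the name. The helper parse_row and the variable names below are made up for illustration, and the data is a shortened copy.

```r
txt <- "study               year  g     v_g    assign_type  n_class Nt    Nc
  Hayati & Jalilifar  2009  0.213 0.101  student      NA      20    20
  Hale & Courtney     1994 -0.894 0.0154 class        4       286   286"

lines <- trimws(strsplit(txt, "\n")[[1]])
header <- strsplit(lines[1], "\\s+")[[1]]
n_tail <- length(header) - 1L  # number of columns after `study`

# Hypothetical helper: the last n_tail tokens are ordinary fields,
# and whatever comes before them is pasted into the study name.
parse_row <- function(line) {
  tokens <- strsplit(line, "\\s+")[[1]]
  n_name <- length(tokens) - n_tail
  c(paste(tokens[seq_len(n_name)], collapse = " "), tail(tokens, n_tail))
}

mat <- do.call(rbind, lapply(lines[-1], parse_row))
colnames(mat) <- header
dat <- as.data.frame(mat, stringsAsFactors = FALSE)

# Convert the character columns to numbers where possible; "NA" becomes NA.
dat[] <- lapply(dat, type.convert, as.is = TRUE)
print(dat)
```

This only works when none of the columns after study contain whitespace themselves, which holds for the example data.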


Answered 2023-11-14
