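R: How to calculate the average of the same column (with the same name) across 100s of different csv files, where part of the file name is the same?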

Posted by u7up0aaq on 2023-01-15 in Other

I have a bunch of csv files, each structured like this:

df <- data.frame (first_column  = c(3, 2, 6, 7),
                  second_column = c(7, 5, 1, 8))

All the csv files have names like:

"type1_1.csv"
"type1_2.csv"
...
"type2_1.csv"
"type2_2.csv"
...

Each csv has first_column and second_column. I want to create a new data frame like this:

# name        meanofsecond_column
# type1_1     5.25
# ...

What I have started doing is writing out each one separately:

type1_1 <- read_csv("type1_1.csv")
type1_1mean <- mean(type1_1$second_column)
...
df <- data.frame(name = c("type1_1", "type1_2", ...),
                 meanofsecondcolumn = c(type1_1mean, type1_2mean, ...))

However, with more than 100 csv files, this approach is neither efficient nor clean. How can I streamline it?

Source: https://stackoverflow.com/questions/75073324/how-to-calculate-the-average-of-the-same-column-with-same-name-in-100s-differe

1 Answer

flseospp1#

# read_csv() below comes from the readr package
library(readr)

# path where your csv files are (here: the current working directory)
CSV_FOLDER <- "."

# list all csv files in the given directory
# the second argument is a regex matching names that end with .csv
# full.names = TRUE makes list.files() return the file names with their path
csv_files <- list.files(CSV_FOLDER, "\\.csv$", full.names = TRUE)

# apply given function on each file and collect results in a list
res <- lapply(csv_files, function(csv_file) {
  # read current file
  tmp <- read_csv(csv_file)

  # build a data.frame from filename (without path) and mean of second column
  return(data.frame(
    name = basename(csv_file),
    meanofsecondcolumn = mean(tmp$second_column)
  ))
})

# rbind all single line data.frames in a single data.frame
res <- do.call("rbind", res)
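
For anyone who wants to try the idea without touching their own files, here is a minimal self-contained sketch of the same approach. It is a variation, not the code above: it assumes base R's read.csv() in place of readr's read_csv(), uses vapply() instead of lapply() plus rbind, and strips the .csv extension so the name column matches the "type1_1" form asked for in the question. The temporary folder demo_dir and the second sample file's values are made up for illustration.

```r
# Hypothetical demo folder with two small sample files (illustration only)
demo_dir <- file.path(tempdir(), "csv_mean_demo")
dir.create(demo_dir, showWarnings = FALSE)

write.csv(data.frame(first_column  = c(3, 2, 6, 7),
                     second_column = c(7, 5, 1, 8)),
          file.path(demo_dir, "type1_1.csv"), row.names = FALSE)
write.csv(data.frame(first_column  = c(1, 2),
                     second_column = c(2, 4)),
          file.path(demo_dir, "type2_1.csv"), row.names = FALSE)

# list every file in the folder whose name ends with .csv
csv_files <- list.files(demo_dir, pattern = "\\.csv$", full.names = TRUE)

# vapply() returns a plain numeric vector: one mean per file
means <- vapply(csv_files, function(csv_file) {
  mean(read.csv(csv_file)$second_column)
}, numeric(1), USE.NAMES = FALSE)

# file name without path and without the .csv extension, next to its mean
res <- data.frame(
  name = tools::file_path_sans_ext(basename(csv_files)),
  meanofsecondcolumn = means
)
print(res)
```

With the sample data from the question, the row for type1_1 should show 5.25. To run it on real files, point demo_dir at the folder that holds them and drop the two write.csv() calls.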

Answered on 2023-01-15
