R: Using the pipe operator %>% with replacement functions (such as colnames()<-)

shyt4zoc  posted on 2023-01-06 in Other

How can I use the pipe operator with a replacement function such as colnames()<-?
This is what I am trying to do:

library(dplyr)
averages_df <- 
   group_by(mtcars, cyl) %>%
   summarise(mean(disp), mean(hp))
colnames(averages_df) <- c("cyl", "disp_mean", "hp_mean")
averages_df

# Source: local data frame [3 x 3]
# 
#   cyl disp_mean   hp_mean
# 1   4  105.1364  82.63636
# 2   6  183.3143 122.28571
# 3   8  353.1000 209.21429

But ideally it would look like this:

averages_df <- 
  group_by(mtcars, cyl) %>%
  summarise(mean(disp), mean(hp)) %>%
  add_colnames(c("cyl", "disp_mean", "hp_mean"))

Is there a way to do this without writing a dedicated function every time?
The answer here is a start, but it is not quite my question: Chaining arithmetic operators in dplyr

vdzxcuhz1#

You can use `colnames<-` or setNames (thanks to @David Arenburg):

group_by(mtcars, cyl) %>%
  summarise(mean(disp), mean(hp)) %>%
  `colnames<-`(c("cyl", "disp_mean", "hp_mean"))
  # or
  # `names<-`(c("cyl", "disp_mean", "hp_mean"))
  # setNames(., c("cyl", "disp_mean", "hp_mean")) 

#   cyl disp_mean   hp_mean
# 1   4  105.1364  82.63636
# 2   6  183.3143 122.28571
# 3   8  353.1000 209.21429
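The reason this works is that a replacement function is an ordinary function whose name ends in <-, so it can be called directly with the object as its first argument. A minimal base-R sketch, assuming a small made-up data frame (df and new_names are illustrative, not from the question):

```r
# Small illustrative data frame standing in for the summarised result
df <- data.frame(a = 1:3, b = 4:6, c = 7:9)
new_names <- c("cyl", "disp_mean", "hp_mean")

# Usual replacement syntax, applied to a copy
df_assigned <- df
colnames(df_assigned) <- new_names

# The same operation written as a plain function call;
# it returns the modified object, which is what a pipe step needs
df_called <- `colnames<-`(df, new_names)

# Both routes should give the same result, and df itself is unchanged
same_result <- identical(df_assigned, df_called)
original_names <- colnames(df)
```

Because the call returns the renamed object instead of modifying df in place, it slots into a %>% chain like any other function.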

Or pick an alias from magrittr (set_colnames):

library(magrittr)
group_by(mtcars, cyl) %>%
  summarise(mean(disp), mean(hp)) %>%
  set_colnames(c("cyl", "disp_mean", "hp_mean"))

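If you are only (re)naming a few of many columns, dplyr::rename may be more convenient (it requires writing both the old and the new names; see @Richard Scriven's answer).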

mpgws1up2#

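In dplyr there are several different ways to rename columns.
One is the rename() function. In this example you need to wrap the names created by summarise() in backticks, because they are expressions.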

group_by(mtcars, cyl) %>%
    summarise(mean(disp), mean(hp)) %>%
    rename(disp_mean = `mean(disp)`, hp_mean = `mean(hp)`)
#   cyl disp_mean   hp_mean
# 1   4  105.1364  82.63636
# 2   6  183.3143 122.28571
# 3   8  353.1000 209.21429

You can also use select(), which is a little easier here because we can refer to columns by position and no longer need the backticks.

group_by(mtcars, cyl) %>%
    summarise(mean(disp), mean(hp)) %>%
    select(1, disp_mean = 2, hp_mean = 3)

But for this example, the best approach is the one @thelatemail mentioned in the comments: take a step back and name the columns in summarise().

group_by(mtcars, cyl) %>%
    summarise(disp_mean = mean(disp), hp_mean = mean(hp))

bvhaajcl3#

With dplyr, we can add a suffix to the summarised variables through the .funs argument of summarise_at, as in the following code.

library(dplyr)

# summarise_at with dplyr
mtcars %>% 
  group_by(cyl) %>%
  summarise_at(
    .vars = c("disp", "hp"),
    .funs = c(mean="mean")
  )
# A tibble: 3 × 3
# cyl disp_mean   hp_mean
# <dbl>     <dbl>     <dbl>
# 1     4  105.1364  82.63636
# 2     6  183.3143 122.28571
# 3     8  353.1000 209.21429

In addition, we can set the column names in several ways.

# set_names with magrittr
mtcars %>% 
  group_by(cyl) %>%
  summarise(mean(disp), mean(hp)) %>%
  magrittr::set_names(c("cyl", "disp_mean", "hp_mean"))

# set_names with purrr
mtcars %>% 
  group_by(cyl) %>%
  summarise(mean(disp), mean(hp)) %>%
  purrr::set_names(c("cyl", "disp_mean", "hp_mean"))

# setNames with stats
mtcars %>%
  group_by(cyl) %>%
  summarise(mean(disp), mean(hp)) %>%
  stats::setNames(c("cyl", "disp_mean", "hp_mean"))

# A tibble: 3 × 3
# cyl disp_mean   hp_mean
# <dbl>     <dbl>     <dbl>
# 1     4  105.1364  82.63636
# 2     6  183.3143 122.28571
# 3     8  353.1000 209.21429

zour9fqk4#

This also works:

set <- function(fun) {
  match.fun(paste0(deparse(substitute(fun)), "<-"))
}

library(dplyr, warn.conflicts = FALSE)
group_by(mtcars, cyl) %>%
  summarise(mean(disp), mean(hp)) %>%
  set(colnames)(c("cyl", "disp_mean", "hp_mean"))
#> # A tibble: 3 × 3
#>     cyl disp_mean hp_mean
#>   <dbl>     <dbl>   <dbl>
#> 1     4      105.    82.6
#> 2     6      183.   122. 
#> 3     8      353.   209.

Created on 2022-11-23 with reprex v2.0.2
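To see what the set() helper is doing, it may help to check what it returns outside a pipe. The sketch below repeats the helper from this answer and uses only base R; the data frame df is a made-up example:

```r
# Helper from the answer: turns a function name such as colnames
# into the matching replacement function `colnames<-`
set <- function(fun) {
  match.fun(paste0(deparse(substitute(fun)), "<-"))
}

# The helper should hand back the replacement functions themselves
is_colnames_setter <- identical(set(colnames), `colnames<-`)
is_names_setter <- identical(set(names), `names<-`)

# Calling the returned function directly with the object first
df <- data.frame(a = 1:2, b = 3:4)
renamed <- set(colnames)(df, c("x", "y"))
colnames(renamed)
```

In the piped version, %>% supplies the left-hand side as that first argument, which is why set(colnames)(c(...)) needs only the new names.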
