R - Refer to column names rather than column index when using lapply with a data frame

vlf7wbxs posted on 2023-03-27 in Other

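I am using lapply to take values from specific columns of a data frame and reverse-score them on a 1-5 scale (i.e., 1 becomes 5, 2 becomes 4). I have done this successfully by referring to column indices: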

df_vars[,c(104:183, 222:249, 271:290)] <- lapply(df_vars[,c(104:183, 222:249, 271:290)],
                                                 FUN = function(x) misty::item.reverse(x, min = 1, max = 5))

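I would like to do the same thing using column names instead. I can't do this by selecting all numeric columns, or all columns that range from 1 to 5, because not every 1-5 scale column needs to be reversed. I may also need to drop columns and then rerun this code, which is another reason to refer to column names.
Here is some example data: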

# create example data frame
df <- data.frame("A" = c(1, 3, 5),
                 "B" = c(1, 2, 3),
                 "C" = c(4, 2, 1),
                 "D" = c(3, 2, 5),
                 "E" = c(5, 5, 4),
                 "F" = c(1, 2, 1),
                 "G" = c(3, 4, 3),
                 "H" = c(4, 3, 2))

# for this example, only A B D F G H need to be inverted

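This is a small data frame, but my real one is much larger, with more than 100 columns to reverse, so assume the example data set is too large to handle one column at a time.
With the example data and the columns specified above reversed, the desired output would be the following data frame: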

# transformed data frame
df <- data.frame("A" = c(5, 3, 1),
                 "B" = c(5, 4, 3),
                 "C" = c(4, 2, 1),
                 "D" = c(3, 4, 1),
                 "E" = c(5, 5, 4),
                 "F" = c(5, 4, 5),
                 "G" = c(3, 2, 3),
                 "H" = c(2, 3, 4))

I tried using grep to get the column indices from the column names. Based on the example data, the code I tried is:

df[, colnames(select(df, "A":"B", "D", "F":"H"))] <- lapply(grep(colnames(select(df, c("A":"B", "D", "F":"H"))), df),
                                                            FUN = function(x) misty::item.reverse(x, min = 1, max = 5))

This doesn't work. Testing the grep call on its own gives the following:

> grep(colnames(select(df, c("A":"B", "D", "F":"H"))), df)
integer(0)
Warning message:
In grep(colnames(select(df, c("A":"B", "D", "F":"H"))), df) :
  argument 'pattern' has length > 1 and only the first element will be used
>

Any ideas? Thanks.

r

Source: https://stackoverflow.com/questions/73036060/r-refer-to-column-names-rather-than-column-index-when-using-lapply-with-data-f

2 Answers

fhg3lkii1#

A possible solution based on dplyr:

library(dplyr)

# select only the columns to reverse; C and E are left untouched
df %>% 
  mutate(across(c(A:B, D, F:H), ~ (5:1)[.x]))

#>   A B C D E F G H
#> 1 5 5 4 3 5 5 3 2
#> 2 3 4 2 4 5 4 2 3
#> 3 1 3 1 1 4 5 3 4
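Since the question asks specifically about column names, here is a minimal sketch of the same dplyr idea driven by a character vector of names through all_of(). The vector name cols_to_reverse and the result name df_reversed are illustrative assumptions, not part of the original answer, and the reversal is written as min + max - x rather than the (5:1)[.x] lookup, so it does not depend on the values being valid integer indices.

```r
library(dplyr)

# example data from the question
df <- data.frame(A = c(1, 3, 5), B = c(1, 2, 3), C = c(4, 2, 1), D = c(3, 2, 5),
                 E = c(5, 5, 4), F = c(1, 2, 1), G = c(3, 4, 3), H = c(4, 3, 2))

# hypothetical name: the columns that should be reverse-scored
cols_to_reverse <- c("A", "B", "D", "F", "G", "H")

# all_of() selects by name and errors if a listed column is missing
df_reversed <- df %>%
  mutate(across(all_of(cols_to_reverse), ~ 1 + 5 - .x))
```

Because all_of() fails loudly on a missing name, rerunning this after dropping columns should surface a stale entry in cols_to_reverse instead of silently shifting which columns get reversed, which is the risk with numeric indices.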


e3bfsja22#

You can use sapply() as follows. The catch in this case is that you can't easily specify a range of columns by name.

cols <- c("A", "B", "D", "F", "G", "H")

df[,cols] <- sapply(df[,cols], \(x) (5:1)[x])

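The easiest way to select by column ranges is to use eval_select(), which returns their positions as numbers. But if you do that, you may as well use the dplyr solution. This is essentially a look under the hood.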

library(tidyselect)

# expr() is defined in rlang, so qualify it when only tidyselect is attached
col_pos <- eval_select(rlang::expr(c(A:B, D, F:H)), df)

df[,col_pos] <- sapply(df[,col_pos], \(x) (5:1)[x])
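For completeness, name ranges can also be approximated in base R alone. The grep() attempt in the question fails because grep() uses a single pattern and was searching the contents of df rather than names(df); looking names up with match() avoids both problems. The sketch below assumes a small helper, name_range(), which is hypothetical and not part of any package, and it uses lapply() with min + max - x for the reversal.

```r
# example data from the question
df <- data.frame(A = c(1, 3, 5), B = c(1, 2, 3), C = c(4, 2, 1), D = c(3, 2, 5),
                 E = c(5, 5, 4), F = c(1, 2, 1), G = c(3, 4, 3), H = c(4, 3, 2))

# hypothetical helper: expand a from/to pair of column names
# into every column name between them (inclusive)
name_range <- function(data, from, to) {
  pos <- match(c(from, to), names(data))
  stopifnot(!anyNA(pos))  # fail early if either name is missing
  names(data)[pos[1]:pos[2]]
}

# build the selection by name: A to B, D, and F to H
cols <- c(name_range(df, "A", "B"), "D", name_range(df, "F", "H"))

# lapply() returns a list of columns, which assigns back cleanly
df[cols] <- lapply(df[cols], function(x) 1 + 5 - x)
```

The same pattern should work with misty::item.reverse(x, min = 1, max = 5) in place of the anonymous function, as in the original index-based code.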
