R: computing a row-wise weighted sum over a set of columns

Posted by wnrlj8wa on 2023-04-09 in Other

Say I have the following data frame:

> library(tidyverse)
> dd <- tibble(a = rep(1,10), b = rep(1,10), c = rep(1,10))
> dd
# A tibble: 10 × 3
       a     b     c
   <dbl> <dbl> <dbl>
 1     1     1     1
 2     1     1     1
 3     1     1     1
 4     1     1     1
 5     1     1     1
 6     1     1     1
 7     1     1     1
 8     1     1     1
 9     1     1     1
10     1     1     1

and a weight vector:

> weight <- c(1, 5, 10)
> weight
[1]  1  5 10

When I want the row-wise weighted sum across all columns of the data frame, I do this:

> dd %>% mutate(m = rowSums(map2_dfc(dd, weight,`*`)))
# A tibble: 10 × 4
       a     b     c     m
   <dbl> <dbl> <dbl> <dbl>
 1     1     1     1    16
 2     1     1     1    16
 3     1     1     1    16
 4     1     1     1    16
 5     1     1     1    16
 6     1     1     1    16
 7     1     1     1    16
 8     1     1     1    16
 9     1     1     1    16
10     1     1     1    16

But I don't know how to compute the row-wise weighted sum over a subset of the columns. I tried the code below, but it gives a confusing result:

> dd %>% rowwise() %>% mutate(m = rowwise(map2_dfc(c_across(b:c), weight[2:3],`*`)))
New names:
• `` -> `...1`
• `` -> `...2`
New names:
• `` -> `...1`
• `` -> `...2`
New names:
• `` -> `...1`
• `` -> `...2`
New names:
• `` -> `...1`
• `` -> `...2`
New names:
• `` -> `...1`
• `` -> `...2`
New names:
• `` -> `...1`
• `` -> `...2`
New names:
• `` -> `...1`
• `` -> `...2`
New names:
• `` -> `...1`
• `` -> `...2`
New names:
• `` -> `...1`
• `` -> `...2`
New names:
• `` -> `...1`
• `` -> `...2`
# A tibble: 10 × 4
# Rowwise: 
       a     b     c m$...1 $...2
   <dbl> <dbl> <dbl>  <dbl> <dbl>
 1     1     1     1      5    10
 2     1     1     1      5    10
 3     1     1     1      5    10
 4     1     1     1      5    10
 5     1     1     1      5    10
 6     1     1     1      5    10
 7     1     1     1      5    10
 8     1     1     1      5    10
 9     1     1     1      5    10
10     1     1     1      5    10

Can anyone give me a hint on how to handle this?

mnemlml8 (answer #1)

This is matrix multiplication. Your original computation is equivalent to as.matrix(dd) %*% weight. For a subset of columns inside mutate, you can do:

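# columns b:c pair with weight[2:3]; as.vector() drops the one-column matrix that %*% returns
dd %>% mutate(m = as.vector((across(b:c) %>% as.matrix()) %*% weight[2:3]))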
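
Because every value in the question's data is 1, a mismatch between columns and weights is easy to miss. The following is a minimal base R sketch, using made-up data with distinct values per column (the matrix `X` is an assumption for illustration, not part of the question), that checks the matrix product against the weighted sum written out by hand:

```r
# Illustrative data (not from the question): distinct values in each column
X <- cbind(a = 1:3, b = 4:6, c = 7:9)
weight <- c(1, 5, 10)

# All columns: matrix product vs. the weighted sum written out term by term
all_matmul <- as.vector(X %*% weight)
all_manual <- X[, "a"] * 1 + X[, "b"] * 5 + X[, "c"] * 10

# Subset b and c: the matching weights are the 2nd and 3rd, i.e. weight[2:3]
sub_matmul <- as.vector(X[, c("b", "c")] %*% weight[2:3])
sub_manual <- X[, "b"] * 5 + X[, "c"] * 10

print(all_matmul)  # 91 107 123
print(sub_matmul)  # 90 105 120
```

With data like this, pairing columns b and c with the wrong slice of `weight` would show up immediately as a different result.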

p8ekf7hl (answer #2)

With a tidyverse approach, we can turn 'weight' into a named vector, loop across columns 'b' to 'c', subset the 'weight' values by column name (cur_column()), multiply, and take the rowSums:

library(dplyr)
names(weight) <- names(dd)
dd %>% 
   mutate(m = rowSums(across(b:c,  ~ .x * weight[cur_column()])))
Output:
# A tibble: 10 × 4
       a     b     c     m
   <dbl> <dbl> <dbl> <dbl>
 1     1     1     1    15
 2     1     1     1    15
 3     1     1     1    15
 4     1     1     1    15
 5     1     1     1    15
 6     1     1     1    15
 7     1     1     1    15
 8     1     1     1    15
 9     1     1     1    15
10     1     1     1    15
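
One possible advantage of looking weights up by name is that the selection does not have to be a contiguous range. The sketch below assumes dplyr is installed and uses illustrative data (`dd2` and `w` are made-up names, not from the question) to show the same pattern for columns b:c and for the non-adjacent pair a and c:

```r
library(dplyr)

# Illustrative data and weights; the names of w match the column names
dd2 <- tibble(a = 1:3, b = 4:6, c = 7:9)
w <- c(a = 1, b = 5, c = 10)

res <- dd2 %>%
  mutate(
    # b*5 + c*10, each weight looked up by the current column's name
    m_bc = rowSums(across(b:c, ~ .x * w[[cur_column()]])),
    # a*1 + c*10: a positional slice such as w[2:3] would be wrong here
    m_ac = rowSums(across(c(a, c), ~ .x * w[[cur_column()]]))
  )

print(res)
```

Here `m_bc` should come out as 90, 105, 120 and `m_ac` as 71, 82, 93.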

Or, if we want to use rowwise (not recommended, since it is slower):

dd %>% 
  rowwise %>%
  mutate(m = sum(c_across(b:c) * weight[2:3])) %>%
  ungroup

Or use crossprod:

dd %>%
   mutate(m = crossprod(t(pick(b:c)), weight[2:3])[,1])

Or in base R:

dd$m <-  rowSums(dd[2:3] * weight[2:3][col(dd[2:3])])
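
The `col()` indexing in the base R line is compact but not obvious. As a rough sketch on illustrative data (`df` is made up, not from the question), the steps can be unpacked as follows, with `sweep()` shown as an alternative way to scale each column by its weight:

```r
# Illustrative data (not from the question)
df <- data.frame(a = 1:3, b = 4:6, c = 7:9)
weight <- c(1, 5, 10)

sub <- df[2:3]                 # columns b and c
idx <- col(sub)                # 3 x 2 matrix holding each cell's column index
w_cells <- weight[2:3][idx]    # one weight per cell, column by column

# Multiply cell-wise, then sum across each row
m_col <- rowSums(sub * w_cells)

# Alternative: sweep() multiplies every column by its own weight
m_sweep <- rowSums(sweep(as.matrix(sub), 2, weight[2:3], `*`))

print(w_cells)         # 5 5 5 10 10 10
print(unname(m_col))   # 90 105 120
```

Both `m_col` and `m_sweep` hold the same values; `sweep()` may read more clearly when the weights line up with the columns one to one.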
