R data.table向量化列乘法

x6492ojm  于 2022-12-06  发布在  其他
关注(0)|答案(3)|浏览(156)

假设我有一个这样的data.表(假设它有许多列,如“a1,...,a100,...”和类似的“b1,...,b100,...”)

dt <- data.table(id = 1:5, a1 = runif(5), a2 = runif(5), b1 = runif(5), b2 = runif(5))

因此输出如下所示:

id         a1         a2         b1         b2
1:  1 0.94431156 0.34668771 0.54899478 0.91512664
2:  2 0.32730005 0.87924651 0.88777763 0.90167832
3:  3 0.07438915 0.53728539 0.21463741 0.11291512
4:  4 0.23025893 0.08528074 0.68454936 0.45441690
5:  5 0.86105462 0.49976703 0.07362091 0.08834252

我想创建新列c1, c2,这样我就可以有效地

dt[, c('c1', 'c2') := .(a1*b1, a2*b2)]

带输出

id         a1         a2         b1         b2         c1         c2
1:  1 0.94431156 0.34668771 0.54899478 0.91512664 0.51842212 0.31726316
2:  2 0.32730005 0.87924651 0.88777763 0.90167832 0.29056966 0.79279752
3:  3 0.07438915 0.53728539 0.21463741 0.11291512 0.01596669 0.06066765
4:  4 0.23025893 0.08528074 0.68454936 0.45441690 0.15762361 0.03875301
5:  5 0.86105462 0.49976703 0.07362091 0.08834252 0.06339162 0.04415068

如何在不使用慢速循环的情况下实现这一点?

hmmo2u0o

hmmo2u0o1#

因为矩阵乘法在R中是元素级的,所以可以执行以下操作:

dt <- data.frame(id = 1:5, a1 = runif(5), a2 = runif(5), b1 = runif(5), b2 = runif(5))
a = dt[,2:3]
b = dt[,4:5]
c = a * b
names(c) = c("c1", "c2")
cbind(dt, c)

输出量

id          a1        a2        b1        b2          c1         c2
1  1 0.082863389 0.7108292 0.8952547 0.4530363 0.074183837 0.32203140
2  2 0.125423227 0.8957771 0.2231827 0.1042432 0.027992292 0.09337865
3  3 0.278592590 0.9317453 0.7910442 0.3729406 0.220379066 0.34748565
4  4 0.004518196 0.3890797 0.5323291 0.7997701 0.002405167 0.31117430
5  5 0.784290484 0.5499781 0.7429104 0.8106772 0.582657582 0.44585471
wr98u20j

wr98u20j2#

考虑到@GoldenGateBridge的回答,假设你不知道你的table可以有多大,你可以用下面的形式概括:

set.seed(10)
dt <- data.frame(id = 1:5, 
                 a1 = runif(5),
                 a2 = runif(5),
                 a3 = runif(5),
                 a4 = runif(5),
                 a5 = runif(5),
                 # ....
                 b1 = runif(5), 
                 b2 = runif(5),
                 b3 = runif(5),
                 b4 = runif(5),
                 b5 = runif(5)
                 # ....
)

n <- 1:((ncol(x)-1)/2)
ind_a <- paste0("a", n)
ind_b <- paste0("b", n)
ind_c <- paste0("c", n)

producto <- dt[ind_a]*dt[ind_b]
names(producto) <- ind_c
dt <- cbind(dt, producto)

我希望它有用。

zhte4eai

zhte4eai3#

我们可以尝试split.defult按列名拆分dt,然后依次执行元素乘法

dt[, c(
  dt,
  lapply(
    split.default(.SD, paste0("c", gsub("\\D+", "", names(.SD)))),
    function(v) do.call(`*`, v)
  )
),
.SDcols = patterns("\\d$")
]

相关问题