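I created a function that edits the values of selected columns in a data.table `a`. The columns are chosen based on the entries of a second data.table, `b`: the values in `b$names` are matched against the column names of `a`, and each match selects one column of `a`.
The matched columns are then updated with data.table's `paste0(cols, "") := lapply(.SD, compute, x2 = b[i, vals])` feature.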
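As a minimal sketch of that match-then-update pattern on a toy table: the table `dt`, its values, and the divisor 50 are made up for illustration, and base `grep()` stands in for `stringr::str_which()`.

```r
library(data.table)

# Toy table; names and values are illustrative only
dt <- data.table(
  id       = 1:3,
  mean_one = c(5, 10, 15),
  sd_one   = c(1, 2, 3),
  mean_two = c(4, 8, 12)
)

# Names of all columns that contain the literal string "one"
cols <- grep("one", names(dt), value = TRUE, fixed = TRUE)

# Overwrite exactly those columns, in place, with x / 50 * 100
dt[, (cols) := lapply(.SD, function(x) x / 50 * 100), .SDcols = cols]
```

Wrapping the character vector in parentheses, `(cols) :=`, is the usual way to tell data.table that `cols` holds column names; `paste0(cols, "")` achieves the same thing.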
Here is the code I used:
library(stringr)
library(data.table)
a <- data.table(A = seq(1, 10), mean_one = rnorm(10, 4, 5), mean_two = rnorm(10, 0, 4), mean_three = rnorm(10, 7, 1), sd_one = rnorm(10, 2, 1), sd_two = rnorm(10, 1,0.5), sd_three = rnorm(10, 5, 1))
b <- data.table(names = c("one", "two", "three"), vals = c(50, 40, 30))
# Display a before applying the convert function
a
convert <- function(data1, data2){
  # Internal copy of input data1 (added after first noticing the problem, but it changes nothing)
  output <- data1
  # Define internal function to apply to columns
  compute <- function(x1, x2){
    x1 / x2 * 100
  }
  # Loop through all lines in data2 and apply compute to the matching columns
  for(i in seq_len(nrow(data2))){
    # Identify columns whose names match the current entry of data2$names
    col_nums <- str_which(colnames(output), pattern = data2[i, names])
    cols <- colnames(output[, ..col_nums])
    # Update the matched columns by applying compute to them
    output <- output[, paste0(cols, "") := lapply(.SD, compute, x2 = data2[i, vals]), .SDcols = cols]
  }
  output
}
c <- convert(a, b)
# Display a after applying the convert function
a
Applying the function to create a new object `c` works fine, but somehow `a` is modified as well. `a` before applying the function:
A mean_one mean_two mean_three sd_one sd_two sd_three
1: 1 5.401716 2.3178453 7.861245 2.242470 1.38388160 5.973074
2: 2 6.297607 -5.3536197 8.129933 3.059719 0.82876696 3.888044
3: 3 -7.335733 2.7596386 6.925848 3.052197 0.83010216 4.467223
4: 4 13.102340 5.2097568 6.833323 1.695432 1.11115086 4.925649
5: 5 4.027482 -3.3134371 9.007320 2.478962 1.09693094 3.951823
6: 6 9.870239 2.7126299 6.162253 1.223240 1.55348307 5.051808
7: 7 7.820498 -0.8919731 4.783220 1.340520 1.47370370 3.268912
8: 8 -2.641131 -9.3359900 6.538014 1.947230 0.73272735 4.299064
9: 9 -1.128654 -2.1083705 8.170297 1.286314 0.07326959 5.110170
10: 10 -2.255471 -2.5927778 7.062032 3.714906 0.40010323 6.131127
`a` after applying the function (identical to `c`, which is the desired output):
A mean_one mean_two mean_three sd_one sd_two sd_three
1: 1 10.803432 5.794613 26.20415 4.484940 3.459704 19.91025
2: 2 12.595214 -13.384049 27.09978 6.119439 2.071917 12.96015
3: 3 -14.671467 6.899097 23.08616 6.104394 2.075255 14.89074
4: 4 26.204680 13.024392 22.77774 3.390865 2.777877 16.41883
5: 5 8.054964 -8.283593 30.02440 4.957924 2.742327 13.17274
6: 6 19.740477 6.781575 20.54084 2.446479 3.883708 16.83936
7: 7 15.640997 -2.229933 15.94407 2.681040 3.684259 10.89637
8: 8 -5.282262 -23.339975 21.79338 3.894460 1.831818 14.33021
9: 9 -2.257308 -5.270926 27.23432 2.572629 0.183174 17.03390
10: 10 -4.510942 -6.481945 23.54011 7.429812 1.000258 20.43709
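This would not matter if I never went back to `a`, but the function is part of a Shiny app, and reading `a` again after it has been modified would break the app.
Am I missing something? Or is there something I don't understand about how these columns are updated? My guess is that the problem is related to the way data.table works.
Any help is appreciated. Thanks!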
1 Answer
Thanks to Jamie for pointing out that `copy` can be used to prevent `a` from being edited. That solved the problem.
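A minimal sketch of what the fix could look like, assuming the only change needed is replacing the plain assignment with `copy()`. The small tables `a` and `b` below are made-up stand-ins for the ones in the question, base `grep()` is used in place of `stringr::str_which()` to keep the example dependency-free, and the local names `pattern` and `divisor` are illustrative.

```r
library(data.table)

convert <- function(data1, data2) {
  # Deep copy, so that := below cannot reach the caller's table
  output <- copy(data1)
  for (i in seq_len(nrow(data2))) {
    pattern <- data2$names[i]   # substring to look for in the column names
    divisor <- data2$vals[i]    # value the matched columns are divided by
    cols <- grep(pattern, colnames(output), value = TRUE, fixed = TRUE)
    if (length(cols) > 0) {
      # Update the matched columns of the copy by reference
      output[, (cols) := lapply(.SD, function(x) x / divisor * 100), .SDcols = cols]
    }
  }
  output
}

# Small deterministic stand-ins for the tables in the question
a <- data.table(A = 1:2, mean_one = c(5, 10), sd_two = c(4, 8))
b <- data.table(names = c("one", "two"), vals = c(50, 40))

res <- convert(a, b)

# a keeps its original values; only res holds the rescaled columns
a
res
```

Because `output` is now a separate object, reassigning the result of `:=` back to `output` inside the loop is no longer needed; `:=` already updates the copy in place.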
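I am still unclear on why anything is done to `a` in the global environment. I have never run into this problem when using data.table's `:=` feature inside a function.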
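For what it is worth, the behaviour is consistent with data.table's reference semantics: a plain assignment such as `output <- data1` does not copy a data.table, it only gives the same object a second name, and `:=` then modifies that object in place. The sketch below illustrates this on a toy table; the names `dt`, `alias`, and `indep` are made up for the example.

```r
library(data.table)

dt <- data.table(x = 1:3)

alias <- dt        # plain assignment: no copy, just a second name
indep <- copy(dt)  # explicit deep copy: a separate object

# address() reports where each object lives in memory
same_as_alias <- address(dt) == address(alias)
same_as_indep <- address(dt) == address(indep)

# Updating through the alias by reference also changes dt, but not indep
alias[, x := x * 10L]

dt$x
indep$x
```

One possible reason this never showed up in earlier functions (an assumption, since those functions are not shown) is that they subset or aggregated the input first, for example `data1[rows]` or `data1[, .(col)]`, which returns a new table, so a later `:=` only touched that new object.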