data.table function using the paste0(cols) := lapply(...) feature rewrites its input?

Posted by fdx2calv on 2023-06-19 in Other

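I wrote a function that edits the values of selected columns in a data.table a. The columns are selected based on the entries of a second data.table b.
The selection works by matching the values in b$names against the column names of a; each entry can match one or more columns of a.
The matched columns are then updated using data.table's paste0(cols, "") := lapply(.SD, compute, x2 = b[i, vals]) feature.
Here is the code I used: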

library(stringr)
library(data.table)

a <- data.table(A = seq(1, 10), mean_one = rnorm(10, 4, 5), mean_two = rnorm(10, 0, 4), mean_three = rnorm(10, 7, 1), sd_one = rnorm(10, 2, 1), sd_two = rnorm(10, 1,0.5), sd_three = rnorm(10, 5, 1))
b <- data.table(names = c("one", "two", "three"), vals = c(50, 40, 30))

# Display a before applying convert function
a

convert <- function(data1, data2){
  # Internal copy of input data1 (introduced this line after first noticing problem but changes nothing)
  output <- data1
  
  # Define internal function to apply to columns
  compute <- function(x1,x2){
    x1/x2*100
  }
  
  # Loop through all lines in data2 and apply compute to the matching columns
  for(i in 1:nrow(data2)){
    # Identify columns whose names match the current entry of data2$names
    col_nums <- str_which(colnames(output), pattern = data2[i,names])
    cols <- colnames(output[,..col_nums])
    
    # Update the matched columns by applying compute to them
    output <- output[, paste0(cols, "") := lapply(.SD, compute, x2 = data2[i,vals]), .SDcols = cols]
  }
  output
}

c <- convert(a,b)

a

Applying the function to create a new object c works fine, but somehow a is modified as well. a before applying the function:

     A  mean_one   mean_two mean_three   sd_one     sd_two sd_three
 1:  1  5.401716  2.3178453   7.861245 2.242470 1.38388160 5.973074
 2:  2  6.297607 -5.3536197   8.129933 3.059719 0.82876696 3.888044
 3:  3 -7.335733  2.7596386   6.925848 3.052197 0.83010216 4.467223
 4:  4 13.102340  5.2097568   6.833323 1.695432 1.11115086 4.925649
 5:  5  4.027482 -3.3134371   9.007320 2.478962 1.09693094 3.951823
 6:  6  9.870239  2.7126299   6.162253 1.223240 1.55348307 5.051808
 7:  7  7.820498 -0.8919731   4.783220 1.340520 1.47370370 3.268912
 8:  8 -2.641131 -9.3359900   6.538014 1.947230 0.73272735 4.299064
 9:  9 -1.128654 -2.1083705   8.170297 1.286314 0.07326959 5.110170
10: 10 -2.255471 -2.5927778   7.062032 3.714906 0.40010323 6.131127

a after applying the function (identical to c, which is the desired output):

     A   mean_one   mean_two mean_three   sd_one   sd_two sd_three
 1:  1  10.803432   5.794613   26.20415 4.484940 3.459704 19.91025
 2:  2  12.595214 -13.384049   27.09978 6.119439 2.071917 12.96015
 3:  3 -14.671467   6.899097   23.08616 6.104394 2.075255 14.89074
 4:  4  26.204680  13.024392   22.77774 3.390865 2.777877 16.41883
 5:  5   8.054964  -8.283593   30.02440 4.957924 2.742327 13.17274
 6:  6  19.740477   6.781575   20.54084 2.446479 3.883708 16.83936
 7:  7  15.640997  -2.229933   15.94407 2.681040 3.684259 10.89637
 8:  8  -5.282262 -23.339975   21.79338 3.894460 1.831818 14.33021
 9:  9  -2.257308  -5.270926   27.23432 2.572629 0.183174 17.03390
10: 10  -4.510942  -6.481945   23.54011 7.429812 1.000258 20.43709

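This would not be a problem if I never went back to a, but the function is part of a Shiny app, and getting the modified a back would break it.
Am I missing something? Or is there something I don't understand about how these columns are updated? My guess is that the problem is related to the way data.table works.
Any help is appreciated. Thanks!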
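
The guess about data.table can be checked in isolation. The following is a minimal sketch, not the app's code: the tables dt and dt2 are hypothetical stand-ins for a, and it relies on two documented data.table behaviours, namely that := updates a table by reference and that plain assignment with <- does not copy a data.table, whereas copy() does.

```r
library(data.table)

# Hypothetical toy table standing in for `a`
dt <- data.table(id = 1:3, mean_one = c(10, 20, 30))

# Plain assignment does not copy a data.table:
# `alias` and `dt` are two names for the same object
alias <- dt
alias[, mean_one := mean_one / 50 * 100]
print(dt$mean_one)   # the original has been rescaled as well

# copy() makes a deep copy, so := on the copy leaves the original untouched
dt2 <- data.table(id = 1:3, mean_one = c(10, 20, 30))
safe <- copy(dt2)
safe[, mean_one := mean_one / 50 * 100]
print(dt2$mean_one)  # still the original values
print(safe$mean_one) # only the copy has been rescaled
```

If the same thing is happening in convert, then writing output <- copy(data1) instead of output <- data1 should leave a unchanged while c still holds the converted values.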

r

Source: https://stackoverflow.com/questions/76485347/data-table-function-using-the-paste0-cols-and-lapply-feature-rewrites-input