R复制数据框行中的数据,但将一列值减去1

gxwragnw  于 2023-03-27  发布在  其他
关注(0)|答案(5)|浏览(132)

我正在尝试为预测分析创建假设数据。我有一组平均值,打算将预测模型应用于这些平均值。但是,我想使用Agent_Count列中的各种值扩展DF。是否有办法将第2-4列复制到新行,同时将这些新行中第1列的值更改为新的假设值。我想这样做20-30次。
这是我的数据。

Group             Agent_Count HOUR Answered_Calls Aban Total_Calls
Clinical Support    11.75      9    52.69         2.77  56.65
PW Reset            12.06      9    53.79         22.27 81.98
Technical Support   21.15      9    81.02         2.22  84.20
4szc88ey

4szc88ey1#

您可以使用以下命令作为一行程序来完成此操作

do.call(rbind, lapply(0:9, \(x) within(df, Agent_Count <- x + Agent_Count)))
#>                Group Agent_Count HOUR Answered_Calls  Aban Total_Calls
#> 1   Clinical Support       11.75    9          52.69  2.77       56.65
#> 2           PW Reset       12.06    9          53.79 22.27       81.98
#> 3  Technical Support       21.15    9          81.02  2.22       84.20
#> 11  Clinical Support       12.75    9          52.69  2.77       56.65
#> 21          PW Reset       13.06    9          53.79 22.27       81.98
#> 31 Technical Support       22.15    9          81.02  2.22       84.20
#> 12  Clinical Support       13.75    9          52.69  2.77       56.65
#> 22          PW Reset       14.06    9          53.79 22.27       81.98
#> 32 Technical Support       23.15    9          81.02  2.22       84.20
#> 13  Clinical Support       14.75    9          52.69  2.77       56.65
#> 23          PW Reset       15.06    9          53.79 22.27       81.98
#> 33 Technical Support       24.15    9          81.02  2.22       84.20
#> 14  Clinical Support       15.75    9          52.69  2.77       56.65
#> 24          PW Reset       16.06    9          53.79 22.27       81.98
#> 34 Technical Support       25.15    9          81.02  2.22       84.20
#> 15  Clinical Support       16.75    9          52.69  2.77       56.65
#> 25          PW Reset       17.06    9          53.79 22.27       81.98
#> 35 Technical Support       26.15    9          81.02  2.22       84.20
#> 16  Clinical Support       17.75    9          52.69  2.77       56.65
#> 26          PW Reset       18.06    9          53.79 22.27       81.98
#> 36 Technical Support       27.15    9          81.02  2.22       84.20
#> 17  Clinical Support       18.75    9          52.69  2.77       56.65
#> 27          PW Reset       19.06    9          53.79 22.27       81.98
#> 37 Technical Support       28.15    9          81.02  2.22       84.20
#> 18  Clinical Support       19.75    9          52.69  2.77       56.65
#> 28          PW Reset       20.06    9          53.79 22.27       81.98
#> 38 Technical Support       29.15    9          81.02  2.22       84.20
#> 19  Clinical Support       20.75    9          52.69  2.77       56.65
#> 29          PW Reset       21.06    9          53.79 22.27       81.98
#> 39 Technical Support       30.15    9          81.02  2.22       84.20

创建于2023-03-21带有reprex v2.0.2

muk1a3rh

muk1a3rh2#

dplyr/purrr的选项:

library(dplyr)
library(purrr)

df <- tibble::tribble(
  ~Group, ~Agent_Count, ~HOUR, ~Answered_Calls, ~Aban, ~Total_Calls,
  "Clinical Support",        11.75,    9L,           52.69,  2.77,        56.65,
  "PW Reset",        12.06,    9L,           53.79, 22.27,        81.98,
  "Technical Support",        21.15,    9L,           81.02,  2.22,         84.2
)

  
map_df(0:30, function(x) {df %>% mutate(across(Agent_Count:Answered_Calls, ~. + x))})

#> # A tibble: 93 × 6
#>    Group             Agent_Count  HOUR Answered_Calls  Aban Total_Calls
#>    <chr>                   <dbl> <int>          <dbl> <dbl>       <dbl>
#>  1 Clinical Support         11.8     9           52.7  2.77        56.6
#>  2 PW Reset                 12.1     9           53.8 22.3         82.0
#>  3 Technical Support        21.2     9           81.0  2.22        84.2
#>  4 Clinical Support         12.8    10           53.7  2.77        56.6
#>  5 PW Reset                 13.1    10           54.8 22.3         82.0
#>  6 Technical Support        22.2    10           82.0  2.22        84.2
#>  7 Clinical Support         13.8    11           54.7  2.77        56.6
#>  8 PW Reset                 14.1    11           55.8 22.3         82.0
#>  9 Technical Support        23.2    11           83.0  2.22        84.2
#> 10 Clinical Support         14.8    12           55.7  2.77        56.6
#> # … with 83 more rows

或者,如果只需要增加Agent_Count

map_df(0:30, function(x) {df %>% mutate(Agent_Count = Agent_Count + x)})

#> # A tibble: 93 × 6
#>    Group             Agent_Count  HOUR Answered_Calls  Aban Total_Calls
#>    <chr>                   <dbl> <int>          <dbl> <dbl>       <dbl>
#>  1 Clinical Support         11.8     9           52.7  2.77        56.6
#>  2 PW Reset                 12.1     9           53.8 22.3         82.0
#>  3 Technical Support        21.2     9           81.0  2.22        84.2
#>  4 Clinical Support         12.8     9           52.7  2.77        56.6
#>  5 PW Reset                 13.1     9           53.8 22.3         82.0
#>  6 Technical Support        22.2     9           81.0  2.22        84.2
#>  7 Clinical Support         13.8     9           52.7  2.77        56.6
#>  8 PW Reset                 14.1     9           53.8 22.3         82.0
#>  9 Technical Support        23.2     9           81.0  2.22        84.2
#> 10 Clinical Support         14.8     9           52.7  2.77        56.6
#> # … with 83 more rows
r3i60tvu

r3i60tvu3#

下面是一个tidyverse方法:

library(purrr)
library(dplyr)

map_df(0:9, ~mutate(df, Agent_Count = rnorm(n(), 12, 2)))
Group Agent_Count HOUR Answered_Calls  Aban Total_Calls
1   Clinical Support   13.268501    9          52.69  2.77       56.65
2           PW Reset   12.847290    9          53.79 22.27       81.98
3  Technical Support   11.596324    9          81.02  2.22       84.20
4   Clinical Support   11.846268    9          52.69  2.77       56.65
5           PW Reset   13.374728    9          53.79 22.27       81.98
6  Technical Support   12.343263    9          81.02  2.22       84.20
7   Clinical Support   10.339783    9          52.69  2.77       56.65
8           PW Reset   11.419682    9          53.79 22.27       81.98
9  Technical Support    9.361749    9          81.02  2.22       84.20
10  Clinical Support   10.065936    9          52.69  2.77       56.65
11          PW Reset   11.710778    9          53.79 22.27       81.98
12 Technical Support    8.403735    9          81.02  2.22       84.20
13  Clinical Support    8.622915    9          52.69  2.77       56.65
14          PW Reset   14.205130    9          53.79 22.27       81.98
15 Technical Support   10.846762    9          81.02  2.22       84.20
16  Clinical Support    8.296617    9          52.69  2.77       56.65
17          PW Reset   11.774274    9          53.79 22.27       81.98
18 Technical Support   14.642139    9          81.02  2.22       84.20
19  Clinical Support   13.324509    9          52.69  2.77       56.65
20          PW Reset   12.882766    9          53.79 22.27       81.98
21 Technical Support   14.367492    9          81.02  2.22       84.20
22  Clinical Support   10.456997    9          52.69  2.77       56.65
23          PW Reset   13.459378    9          53.79 22.27       81.98
24 Technical Support   10.825829    9          81.02  2.22       84.20
25  Clinical Support   12.001528    9          52.69  2.77       56.65
26          PW Reset   16.428931    9          53.79 22.27       81.98
27 Technical Support   13.938869    9          81.02  2.22       84.20
28  Clinical Support   13.536015    9          52.69  2.77       56.65
29          PW Reset    9.783344    9          53.79 22.27       81.98
rwqw0loc

rwqw0loc4#

在底数R中,可以使用repeat()函数将同一 Dataframe “扩展”n次:

reps <- 30 # repeat it 30 times
df_big <- do.call(rbind, replicate(reps, df, simplify = FALSE))

然后,您可以用新值替换新行。例如,要用前一个值的+1替换第4行及以后的行:

for(i in seq_len(nrow(df_big))){
  if(!(i %in% 1:3)){
  df_big[i,"Agent_Count"] <- df_big[i-1,"Agent_Count"]+1
}}

              Group Agent_Count HOUR Answered_Calls  Aban Total_Calls
Clinical    Support       11.75    9          52.69  2.77       56.65
PW            Reset       12.06    9          53.79 22.27       81.98
Technical   Support       21.15    9          81.02  2.22       84.20
Clinical1   Support       22.15    9          52.69  2.77       56.65
PW1           Reset       23.15    9          53.79 22.27       81.98
Technical1  Support       24.15    9          81.02  2.22       84.20
Clinical2   Support       25.15    9          52.69  2.77       56.65
PW2           Reset       26.15    9          53.79 22.27       81.98
Technical2  Support       27.15    9          81.02  2.22       84.20
Clinical3   Support       28.15    9          52.69  2.77       56.65
......

或者用随机数替换这些值(这里,从平均值= 11且sd = 5的正态分布生成):

df_big[-c(1:3), "Agent_Count"] <- rnorm(reps*3-3, mean = 11, sd = 5)

              Group Agent_Count HOUR Answered_Calls  Aban Total_Calls
Clinical    Support  11.7500000    9          52.69  2.77       56.65
PW            Reset  12.0600000    9          53.79 22.27       81.98
Technical   Support  21.1500000    9          81.02  2.22       84.20
Clinical1   Support   6.9110877    9          52.69  2.77       56.65
PW1           Reset  10.7298854    9          53.79 22.27       81.98
Technical1  Support  12.6507080    9          81.02  2.22       84.20
Clinical2   Support  15.7766231    9          52.69  2.77       56.65
......
but5z9lq

but5z9lq5#

你可以尝试一个for循环的样本…这是我所能做的没有更多的信息

OUTPUT<-NULL
for (i in 1:10){#repeat 30 times
  for(j in c("Clinical Support","PW Reset","Technical Support")){# repeat once for each group
    for (k in 8:18){#repeat for each hour
      Total_Calls <- sample(20:60,1) #sample total calls
      Answered_Calls <-sample(20:length(Total_Calls),1)#sample nswered calls
      Aban <- Total_Calls - Answered_Calls
      OUTPUT <- rbind(
        OUTPUT,
        data.frame(
          DAY = i,
          Group=j,
          Agent_Count=sample(2:10,1),
          HOUR = k,
          Answered_Calls=Answered_Calls,
          Aban=Aban,
          Total_Calls=Total_Calls
        )
      )
    }
  }
}
print(OUTPUT)

相关问题