How to rewrite a mutate + case_when statement with a for loop (R)

ztigrdn8  posted on 2023-05-04  in Other

I'm working on a project that fits multiple ARIMA models to my data, which is split by a grouping variable. Below is a reproducible example that runs as expected:

#Load libraries
library(dplyr)
library(workflows)
library(tidyr)
library(recipes)
library(parsnip)
library(tidymodels)
library(modeltime)
library(purrr)

#Set up data
df <- m750 %>%
  mutate(year = lubridate::year(date)) %>%
  filter(year>=2007 & year <=2009) %>%
  select(c(-id))

#Nest data by year
df_nest <- df %>%
  drop_na() %>%
  nest(data_full = c(-year))

#Create recipes
rec_year_list <- list()
num_years <- length(unique(df$year))

for (i in 1:num_years){
  rec_year_list[[i]] <- recipe(value ~ date, data = df_nest$data_full[[i]])
}

#Assign recipes to workflows
#Start from an empty list so that each element holds its own fresh workflow
wfl_list <- list()

for (i in 1:num_years){
  
  wfl_list[[i]] <- workflow() %>%
    add_recipe(rec_year_list[[i]]) %>%
    add_model(
      arima_reg() %>%
        set_engine(engine='auto_arima')
    )
}

#Assign workflows to data
df_nest <- df_nest %>%
  mutate(workflow = case_when(year==2007 ~ list(wfl_list[[1]]),
                              year==2008 ~ list(wfl_list[[2]]),
                              year==2009 ~ list(wfl_list[[3]])
                              )
         )

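In this example, using mutate and case_when is not a big problem, because my grouping variable has only 3 values. In my real data, however, the grouping variable has many values (more than 3000). How can I rewrite the last block of code as a for loop, so that each element of wfl_list is assigned to the matching value of the grouping variable as a new column in df_nest?
Any help would be greatly appreciated. Thank you!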

bf1o4zei #1

Try this:

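# Pre-allocate an empty list column, then fill it row by row.
# This assumes the i-th element of wfl_list belongs to the i-th row of df_nest.
df_nest$workflow <- vector('list', nrow(df_nest))
for(i in seq_len(nrow(df_nest))) 
    df_nest$workflow[i] <- list(wfl_list[[i]])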
  • Output
> df_nest
# A tibble: 3 × 3
   year data_full         workflow  
  <dbl> <list>            <list>    
1  2007 <tibble [12 × 2]> <workflow>
2  2008 <tibble [12 × 2]> <workflow>
3  2009 <tibble [12 × 2]> <workflow>
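
The loop above matches by position, so it only gives the right result when wfl_list is in the same order as the rows of df_nest. If that order is not guaranteed, one option is to name the list by group and look elements up by name. The following is a minimal base R sketch of that idea, not part of the original answer: the placeholder strings stand in for real workflow objects, and `group_keys` is a hypothetical vector holding the order in which wfl_list was built.

```r
# Toy stand-ins for the objects in the question (placeholders, not real workflows).
df_nest  <- data.frame(year = c(2008, 2007, 2009))   # rows deliberately out of order
wfl_list <- list("wf_2007", "wf_2008", "wf_2009")

# Hypothetical: the group values in the order wfl_list was built.
group_keys <- c(2007, 2008, 2009)

# Name each element after the group it belongs to.
names(wfl_list) <- as.character(group_keys)

# Look up by name instead of by position, so row order no longer matters.
# unname() drops the names so the new column is a plain list column.
df_nest$workflow <- unname(wfl_list[as.character(df_nest$year)])
```

The lookup is a single vectorised subsetting step, so it should behave the same way for 3 groups or for several thousand, without writing one case_when branch per group.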
