Does fabletools::skill_score account for transformations of the target variable?

up9lanfz  posted on 2023-11-14 in Other

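While using fable to test the accuracy of some models, I noticed an interesting behavior of fabletools::skill_score. skill_score is described in the FPP3 book. If you compute test accuracy with skill_score(CRPS) for a set of models that includes a NAIVE/SNAIVE model, and the target variable is not transformed, the skill_score of the NAIVE/SNAIVE model is 0. This is consistent with the description in FPP3:
the proportion that the … method improves over the naïve method based on CRPS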
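Written as a formula, that description corresponds to the expression below. The subscripts "method" and "naive" are labels chosen here for illustration, not notation from the book.

```latex
\[
\text{skill}
  = \frac{\mathrm{CRPS}_{\text{naive}} - \mathrm{CRPS}_{\text{method}}}{\mathrm{CRPS}_{\text{naive}}}
  = 1 - \frac{\mathrm{CRPS}_{\text{method}}}{\mathrm{CRPS}_{\text{naive}}}
\]
```

Under this definition a method whose CRPS equals the benchmark's CRPS scores exactly 0, and a method with a larger CRPS than the benchmark scores below 0.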
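However, if you transform the target variable in some way (for example, log(x + 1)), the skill_score of the NAIVE/SNAIVE model is no longer 0. This suggests to me that skill_score may not account for transformations of the target variable. I looked at the source code but did not see any reference to transformations.
Is this the intended behavior of skill_score? If so, is there a way to carry the transformation through to skill_score? Or is skill_score not suitable for models whose target variable has been transformed?
This code reproduces the expected behavior of skill_score on untransformed data: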

library(fpp3)

google_stock <- gafa_stock |>
  filter(Symbol == "GOOG", year(Date) >= 2015) |>
  mutate(day = row_number()) |>
  update_tsibble(index = day, regular = TRUE)

google_stock |> 
  autoplot()

test <- google_stock |> 
  slice_tail(prop = .8)

train <- google_stock |> 
  anti_join(test)

fitted_model <- train |> 
  model(
    Mean = MEAN(Close),
    `Naïve` = NAIVE(Close),
    Drift = NAIVE(Close ~ drift())
  )

goog_fc <- fitted_model |> 
  forecast(h = 12)

fc_acc <- goog_fc |> 
  accuracy(google_stock,
           measures = list(point_accuracy_measures, distribution_accuracy_measures, crps_skill = skill_score(CRPS))) |> 
  select(.model, .type, CRPS, crps_skill, RMSSE)

fc_acc
# A tibble: 3 × 5
  .model .type  CRPS crps_skill RMSSE
  <chr>  <chr> <dbl>      <dbl> <dbl>
1 Drift  Test   38.2     0.0955  5.09
2 Mean   Test  109.     -1.59   12.6 
3 Naïve  Test   42.2     0       5.49

This reproduces the unexpected behavior, using the same data with a log(x + 1) transformation:

fitted_model_transformed <- train |> 
  model(
    Mean = MEAN(log(Close + 1)),
    `Naïve` = NAIVE(log(Close + 1)),
    Drift = NAIVE(log(Close + 1) ~ drift())
  )

goog_fc_transformed <- fitted_model_transformed |> 
  forecast(h = 12)

fc_acc_transformed <- goog_fc_transformed |> 
  accuracy(google_stock,
           measures = list(point_accuracy_measures, distribution_accuracy_measures, crps_skill = skill_score(CRPS))) |> 
  select(.model, .type, CRPS, crps_skill, RMSSE)

fc_acc_transformed
# A tibble: 3 × 5
  .model .type  CRPS crps_skill RMSSE
  <chr>  <chr> <dbl>      <dbl> <dbl>
1 Drift  Test   36.3     0.140   4.97
2 Mean   Test  110.     -1.61   12.6 
3 Naïve  Test   40.8     0.0353  5.42


I would expect crps_skill for the naïve model to be 0, since it cannot improve on itself.

> sessionInfo()
R version 4.3.1 (2023-06-16 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18362)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.utf8  LC_CTYPE=English_United States.utf8    LC_MONETARY=English_United States.utf8
[4] LC_NUMERIC=C                           LC_TIME=English_United States.utf8    

time zone: America/New_York
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] fable_0.3.3       feasts_0.3.1      fabletools_0.3.4  tsibbledata_0.4.1 tsibble_1.1.3     ggplot2_3.4.3     lubridate_1.9.2  
 [8] tidyr_1.3.0       dplyr_1.1.3       tibble_3.2.1      fpp3_0.5         

loaded via a namespace (and not attached):
 [1] rappdirs_0.3.3       plotly_4.10.2        utf8_1.2.4           generics_0.1.3       anytime_0.3.9        digest_0.6.33       
 [7] magrittr_2.0.3       grid_4.3.1           timechange_0.2.0     pkgload_1.3.2.1      fastmap_1.1.1        jsonlite_1.8.7      
[13] modeldata_1.2.0      httr_1.4.7           purrr_1.0.2          fansi_1.0.5          viridisLite_0.4.2    scales_1.2.1        
[19] numDeriv_2016.8-1.1  textshaping_0.3.6    lazyeval_0.2.2       cli_3.6.1            rlang_1.1.1          crayon_1.5.2        
[25] ellipsis_0.3.2       munsell_0.5.0        withr_2.5.1          tools_4.3.1          colorspace_2.1-0     vctrs_0.6.4         
[31] R6_2.5.1             lifecycle_1.0.3      htmlwidgets_1.6.2    ragg_1.2.5           pkgconfig_2.0.3      progressr_0.14.0    
[37] pillar_1.9.0         gtable_0.3.4         rsconnect_1.1.0      data.table_1.14.8    glue_1.6.2           Rcpp_1.0.11         
[43] systemfonts_1.0.4    tidyselect_1.2.0     rstudioapi_0.15.0    farver_2.1.1         htmltools_0.5.6      labeling_0.4.3      
[49] compiler_4.3.1       distributional_0.3.2

2skhul33


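You can use several different transformations within the same model() call, so it would not make sense for skill_score() to use a benchmark model with a transformation: the scores for different models could then be based on different benchmarks. Consequently, the benchmark naïve method must use the untransformed variable.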
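As a rough consistency check, the sketch below uses only the rounded CRPS values printed in the two accuracy tables in the question. It assumes the skill score has the form 1 - score / benchmark, and the helper `skill()` and the vector names are hypothetical, introduced here for illustration. Part (1) scores the transformed models against the untransformed naïve CRPS and compares the result with the reported crps_skill column. Part (2) shows one possible workaround if a score of 0 for the transformed naïve model is wanted: recompute the skill by hand against that model's own CRPS.

```r
# CRPS values copied from the two accuracy tables in the question
# (rounded as printed; "Naive" is written without the diaeresis here).
crps_untransformed <- c(Drift = 38.2, Mean = 109, Naive = 42.2)
crps_transformed   <- c(Drift = 36.3, Mean = 110, Naive = 40.8)

# crps_skill column reported for the transformed models.
reported_skill <- c(Drift = 0.140, Mean = -1.61, Naive = 0.0353)

# Hypothetical helper: skill relative to a benchmark CRPS.
skill <- function(score, benchmark) 1 - score / benchmark

# (1) Transformed models scored against the UNtransformed naive CRPS.
#     Agreement with the reported column is only approximate because
#     the inputs are rounded to three significant figures.
vs_untransformed_naive <- skill(crps_transformed, crps_untransformed[["Naive"]])
print(round(cbind(reported = reported_skill,
                  recomputed = vs_untransformed_naive), 3))

# (2) Assumed workaround: use the transformed naive model as the benchmark,
#     so that model scores 0 by construction.
vs_transformed_naive <- skill(crps_transformed, crps_transformed[["Naive"]])
print(round(vs_transformed_naive, 3))
```

With a full accuracy() result the same re-baselining could be done on the CRPS column directly, but note the trade-off described in the answer: the scores are then only comparable across models that share that particular benchmark.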
