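While using fable to test the accuracy of some models, I noticed an interesting behavior in fabletools::skill_score. skill_score is described in the FPP3 book. If you compute test accuracy with skill_score(CRPS) for a set of models that includes a NAIVE/SNAIVE model, and the target variable is not transformed, the NAIVE/SNAIVE model gets a skill_score of 0. This agrees with the description in FPP3: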
"...the proportion that the ... method improves over the naïve method based on CRPS"
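However, if you transform the target variable in some way (for example, log(x + 1)), the NAIVE/SNAIVE model's skill_score is no longer 0. This suggests to me that the skill_score function may not support transformations of the target variable. I looked at the source code but did not see any reference to transformations.
Is this the expected behavior of skill_score? If so, is there a way to carry the transformation through to skill_score? Or is skill_score not suitable for models whose target variable has been transformed?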
This code reproduces the expected behavior of skill_score on untransformed data:
library(fpp3)

# Google closing prices from 2015 onward, re-indexed by trading day
google_stock <- gafa_stock |>
  filter(Symbol == "GOOG", year(Date) >= 2015) |>
  mutate(day = row_number()) |>
  update_tsibble(index = day, regular = TRUE)

google_stock |>
  autoplot()

# Hold out the last 80% of rows as the test set; the rest is the training set
test <- google_stock |>
  slice_tail(prop = .8)
train <- google_stock |>
  anti_join(test)

# Benchmark models fitted to the untransformed Close price
fitted_model <- train |>
  model(
    Mean = MEAN(Close),
    `Naïve` = NAIVE(Close),
    Drift = NAIVE(Close ~ drift())
  )

goog_fc <- fitted_model |>
  forecast(h = 12)

# Point and distributional accuracy, plus the CRPS skill score
fc_acc <- goog_fc |>
  accuracy(google_stock,
           measures = list(point_accuracy_measures,
                           distribution_accuracy_measures,
                           crps_skill = skill_score(CRPS))) |>
  select(.model, .type, CRPS, crps_skill, RMSSE)
fc_acc
# A tibble: 3 × 5
  .model .type  CRPS crps_skill RMSSE
  <chr>  <chr> <dbl>      <dbl> <dbl>
1 Drift  Test   38.2     0.0955  5.09
2 Mean   Test  109.     -1.59   12.6
3 Naïve  Test   42.2     0       5.49
This reproduces the unexpected behavior, using the same data with a log(x + 1) transformation:
# Same models, now fitted to log(Close + 1)
fitted_model_transformed <- train |>
  model(
    Mean = MEAN(log(Close + 1)),
    `Naïve` = NAIVE(log(Close + 1)),
    Drift = NAIVE(log(Close + 1) ~ drift())
  )

goog_fc_transformed <- fitted_model_transformed |>
  forecast(h = 12)

fc_acc_transformed <- goog_fc_transformed |>
  accuracy(google_stock,
           measures = list(point_accuracy_measures,
                           distribution_accuracy_measures,
                           crps_skill = skill_score(CRPS))) |>
  select(.model, .type, CRPS, crps_skill, RMSSE)
fc_acc_transformed
# A tibble: 3 × 5
  .model .type  CRPS crps_skill RMSSE
  <chr>  <chr> <dbl>      <dbl> <dbl>
1 Drift  Test   36.3     0.140   4.97
2 Mean   Test  110.     -1.61   12.6
3 Naïve  Test   40.8     0.0353  5.42
I would expect the naïve model's crps_skill to be 0, since it cannot improve on itself.
> sessionInfo()
R version 4.3.1 (2023-06-16 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18362)
Matrix products: default
locale:
[1] LC_COLLATE=English_United States.utf8 LC_CTYPE=English_United States.utf8 LC_MONETARY=English_United States.utf8
[4] LC_NUMERIC=C LC_TIME=English_United States.utf8
time zone: America/New_York
tzcode source: internal
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] fable_0.3.3 feasts_0.3.1 fabletools_0.3.4 tsibbledata_0.4.1 tsibble_1.1.3 ggplot2_3.4.3 lubridate_1.9.2
[8] tidyr_1.3.0 dplyr_1.1.3 tibble_3.2.1 fpp3_0.5
loaded via a namespace (and not attached):
[1] rappdirs_0.3.3 plotly_4.10.2 utf8_1.2.4 generics_0.1.3 anytime_0.3.9 digest_0.6.33
[7] magrittr_2.0.3 grid_4.3.1 timechange_0.2.0 pkgload_1.3.2.1 fastmap_1.1.1 jsonlite_1.8.7
[13] modeldata_1.2.0 httr_1.4.7 purrr_1.0.2 fansi_1.0.5 viridisLite_0.4.2 scales_1.2.1
[19] numDeriv_2016.8-1.1 textshaping_0.3.6 lazyeval_0.2.2 cli_3.6.1 rlang_1.1.1 crayon_1.5.2
[25] ellipsis_0.3.2 munsell_0.5.0 withr_2.5.1 tools_4.3.1 colorspace_2.1-0 vctrs_0.6.4
[31] R6_2.5.1 lifecycle_1.0.3 htmlwidgets_1.6.2 ragg_1.2.5 pkgconfig_2.0.3 progressr_0.14.0
[37] pillar_1.9.0 gtable_0.3.4 rsconnect_1.1.0 data.table_1.14.8 glue_1.6.2 Rcpp_1.0.11
[43] systemfonts_1.0.4 tidyselect_1.2.0 rstudioapi_0.15.0 farver_2.1.1 htmltools_0.5.6 labeling_0.4.3
[49] compiler_4.3.1 distributional_0.3.2
1 Answer
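You can use several different transformations within the same model() call, so it makes sense for skill_score() to use a benchmark model that has no transformation. Otherwise, the scores for different models could be computed against different benchmarks. So the benchmark naïve method has to use the untransformed variable.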
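To see this in the numbers from the question, here is a minimal sketch in base R. It assumes the skill score has the form 1 - CRPS_model / CRPS_benchmark, and it uses the rounded CRPS values copied from the two tables above. The helper `skill()` is a hypothetical name for illustration, not part of fabletools.

```r
# CRPS values copied from the two accuracy() tables in the question (rounded).
crps_untransformed <- c(Drift = 38.2, Mean = 109, Naive = 42.2)
crps_transformed   <- c(Drift = 36.3, Mean = 110, Naive = 40.8)

# Hypothetical helper (not a fabletools function):
# skill relative to a benchmark CRPS, assumed to be 1 - CRPS_model / CRPS_benchmark.
skill <- function(crps, benchmark) 1 - crps / benchmark

# Benchmark = the untransformed naive forecast.
vs_untransformed_naive <- skill(crps_transformed, crps_untransformed[["Naive"]])
print(round(vs_untransformed_naive, 3))

# Benchmark = the log(Close + 1) naive model itself, so its own score is 0.
vs_transformed_naive <- skill(crps_transformed, crps_transformed[["Naive"]])
print(round(vs_transformed_naive, 3))
```

With the untransformed naïve CRPS (42.2) as the benchmark, the scores come out at roughly 0.14, -1.61 and 0.033, which is close to the crps_skill column reported for the transformed models; the small differences are what you would expect from using rounded inputs. If you do want a score relative to the transformed naïve model, one option is to compute it yourself from the CRPS column in the same way, as in the second call.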