R bsts: testing for a trend in a time series

ss2ws0br, posted on 2023-06-03 in Other

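I am using the bsts package to analyze several time series and determine whether the values in each series are increasing, decreasing, or staying stable over the period.
I can see that I can extract a trend component from the model, and it looks like it is increasing (code below). But how can I formally test the direction of the trend? Should I drop the local linear trend and regress on time instead? Or is there a way to answer this using the trend component I have already extracted?
Here is one of the time series: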

Jan      Feb      Mar      Apr      May      Jun      Jul      Aug      Sep      Oct      Nov      Dec
2013 5.454450 5.351702 5.450414 5.490382 5.367216 5.404927 5.267974 5.211403 5.439326 5.394489 5.425333 5.413367
2014 5.365040 5.359957 5.388980 5.272279 5.337346 5.383114 5.211803 5.567671 5.446918 5.427461 5.490386 5.429216
2015 5.522443 5.432766 5.581413 5.307201 5.452850 5.426128 5.372678 5.524919 5.400167 5.453443 5.596288 5.424220
2016 5.633506 5.553917 5.553649 5.444297 5.551022 5.461216 5.503183 5.426033 5.454331 5.643385 5.575780 5.486073
2017 5.656180 5.411885 5.477615 5.437005 5.434144 5.404344 5.481005 5.437255 5.352800 5.485022 5.441534 5.472936
2018 5.366347 5.381583 5.401431 5.479976 5.439315 5.319484 5.421672 5.319448 5.574673 5.472161 5.479900 5.539380
2019 5.390139 5.429426 5.613977 5.343529 5.487940 5.624361 5.381285 5.366611 5.565688 5.503865 5.491821 5.486168
2020 5.475343 5.493866 6.010556 5.690947 5.656557 5.420500 5.453484 5.566972 5.369799 5.435967 5.613358 5.409345

The code I use to extract the trend (from https://multithreaded.stitchfix.com/blog/2016/04/21/forget-arima/); y is the time series above:

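library(bsts)
library(reshape2)  # melt()
library(ggplot2)

# Local linear trend plus a 12-month seasonal component.
ss <- AddLocalLinearTrend(list(), y)
ss <- AddSeasonal(ss, y, nseasons = 12)
bsts.model <- bsts(y, state.specification = ss, niter = 500, ping = 0, seed = 2016)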

burn <- SuggestBurn(0.1, bsts.model)

components <- cbind.data.frame(
  colMeans(bsts.model$state.contributions[-(1:burn),"trend",]),                               
  colMeans(bsts.model$state.contributions[-(1:burn),"seasonal.12.1",]),
  as.Date(time(y)))

names(components) <- c("Trend", "Seasonality", "Date")
components <- melt(components, id="Date")
names(components) <- c("Date", "Component", "Value")

ggplot(data=components, aes(x=Date, y=Value)) + geom_line() + 
  theme_bw() + theme(legend.title = element_blank()) + ylab("") + xlab("") + 
  facet_grid(Component ~ ., scales="free") + guides(colour="none") + 
  theme(axis.text.x=element_text(angle = -90, hjust = 0))

Answer 1 (ztmd8pv5)

You can do this by examining the posterior draws more closely.
Let's break this expression down:

colMeans(bsts.model$state.contributions[-(1:burn), "trend",])

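bsts.model$state.contributions is an array of shape (iter, metric, obs): MCMC iteration, state component, time point.
bsts.model$state.contributions[-(1:burn), "trend", ] reduces it to a (non_burnin_iter, obs) matrix.
The call to colMeans gives the posterior mean of the trend, but you can also extract any quantile, or the share of draws that have some property: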

m <- bsts.model$state.contributions[-(1:burn), "trend", ]

# Average trend, what you are getting now.
colMeans(m)  

# Posterior probability that the effect of the local trend is positive.
# Probably what you want, though note that it's conceptually not the
# same as a p-value. It's closer to what people think 1 - p-value is.
colMeans(m > 0)  

# Lower bound of the 95% credible interval of the trend.
# Useful if you have a region of practical equivalence (ROPE).
apply(m, 2, quantile, probs = 0.025)
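
Note that colMeans(m > 0) is about the sign of the trend component at each time point. For a series whose level sits well above zero, as in the question, it will be close to 1 everywhere. To ask about direction instead, one option is to compare the end of the trend with its start within each draw. The following is only a sketch under assumptions: m is replaced by a simulated (draws x time) matrix so the snippet runs without fitting a model, and the names delta, prob_increase, and ci_delta are introduced here for illustration.

```r
# Simulated stand-in for m: 450 posterior draws of a trend over 96 months.
# Assumption: in practice m is bsts.model$state.contributions[-(1:burn), "trend", ].
set.seed(1)
n_draws <- 450
n_obs <- 96
true_trend <- seq(5.40, 5.50, length.out = n_obs)
m <- matrix(rep(true_trend, each = n_draws), nrow = n_draws) +
  matrix(rnorm(n_draws * n_obs, sd = 0.02), nrow = n_draws)

# Change in the trend between the first and last time point, one value per draw.
delta <- m[, n_obs] - m[, 1]

# Share of draws in which the trend ended higher than it started.
prob_increase <- mean(delta > 0)

# 95% credible interval for the total change over the period.
ci_delta <- quantile(delta, probs = c(0.025, 0.975))
```

A prob_increase near 1 (or near 0) points to an increase (or decrease), and an interval ci_delta that falls inside a region you consider negligible points to a stable series. The choice of first and last point is itself an assumption; averaging the first and last few columns would be less sensitive to end effects.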

Answer 2 (uklbhaso)

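I came up with a solution (code below); please let me know whether it makes sense.
Basically, for each point in the time series, I:
1. Check whether the lower bound of its 95% credible interval is greater than the upper bound of any point before it.
2. Check whether the upper bound of its 95% credible interval is less than the lower bound of any point before it.
That way, if the series is stable, the counts from 1 and 2 should both be small. If it is increasing, count 1 should be larger than count 2. If it is decreasing, count 2 should be larger than count 1. If it is fluctuating, both counts should be large.
If the difference between counts 1 and 2 is less than 10% of the length of the series, I consider it stable.
Is there any reason this would give me wrong results?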

trend <- bsts.model$state.contributions[-(1:burn), "trend", ]

# Pointwise 95% credible interval of the trend.
components <- data.frame(
  Trend_lci = apply(trend, 2, quantile, probs = 0.025),
  Trend_uci = apply(trend, 2, quantile, probs = 0.975))

# Number of points whose lower bound exceeds the upper bound of an earlier point.
test_increasing <- sum(sapply(seq_len(nrow(components)), function(i) {
  any(components$Trend_lci[i] > components$Trend_uci[1:i])
}))

# Number of points whose upper bound is below the lower bound of an earlier point.
test_decreasing <- sum(sapply(seq_len(nrow(components)), function(i) {
  any(components$Trend_uci[i] < components$Trend_lci[1:i])
}))

test_increasing - test_decreasing
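
One way to check whether this heuristic behaves as described is to run it on simulated draws where the answer is known. The sketch below makes several assumptions: trend_score and sim_draws are hypothetical helper names, the draws are independent normal noise around a fixed trend (real bsts draws are correlated over time), and the noise level is arbitrary.

```r
# The count-based heuristic wrapped in a function.
# `draws` is a (draws x time) matrix of posterior trend draws.
trend_score <- function(draws, probs = c(0.025, 0.975)) {
  lci <- apply(draws, 2, quantile, probs = probs[1])
  uci <- apply(draws, 2, quantile, probs = probs[2])
  idx <- seq_along(lci)
  inc <- sum(vapply(idx, function(i) any(lci[i] > uci[seq_len(i)]), logical(1)))
  dec <- sum(vapply(idx, function(i) any(uci[i] < lci[seq_len(i)]), logical(1)))
  c(increasing = inc, decreasing = dec, score = inc - dec)
}

# Hypothetical helper: independent normal draws around a known trend.
sim_draws <- function(trend, n_draws = 450, sd = 0.02) {
  matrix(rep(trend, each = n_draws), nrow = n_draws) +
    matrix(rnorm(n_draws * length(trend), sd = sd), nrow = n_draws)
}

set.seed(1)
flat   <- trend_score(sim_draws(rep(5.4, 96)))
rising <- trend_score(sim_draws(seq(5.4, 5.6, length.out = 96)))

# Apply the 10% rule from the text to each simulated series.
threshold <- 0.1 * 96
flat_is_stable   <- abs(flat[["score"]]) < threshold
rising_is_stable <- abs(rising[["score"]]) < threshold
```

Because the score is a difference, a strongly fluctuating series can also come out near zero and be labelled stable, so it may be worth looking at the two counts separately as well.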
