垂直-有侧视功能吗?

hec6srdp  于 2021-06-28  发布在  Hive
关注(0)|答案(1)|浏览(356)

需要旋转一个矩阵来进行时间序列插值/间隙填充,并希望避免混乱和低效的union all方法。vertica是否提供类似hive的侧视图分解功能?
编辑:@marcothesane——谢谢你有趣的场景——我喜欢你的插值方法。我会多玩玩,看看会怎么样。看起来很有希望。
仅供参考——这是我提出的解决方案——我的场景是,我试图通过查询查看一段时间内的内存使用情况(以及用户/资源池等,基本上是试图获得一个成本指标)。我需要做插值,这样我就可以看到在任何时间点的总使用量。所以这里是我的查询,它按秒对时间序列进行切片,然后按分钟聚合得到“兆字节秒”的度量。

  1. with qry_cte as
  2. (
  3. select
  4. session_id
  5. , request_id
  6. , date_trunc('second',start_timestamp) as dat_str
  7. , timestampadd('ss'
  8. , ceiling(request_duration_ms/1000)::int
  9. , date_trunc('second',start_timestamp)
  10. ) as dat_end
  11. , ceiling(request_duration_ms/1000)::int as secs
  12. , memory_acquired_mb
  13. from query_requests
  14. where request_type = 'QUERY'
  15. and request_duration_ms > 0
  16. and memory_acquired_mb > 0
  17. )
  18. select date_trunc('minute',slice_time) as dat_minute
  19. , count(distinct session_id || request_id::varchar) as queries
  20. , sum(memory_acquired_mb) as mb_seconds
  21. from (
  22. select session_id, request_id, slice_time, ts_first_value(memory_acquired_mb) as memory_acquired_mb
  23. from (
  24. select session_id, request_id, dat_str as dat, memory_acquired_mb from qry_cte
  25. union all
  26. select session_id, request_id, dat_end as dat, memory_acquired_mb from qry_cte
  27. ) x
  28. timeseries slice_time as '1 second' over (partition by session_id, request_id order by dat)
  29. ) x
  30. group by 1 order by 1 desc
  31. ;
aor9mmx1

aor9mmx11#

实际上,我手头有一个场景可以满足您的要求:
除此之外:

  1. id|day_strt |sales_01 |sales_02 |sales_03 |sales_04 |sales_05 |sales_06
  2. 1|2016-01-19 08:00:00| 1,842.25| 5,449.40|- |39,776.86|- | 9,424.10
  3. 2|2016-01-19 08:00:00|73,810.66|- | 9,867.70|- |76,723.91|95,605.14

做这个:

  1. id|day_strt |sales_01 |sales_02 |sales_03 |sales_04 |sales_05 |sales_06
  2. 1|2016-01-19 08:00:00| 1,842.25| 5,449.40|22,613.13|39,776.86|24,600.48| 9,424.10
  3. 2|2016-01-19 08:00:00|73,810.66|41,839.18| 9,867.70|43,295.81|76,723.91|95,605.14

01到06是指从08:00开始记录销售额的一天中的第n个小时。
下面是整个场景,包括初始输入数据。
作为选择的输入数据。。联合所有选择。
一种由6个整数组成的表,用来交叉连接到1的表中。
垂直轴:将输入与6个整数交叉连接,根据索引,在case表达式中只输出第n个sales列。最后,过滤掉同一个case表达式计算结果为null的地方。
使用timeseries子句和线性插值来填补空白:销售数字和索引列。
在最终查询中再次水平透视所有内容。
我可以向你保证,这比表中所有列的联合更有效。
下面是:

  1. WITH
  2. -- input
  3. input(id,day_strt,sales_01,sales_02,sales_03,sales_04,sales_05,sales_06) AS (
  4. SELECT 1,'2016-01-19 08:00:00'::TIMESTAMP(0), 1842.25, 5449.40 ,NULL::INT,39776.86 ,NULL::INT, 9424.10
  5. UNION ALL SELECT 2,'2016-01-19 08:00:00'::TIMESTAMP(0),73810.66 ,NULL::INT, 9867.70 ,NULL::INT,76723.91 ,95605.14
  6. )
  7. -- debug
  8. -- SELECT * FROM input;
  9. ,
  10. -- 6 months to pivot vertically -> 6 integers
  11. six_idxs(idx) AS (
  12. SELECT 1
  13. UNION ALL SELECT 2
  14. UNION ALL SELECT 3
  15. UNION ALL SELECT 4
  16. UNION ALL SELECT 5
  17. UNION ALL SELECT 6
  18. )
  19. ,
  20. -- pivot input vertically and remove rows with null measures
  21. -- (could probably add the TIMESERIES clause here directly,
  22. -- but less readable and maintainable)
  23. vert_pivot AS (
  24. SELECT
  25. id
  26. , idx
  27. , TIMESTAMPADD(HOUR,idx-1,day_strt)::TIMESTAMP(0) AS sales_ts
  28. , CASE idx
  29. WHEN 1 THEN sales_01
  30. WHEN 2 THEN sales_02
  31. WHEN 3 THEN sales_03
  32. WHEN 4 THEN sales_04
  33. WHEN 5 THEN sales_05
  34. WHEN 6 THEN sales_06
  35. END AS sales
  36. FROM input
  37. CROSS JOIN six_idxs
  38. WHERE (
  39. CASE idx
  40. WHEN 1 THEN sales_01
  41. WHEN 2 THEN sales_02
  42. WHEN 3 THEN sales_03
  43. WHEN 4 THEN sales_04
  44. WHEN 5 THEN sales_05
  45. WHEN 6 THEN sales_06
  46. END
  47. ) IS NOT NULL
  48. )
  49. -- debug:
  50. -- SELECT * FROM vert_pivot;
  51. ,
  52. -- gap filling and interpolation
  53. gaps_filled AS (
  54. SELECT
  55. id
  56. , TS_FIRST_VALUE(idx,'LINEAR') AS idx
  57. , tm_sales_ts::TIMESTAMP(0) AS sales_ts
  58. , TS_FIRST_VALUE(sales,'LINEAR') AS sales
  59. FROM vert_pivot
  60. TIMESERIES tm_sales_ts AS '1 HOUR' OVER(
  61. PARTITION BY id ORDER BY sales_ts
  62. )
  63. )
  64. -- debug
  65. -- SELECT * FROM gaps_filled ORDER BY 1,2;
  66. -- pivot horizontally; final query
  67. SELECT
  68. id
  69. , MIN(sales_ts) AS day_strt
  70. , SUM(CASE idx WHEN 1 THEN sales END)::NUMERIC(7,2) AS sales_01
  71. , SUM(CASE idx WHEN 2 THEN sales END)::NUMERIC(7,2) AS sales_02
  72. , SUM(CASE idx WHEN 3 THEN sales END)::NUMERIC(7,2) AS sales_03
  73. , SUM(CASE idx WHEN 4 THEN sales END)::NUMERIC(7,2) AS sales_04
  74. , SUM(CASE idx WHEN 5 THEN sales END)::NUMERIC(7,2) AS sales_05
  75. , SUM(CASE idx WHEN 6 THEN sales END)::NUMERIC(7,2) AS sales_06
  76. FROM gaps_filled
  77. GROUP BY id
  78. ORDER BY id
  79. ;

玩得开心-
理智的马可

展开查看全部

相关问题