mysql 用于计算活动作业数的查询

vs91vp4v  于 2023-11-16  发布在  Mysql
关注(0)|答案(3)|浏览(120)

有一个jobs表,
| JobID|状态|时间戳|
| --|--|--|
| 1 |inProgress| 2023-11-02 10:39:37|
| 1 |一些随机日志|2023-11-02 10:40:37|
| 2 |inProgress| 2023-11-02 10:43:37|
| 1 |inProgress| 2023-11-02 10:44:37|
| 1 |inProgress| 2023-11-02 10:45:37|
| 1 |inProgress| 2023-11-02 10:50:00|
| 1 |端|2023-11-02 11:40:37|
| 1 |inProgress| 2023-11-02 11:43:37|
| 1 |inProgress| 2023-11-02 11:44:37|
| 1 |端|2023-11-02 11:45:37|
| 2 |一些随机日志|2023-11-02 11:50:37|
| 2 |端|2023-11-02 12:00:37|
我想
1.计算有多少作业正在进行(即已经开始但尚未结束)
我最初以为可以计算count(state=inProgress)-count(state=end)来获得inProgress作业的数量,但inProgress可能不只出现一次。
1.获取每个已完成作业的执行持续时间
请注意,一个作业可以执行多次,例如jobID1从开始到结束执行两次,因此需要为jobID1计算两个执行持续时间。
jobID1执行持续时间为time diff between 2023-11-02 11:40:37, 2023-11-02 10:39:37time diff between 2023-11-02 11:45:37, 2023-11-02 11:43:37
jobID2执行持续时间为time diff between 2023-11-02 12:00:37, 2023-11-02 10:43:37

kgsdhlau

kgsdhlau1#

CREATE TABLE jobs(
  jobID INT,
  state VARCHAR(50),
  timestamp datetime);

INSERT INTO jobs VALUES
(1, 'inProgress', '2023-11-02 10:39:37'),
(1, 'some random log', '2023-11-02 10:40:37'),
(2, 'inProgress', '2023-11-02 10:43:37'),
(1, 'inProgress', '2023-11-02 10:44:37'),
(1, 'inProgress', '2023-11-02 10:45:37'),
(1, 'inProgress', '2023-11-02 10:50:00'),
(1, 'end', '2023-11-02 11:40:37'),
(1, 'inProgress', '2023-11-02 11:43:37'),
(1, 'inProgress', '2023-11-02 11:44:37'),
(1, 'end', '2023-11-02 11:45:37'),
(2, 'some random log', '2023-11-02 11:50:37'),
(2, 'end', '2023-11-02 12:00:37');

字符串
根据上面的数据示例,这里有一个建议:

WITH end_state AS (
  SELECT ROW_NUMBER() OVER (ORDER BY jobID, timestamp) AS Rsnum,
         jobID, timestamp AS end_time
  FROM jobs 
  WHERE state='end'),
  es_wDelimiter AS (
SELECT e1.*, 
       IF(e2.end_time < e1.end_time, e2.end_time,NULL) AS esDelimiter
FROM end_state e1
LEFT JOIN end_state e2 ON e1.jobID=e2.jobID AND e1.end_time <> e2.end_time)

SELECT Rsnum,
       j.jobID,
       MIN(timestamp) AS StartTime,
       MAX(end_time) AS EndTime,
       TIMEDIFF(MAX(end_time),MIN(timestamp)) AS ExecutionTime
FROM jobs j 
 LEFT JOIN es_wDelimiter e
ON j.jobID=e.jobID 
AND j.timestamp > IFNULL(esDelimiter,0)
AND j.timestamp <= e.end_time
GROUP BY Rsnum, j.jobID
ORDER BY RsNum, j.jobID;


查询操作如下:
1.获取所有state=end的时间戳,并分配行号-行号用作新组。
1.将第一个查询 Package 为公共表表达式,然后执行另一个查询以创建自定义表。
1.将第二个查询 Package 为另一个公共表表达式,并使用具有以下条件的作业表对其执行LEFT JOIN查询:

  • jobs.timestamp的值大于之前设置的NULL值(如果NULL值为NULL,则会分配0)。
  • jobs.timestamp值小于或等于我们从第一个查询中提取的end_time值。

这里有一个小提琴供参考:https://dbfiddle.uk/RBEHLuji

cidc1ykv

cidc1ykv2#

对于可能具有多个示例的作业,以下方法“向后看”以定位相同作业id的任何先前结束,jobid的每个示例也是使用row_number()分配的。然后,这些表示“结束”的行与表示“进行中”的任何行连接,条件是确保它们适合“前一个结束”(或NULL)和“此结束”然后计算每个作业示例的最小开始,并且还使持续时间(以秒为单位)可用。

WITH e AS (
    SELECT
          id
        , jobid
        , LAG(TIMESTAMP) OVER (PARTITION BY jobID ORDER BY TIMESTAMP) AS prev_end
        , TIMESTAMP AS end_time
        , row_number() OVER (PARTITION BY jobid ORDER BY TIMESTAMP) AS job_inst
    FROM jobs
    WHERE STATE = 'end'
    )
SELECT
      e.jobid
    , e.job_inst
    , min(p.start_time) start_time
    , e.end_time
    , TIMESTAMPDIFF(SECOND, min(p.start_time), e.end_time) AS du_seconds
FROM e
INNER JOIN (
    SELECT
          id
        , jobID
        , TIMESTAMP AS start_time
    FROM jobs
    WHERE STATE = 'inProgress'
    ) p ON e.jobid = p.jobid 
      AND (
               (p.start_time < e.end_time and e.prev_end IS NULL)
             OR
               (p.start_time > e.prev_end and p.start_time < e.end_time)
             )
GROUP BY
     e.jobid
   , e.job_inst
   , e.end_time
ORDER BY
     e.jobid
   , e.job_inst

字符串
| JobID|工作指令|开始时间|结束时间|度秒|
| --|--|--|--|--|
| 1 | 1 |2023-11-02 10:39:37| 2023-11-02 11:40:37| 3660 |
| 1 | 2 |2023-11-02 11:43:37| 2023-11-02 11:45:37| 120 |
| 2 | 1 |2023-11-02 10:43:37| 2023-11-02 12:00:37| 4620 |
请参阅fiddle
一旦你有了作业和它们的示例,最小/最大开始/结束和总持续时间可以很容易地从上面看到的结果计算出来(也可以选择显示max(job_inst))。

SELECT
      jobid
    , MAX(job_inst)
    , MIN(start_time)
    , MAX(end_time)
    , SUM(du_seconds)
FROM (
       { result from above }
     ) d
GROUP BY
      jobid

bqf10yzr

bqf10yzr3#

要计算正在进行的作业(已开始但尚未完成)的数量,请执行以下操作:

SELECT COUNT(DISTINCT jobID) AS inProgressJobs
    FROM jobs
    WHERE jobID NOT IN (
        SELECT jobID
        FROM jobs
        WHERE state = 'end'
    );

字符串
要计算每个已完成作业的执行持续时间,请执行以下操作:

SELECT jobID, 
       MIN(CASE WHEN state = 'end' THEN timestamp END) AS end_time,
       MAX(CASE WHEN state = 'inProgress' THEN timestamp END) AS start_time
FROM jobs
GROUP BY jobID, state
HAVING start_time IS NOT NULL AND end_time IS NOT NULL;

相关问题