Full outer join in MySQL 5.7

bvjveswy posted on 2023-03-28 in MySQL

Table 1 and sample data:

CREATE TABLE student_p (
    ID INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    S_id INT UNSIGNED NOT NULL,
    Points DOUBLE NOT NULL,
    P_date DATE NOT NULL
);
INSERT INTO student_p VALUES
    (50055, 3330, 45, '2023-11-30'),
    (50056,  332, 43, '2013-10-31'),
    (50057, 3330, 22, '2013-10-30');

Table 2 and sample data:

CREATE TABLE student_act (
    ID INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    s_id INT UNSIGNED NOT NULL,
    VIDEO_SCORE DOUBLE NOT NULL,
    EXERCISESCORE DOUBLE NOT NULL,
    DS INT NOT NULL,
    A_date DATE NOT NULL
);
INSERT INTO student_act VALUES
    (2333, 233, 22.43,  233.4455, 23, '2023-11-30'),
    (2334, 235, 24.566,      232, 34, '2023-10-31'),
    ( 322, 678, 23,           45, 23, '2022-10-30'),
    ( 433,  45, 23,           23, 43, '2022-10-01');

Only the ID in each of the two tables is unique; all other data may repeat.
student_p table:
| ID | S_id | Points | P_date |
| --- | --- | --- | --- |
| 5005 | 333 | 45 | 2023-11-30 |
| 5056 | 332 | 43 | 2013-10-31 |
| 5007 | 333 | 22 | 2013-10-30 |
student_act table:
| ID | s_id | Video_score | Lesson_score | DS | A_date |
| --- | --- | --- | --- | --- | --- |
| 2333 | 233 | 22.43 | 233.4455 | 23 | 2023-11-30 |
| 2334 | 235 | 24.566 | 232 | 34 | 2023-10-31 |
| 322 | 678 | 23 | 45 | 23 | 2022-10-30 |
| 433 | 45 | 23 | 23 | 43 | 2022-10-01 |
Expected result:
| Year | Month | Points | Video_score | Lesson_score | DS |
| --- | --- | --- | --- | --- | --- |
| 2023 | 11 | 45 | 22.43 | 233.4455 | 23 |
| 2023 | 10 | 0 | 24.566 | 232 | 34 |
| 2022 | 10 | 0 | 46 | 68 | 66 |
| 2013 | 10 | 65 | 0 | 0 | 0 |

chhkpiq4 #1

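In your second query you restrict the result set by year. If that is what you want, filter on the date column itself rather than on a function of the date, because a predicate on a function of the column is non-sargable. Filtering on the bare column lets the index suggested by Hossam be used. Put (overly) simply, the HAVING clause is applied after all the work has been done, whereas the WHERE clause reduces the amount of work being done: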

SELECT
  YEAR(msa.DATE) AS Year,
  MONTH(msa.DATE) AS Month,
  SUM(msk.POINT) AS Ps,
  SUM(msa.VIDEO_SCORE) AS Video,
  SUM(msa.EXERCISESCORE) AS Lessons,
  SUM(msa.DS) AS DS
FROM student_p msk
RIGHT JOIN student_act msa ON msk.DATE = msa.DATE
WHERE msa.DATE >= MAKEDATE(2023, 1) AND msa.DATE < MAKEDATE(2024, 1)
GROUP BY `Year`, `Month`;
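
To make the sargability point concrete, here is a minimal sketch against the `student_act` table as posted in the question (so the column is `A_date` rather than `DATE`). The index name `idx_a_date` is made up for illustration, and on a table this small the optimizer may well choose a full scan in both cases.

```sql
-- Assumption: the student_act table from the question exists.
-- idx_a_date is a hypothetical index name.
ALTER TABLE student_act ADD INDEX idx_a_date (A_date);

-- Non-sargable: YEAR() has to be evaluated for every row,
-- so the index on A_date cannot be used to narrow the range.
EXPLAIN
SELECT SUM(VIDEO_SCORE)
FROM student_act
WHERE YEAR(A_date) = 2023;

-- Sargable: the bare column is compared against constants,
-- so the optimizer can consider a range scan on idx_a_date.
EXPLAIN
SELECT SUM(VIDEO_SCORE)
FROM student_act
WHERE A_date >= '2023-01-01' AND A_date < '2024-01-01';
```

Comparing the `type` and `key` columns of the two EXPLAIN outputs on your real data should show whether the index is being picked up.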

MAKEDATE(year, dayofyear)
Returns a date, given year and day-of-year values. dayofyear must be greater than 0 or the result is NULL.
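
As a quick illustration of the behaviour described above, the following standalone query shows the values used for the year boundaries, plus the NULL case. The column aliases are only for illustration.

```sql
SELECT
    MAKEDATE(2023, 1)  AS first_day_2023,  -- 2023-01-01
    MAKEDATE(2024, 1)  AS first_day_2024,  -- 2024-01-01
    MAKEDATE(2023, 32) AS feb_first_2023,  -- 2023-02-01 (day 32 of the year)
    MAKEDATE(2023, 0)  AS invalid_day;     -- NULL, because dayofyear is not > 0
```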
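In one of your comments you said you want a full join. MySQL does not have one, but you can achieve the same result by combining one (LEFT|RIGHT) JOIN with another (LEFT|RIGHT) JOIN using UNION.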
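
The general pattern can be sketched on two tiny tables before applying it to the real ones. The tables `a` and `b` and their columns are hypothetical and exist only for this illustration. The branches are mirrored relative to the query that follows (LEFT JOIN first, then the unmatched rows from the right), which is equivalent.

```sql
-- Hypothetical tables, for illustration only.
CREATE TABLE a (k INT PRIMARY KEY, a_val INT);
CREATE TABLE b (k INT PRIMARY KEY, b_val INT);
INSERT INTO a VALUES (1, 10), (2, 20);
INSERT INTO b VALUES (2, 200), (3, 300);

-- Branch 1: every row of a, with the matching b columns or NULL.
SELECT a.k, a.a_val, b.b_val
FROM a
LEFT JOIN b ON b.k = a.k

UNION ALL

-- Branch 2: only the rows of b that have no match in a.
SELECT b.k, a.a_val, b.b_val
FROM a
RIGHT JOIN b ON b.k = a.k
WHERE a.k IS NULL;
```

This returns the rows (1, 10, NULL), (2, 20, 200) and (3, NULL, 300), in no guaranteed order. The `WHERE a.k IS NULL` filter in the second branch is what keeps matched rows from appearing twice under UNION ALL.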

WITH st (`Year`, `Month`, `Ps`) AS (
    SELECT
        YEAR(`P_date`) AS `Year`,
        MONTH(`P_date`) AS `Month`,
        SUM(`Points`)
    FROM `student_p`
    -- WHERE `P_date` >= MAKEDATE(2023, 1) AND `P_date` < MAKEDATE(2024, 1)
    GROUP BY `Year`, `Month`
),
act (`Year`, `Month`, `Video`, `Lessons`, `DS`) AS (
    SELECT
        YEAR(`A_date`) AS `Year`,
        MONTH(`A_date`) AS `Month`,
        SUM(`VIDEO_SCORE`) AS `Video`,
        SUM(`EXERCISESCORE`) AS `Lessons`,
        SUM(`DS`) AS `DS`
    FROM student_act
    -- WHERE A_date >= MAKEDATE(2023, 1) AND A_date < MAKEDATE(2024, 1)
    GROUP BY `Year`, `Month`
)
SELECT
    `Year`,
    `Month`,
    SUM(`Ps`) AS `Ps`,
    SUM(`Video`) AS `Video`,
    SUM(`Lessons`) AS `Lessons`,
    SUM(`DS`) AS `DS`
FROM (
    SELECT `act`.`Year`, `act`.`Month`, `Ps`, `Video`, `Lessons`, `DS`
    FROM `st`
    RIGHT JOIN `act` ON `st`.`Year` = `act`.`Year` AND `st`.`Month` = `act`.`Month`

    UNION ALL

    SELECT `st`.`Year`, `st`.`Month`, `Ps`, `Video`, `Lessons`, `DS`
    FROM `st`
    LEFT JOIN `act` ON `st`.`Year` = `act`.`Year` AND `st`.`Month` = `act`.`Month`
    WHERE `act`.`Year` IS NULL
) t
GROUP BY `Year`, `Month`
ORDER BY `Year` DESC, `Month` DESC;

Output:
| Year | Month | Ps | Video | Lessons | DS |
| --- | --- | --- | --- | --- | --- |
| 2023 | 11 | 45 | 22.43 | 233.4455 | 23 |
| 2023 | 10 | NULL | 24.566 | 232 | 34 |
| 2022 | 10 | NULL | 46 | 68 | 66 |
| 2013 | 10 | 65 | NULL | NULL | NULL |
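The two CTEs perform the aggregation. I have kept the WHERE clauses but commented them out.
The first query in the UNION is the current RIGHT JOIN, which retrieves all records from the right-hand table whether or not they have an associated row in the left-hand table. The second query then retrieves all rows from the left-hand table that have no associated row on the right.
Here is a db<>fiddle, with the invalid dates and duplicate PK values adjusted.
For MySQL 5.7, you have to repeat the subqueries in the main query, because CTEs are not available to you: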

SELECT
    `Year`,
    `Month`,
    SUM(`Ps`) AS `Ps`,
    SUM(`Video`) AS `Video`,
    SUM(`Lessons`) AS `Lessons`,
    SUM(`DS`) AS `DS`
FROM (
    SELECT `act`.`Year`, `act`.`Month`, `Ps`, `Video`, `Lessons`, `DS`
    FROM (
        SELECT
            YEAR(`P_date`) AS `Year`,
            MONTH(`P_date`) AS `Month`,
            SUM(`Points`) AS `Ps`
        FROM `student_p`
        -- WHERE `P_date` >= MAKEDATE(2023, 1) AND `P_date` < MAKEDATE(2024, 1)
        GROUP BY `Year`, `Month`
    ) `st`
    RIGHT JOIN (
        SELECT
            YEAR(`A_date`) AS `Year`,
            MONTH(`A_date`) AS `Month`,
            SUM(`VIDEO_SCORE`) AS `Video`,
            SUM(`EXERCISESCORE`) AS `Lessons`,
            SUM(`DS`) AS `DS`
        FROM student_act
        -- WHERE A_date >= MAKEDATE(2023, 1) AND A_date < MAKEDATE(2024, 1)
        GROUP BY `Year`, `Month`
    ) `act` ON `st`.`Year` = `act`.`Year` AND `st`.`Month` = `act`.`Month`

    UNION ALL

    SELECT `st`.`Year`, `st`.`Month`, `Ps`, `Video`, `Lessons`, `DS`
    FROM (
        SELECT
            YEAR(`P_date`) AS `Year`,
            MONTH(`P_date`) AS `Month`,
            SUM(`Points`) AS `Ps`
        FROM `student_p`
        -- WHERE `P_date` >= MAKEDATE(2023, 1) AND `P_date` < MAKEDATE(2024, 1)
        GROUP BY `Year`, `Month`
    ) `st`
    LEFT JOIN (
        SELECT
            YEAR(`A_date`) AS `Year`,
            MONTH(`A_date`) AS `Month`,
            SUM(`VIDEO_SCORE`) AS `Video`,
            SUM(`EXERCISESCORE`) AS `Lessons`,
            SUM(`DS`) AS `DS`
        FROM student_act
        -- WHERE A_date >= MAKEDATE(2023, 1) AND A_date < MAKEDATE(2024, 1)
        GROUP BY `Year`, `Month`
    ) `act` ON `st`.`Year` = `act`.`Year` AND `st`.`Month` = `act`.`Month`
    WHERE `act`.`Year` IS NULL
) t
GROUP BY `Year`, `Month`
ORDER BY `Year` DESC, `Month` DESC;

Output:
| Year | Month | Ps | Video | Lessons | DS |
| --- | --- | --- | --- | --- | --- |
| 2023 | 11 | 45 | 22.43 | 233.4455 | 23 |
| 2023 | 10 | NULL | 24.566 | 232 | 34 |
| 2022 | 10 | NULL | 46 | 68 | 66 |
| 2013 | 10 | 65 | NULL | NULL | NULL |
db<>fiddle
Is this what you are looking for?
The second query in the UNION can be replaced with:

SELECT
        YEAR(`P_date`) AS `Year`,
        MONTH(`P_date`) AS `Month`,
        SUM(`Points`) AS `Ps`,
        NULL AS `Video`,
        NULL AS `Lessons`,
        NULL AS `DS`
    FROM `student_p`
    WHERE NOT EXISTS (
        SELECT 1
        FROM student_act
        WHERE A_date BETWEEN student_p.P_date - INTERVAL (DAY(student_p.P_date) - 1) DAY
                         AND LAST_DAY(student_p.P_date)
    )
    -- AND `P_date` >= MAKEDATE(2023, 1) AND `P_date` < MAKEDATE(2024, 1)
    GROUP BY `Year`, `Month`

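You will have to experiment with these query variants on your real data to see which performs best. You should also add O. Jones's suggestion to the mix and see whether it improves the performance of the aggregation queries.
db<>fiddle
If you have performance problems with this query, please update your question to include the full query and the EXPLAIN output for both aggregation subqueries.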

1yjd4xko #2

For starters, you could try adding an index on the DATE field of the student_act table. An index helps the SQL engine find records by that field faster.
MySQL:

ALTER TABLE `student_act` ADD INDEX `date_index` (`DATE`);

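Indexes optimize the lookup process in the SQL engine. They are often very effective when the query matches records on MULTIPLE fields or, as in your example, when the matched field is non-numeric.
Also, try renaming the field to something other than DATE, since it is a reserved word in some SQL engines.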
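
A minimal sketch of such a rename, assuming a table that really has a column named `DATE` as in the queries above. The demo table `student_act_demo` and the new names `activity_date` and `activity_date_index` are hypothetical. The `CHANGE COLUMN` form is used because it is available in MySQL 5.7.

```sql
-- Hypothetical demo table with a column literally named DATE.
CREATE TABLE student_act_demo (
    ID INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    `DATE` DATE NOT NULL
);

-- CHANGE COLUMN takes the old name, the new name, and the full column definition.
ALTER TABLE student_act_demo
    CHANGE COLUMN `DATE` activity_date DATE NOT NULL;

-- Index the renamed column.
ALTER TABLE student_act_demo
    ADD INDEX activity_date_index (activity_date);
```

Any query, view, or application code that refers to the old column name has to be updated along with the rename.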

rkue9o1l #3

You can also try this:

SELECT
  YEAR(msa.DATE) AS Year,
  MONTH(msa.DATE) AS Month,
  SUM(msk.POINT) AS Ps,
  SUM(msa.VIDEO_SCORE) AS Video,
  SUM(msa.EXERCISESCORE) AS Lessons,
  SUM(msa.DS) AS DS
FROM student_p msk
RIGHT JOIN student_act msa ON msk.DATE = msa.DATE
WHERE msa.DATE between '2023-01-01 00:00:00' AND '2023-12-31 23:59:59'
GROUP BY YEAR(msa.DATE), MONTH(msa.DATE);

0mkxixxg #4

Instead of using the YEAR() and MONTH() functions, try grouping by LAST_DAY(). It returns the last day of the month containing any given DATE, DATETIME, or TIMESTAMP.
Like this:

SELECT
  LAST_DAY(msa.DATE) AS MonthEnding,
  SUM(msk.POINT) AS Ps,
  SUM(msa.VIDEO_SCORE) AS Video,
  SUM(msa.EXERCISESCORE) AS Lessons,
  SUM(msa.DS) AS DS
FROM student_p msk
RIGHT JOIN student_act msa ON msk.DATE = msa.DATE
WHERE msa.DATE >= MAKEDATE(2023, 1) AND msa.DATE < MAKEDATE(2024, 1)
GROUP BY LAST_DAY(msa.DATE);
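
To see why LAST_DAY() works as a per-month grouping key, here is a small standalone check. The aliases are only for illustration.

```sql
SELECT
    LAST_DAY('2023-10-01')          AS from_date,      -- 2023-10-31
    LAST_DAY('2023-10-31 23:59:59') AS from_datetime,  -- 2023-10-31
    LAST_DAY('2024-02-10')          AS leap_february;  -- 2024-02-29
```

Every value within the same calendar month maps to the same DATE, so one expression carries both the year and the month.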

Create a covering index on each of student_p(DATE, POINT) and student_act(DATE, VIDEO_SCORE, EXERCISESCORE, DS).
This form of query, together with those indexes, will help a lot.
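
Against the tables as posted in the question, where the columns are `P_date`, `Points`, and `A_date` rather than `DATE` and `POINT`, the covering indexes might look like the sketch below. The index names are hypothetical.

```sql
-- Assumption: the student_p and student_act tables from the question exist.
-- The date column comes first so range filters and grouping by date can use the index;
-- the remaining columns are included so the sums can be read from the index alone.
CREATE INDEX idx_student_p_cover
    ON student_p (P_date, Points);

CREATE INDEX idx_student_act_cover
    ON student_act (A_date, VIDEO_SCORE, EXERCISESCORE, DS);
```

If EXPLAIN shows "Using index" in the Extra column for the aggregation subqueries, the index is covering for that query.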
