MYSQL -在总体水平上,每个产品的第二封和第三封电子邮件之间的平均时间是多少?

dgsult0t  于 2023-11-16  发布在  Mysql
关注(0)|答案(2)|浏览(107)

我们有4个表,以研究广告系列的电子邮件性能:

  • 表A包含活动名称信息
  • 表B包含活动ID交付性能(活动发送对象)
  • 表C包含活动ID打开性能(谁打开了电子邮件)
  • 表D包含了广告系列的点击表现(谁点击了电子邮件中的任何元素)

表结构的示例(注意:假设这是快照而不是完整的数据集):

-- Table A: Campaign Information
CREATE TABLE TableA (
campaign_name VARCHAR(255),
campaign_id INT
);

-- Table B: Email Delivery Information
CREATE TABLE TableB (
campaign_id INT,
delivery_date DATE,
user_id VARCHAR(50)
);

-- Table C: Email Open Information
CREATE TABLE TableC (
campaign_id INT,
open_date DATE,
user_id VARCHAR(50)
);

-- Table D: Email Click Information
CREATE TABLE TableD (
campaign_id INT,
click_date DATE,
user_id VARCHAR(50)
);

-- Inserts for TableA (Campaign Information)
INSERT INTO TableA (campaign_name, campaign_id) VALUES
('Instacash Promo 1', 112233),
('Instacash Promo 2', 112244),
('RoarMoney Balance 5', 112259);

-- Inserts for TableB (Email Delivery Information)
INSERT INTO TableB (campaign_id, delivery_date, user_id) VALUES
(112233, '2021-01-01', 'a'),
(112233, '2021-01-01', 'b'),
(112233, '2021-01-01', 'c'),
(112244, '2021-01-05', 'd'),
(112244, '2021-01-05', 'e');

-- Inserts for TableC (Email Opened Information)
INSERT INTO TableC (campaign_id, open_date, user_id) VALUES
(112233, '2021-01-03', 'a'),
(112233, '2021-01-05', 'b'),
(112244, '2021-01-07', 'd'),
(112244, '2021-01-10', 'e');

-- Inserts for TableD (Email Link Clicked Information)
INSERT INTO TableD (campaign_id, click_date, user_id) VALUES
(112233, '2021-01-03', 'a'),
(112244, '2021-01-11', 'e');

字符串
我需要写一个SQL查询之间的平均时间第二和第三电子邮件为每个单独的产品在整体水平?输出应包括:
| 活动类别|收到第二至第三封电子邮件的平均时间|
| --|--|
| | |
| | |
campaign_name应该被归类为Instacash、RoarMoney的campaign_category,或者任何不在Instacash或RoarMoney下的活动都可以被称为“其他”。
这是我的代码,但我认为下面的代码是错误的:

-- Calculate DATEDIFF between rn = 3 and rn = 2
WITH temp as (
    SELECT 
        CASE 
            WHEN campaign_name REGEXP '^Instacash' THEN 'Instacash'
            WHEN campaign_name REGEXP '^RoarMoney' THEN 'RoarMoney'
            ELSE 'Others'
        END AS campaign_category,
        a.campaign_id,
        a.campaign_name,
        b.delivery_date
    FROM TableA a
    LEFT JOIN TableB b ON a.campaign_id = b.campaign_id
),
temp_1 as (
    SELECT *, ROW_NUMBER() OVER (PARTITION BY campaign_name ORDER BY delivery_date) as rn
    FROM temp
),
temp_2 as (
    SELECT *
    FROM temp_1
    WHERE rn in (2, 3)
),
temp_3 as (
    SELECT
        t1.campaign_category,
        t1.campaign_id,
        t1.campaign_name,
        DATEDIFF(day, t2.delivery_date, t3.delivery_date) as date_difference
    FROM temp_2 t1
    JOIN temp_2 t2 ON t1.campaign_id = t2.campaign_id AND t2.rn = 2
    JOIN temp_2 t3 ON t1.campaign_id = t3.campaign_id AND t3.rn = 3
)

-- Select the calculated result
SELECT * FROM temp_3;


有人能帮我吗?

inkz8wg9

inkz8wg91#

您的REGEXP for campaign_name似乎有点过多,可以用LIKE代替。如果您想要每个广告系列类别的第2和第3封电子邮件之间的平均天数,那么您可以使用这样的东西:

WITH temp as (
    SELECT 
        CASE 
            WHEN a.campaign_name LIKE 'Instacash%' THEN 'Instacash'
            WHEN a.campaign_name LIKE 'RoarMoney%' THEN 'RoarMoney'
            ELSE 'Others'
        END AS campaign_category,
        a.campaign_id,
        a.campaign_name,
        b.delivery_date,
        ROW_NUMBER() OVER (PARTITION BY a.campaign_id ORDER BY b.delivery_date) as rn
    FROM TableA a
    INNER JOIN TableB b ON a.campaign_id = b.campaign_id
)
SELECT
    t1.campaign_category,
    AVG(DATEDIFF(t2.delivery_date, t1.delivery_date)) as date_difference
FROM temp t1
JOIN temp t2 ON t1.campaign_id = t2.campaign_id AND t2.rn = 3
WHERE t1.rn = 2
GROUP BY t1.campaign_category;

字符串
注意事项:我在CTE中将分区更改为campaign_id而不是campaign_name,并将GROUP BY campaign_category添加到最终选择中以获得每个campaign_category的平均值。
这里是db<>fiddle

lg40wkob

lg40wkob2#

WITH temp as (
SELECT 
    CASE 
        WHEN a.campaign_name LIKE 'Instacash%' THEN 'Instacash'
        WHEN a.campaign_name LIKE 'RoarMoney%' THEN 'RoarMoney'
        ELSE 'Others'
    END AS campaign_category,
    a.campaign_id,
    a.campaign_name,
    b.delivery_date
FROM TableA a
INNER JOIN TableB b ON a.campaign_id = b.campaign_id),
temp_1 as (SELECT *, 
ROW_NUMBER() OVER (PARTITION BY campaign_category ORDER BY delivery_date) as rn FROM temp)
SELECT t1.campaign_category,
AVG(datediff(t2.delivery_date,t1.delivery_date)) as Average_time_taken_to_2nd_to_3rd_email 
FROM temp_1 t1 
JOIN temp_1 t2 ON t1.campaign_id = t2.campaign_id 
AND t2.rn = 3 WHERE t1.rn = 2 
GROUP BY t1.campaign_category;

字符串
我认为这段代码更接近问题的要求。谢谢@user1191247。

相关问题