postgresql SQL连接查询将每个结果返回三次

ftf50wuq  于 2023-10-18  发布在  PostgreSQL
关注(0)|答案(2)|浏览(123)

我有两张table,TX和最早的。我尝试根据“UserId”和“AcceptorCode”加入它们,并且只返回tx中的“CreatedOn”大于或等于earliest中的“EarliestPayDate”的记录。

WITH tx AS
(
    SELECT 
        "AcceptorCode", 
        "UserId", 
        "CreatedOn",
        "Amount",
        "WithdrawType",
        "DepositType",
        "Balance"
    FROM 
        "public"."Transactions"
    WHERE 
       -- "AcceptorCode" IS NOT NULL AND 
       -- "AcceptorCode" != ''
    "CreatedOn" BETWEEN {{START_DATE}} AND {{END_DATE}}
), 
earliest AS
(
    SELECT 
        "AcceptorCode", 
        "UserId", 
        MIN(CAST("CreatedOn" AS date)) AS "EarliestPayDate"
    FROM 
        tx
    WHERE 
        "AcceptorCode" IN ('000000014578478', '000000900030378', '000000005227597')
    GROUP BY 
        "AcceptorCode", 
        "UserId"
), 
wages AS 
(
    SELECT 
        e."UserId", t.*
    FROM 
        tx t
    JOIN 
        earliest e ON t."UserId" = e."UserId" AND (t."AcceptorCode" IS NULL OR t."AcceptorCode" = '')
    WHERE 
        t."CreatedOn" >= e."EarliestPayDate"
    ORDER BY 
        e."UserId" DESC   
)
SELECT *
FROM wages

问题是每个结果返回三次。

如何修改查询以使每个结果只返回一次?

编辑

使用SELECT DISTINCT "UserId"可以解决上面查询的问题。但是,当我使用聚合和GROUP BY时,它会将数量相加三次。

WITH tx AS
(
    ...
), 
earliest AS
(
    ...
), 
wages AS 
(
    SELECT 
        e."UserId", 
        SUM(CASE WHEN t."WithdrawType" IN (2, 3) THEN t."Amount" ELSE 0 END) AS "TotalWithdrawalWage",
        SUM(CASE WHEN t."DepositType" = 4 THEN t."Amount" ELSE 0 END) AS "TotalDepositWage"
    FROM 
        tx t
    JOIN 
        earliest e ON t."UserId" = e."UserId"
    WHERE 
        t."CreatedOn" >= e."EarliestPayDate"
    GROUP BY 
        e."UserId"
    ORDER BY 
        e."UserId" DESC   
)
SELECT *, "TotalWithdrawalWage" - "TotalDepositWage" AS "Difference"
FROM wages
WHERE "TotalWithdrawalWage" > 0 OR "TotalDepositWage" > 0
ORDER BY "UserId" DESC

myzjeezk

myzjeezk1#

每个用户都有3个副本的原因是因为在earliest AS(...),定义中,您按AcceptorCode + UserId分组。所以你将是每个AcceptorCode + UserId组合中最小的CreatedOn。当它被传递到周围的SELECT并再次与tx连接时,使用t."UserId" = e."UserId",每个t."UserId"将匹配3个e."UserId"值中的每一个,因为e."AcceptorCode"上没有连接条件。我不知道您想要实现什么,因为您正在寻找t."AcceptorCode" IS NULL OR t."AcceptorCode" = ''--似乎没有明确的意图要连接te数据集
如果您只是想获取每个用户的EarliestPayDate,那么您可以将earliest AS(...)定义更改为:

earliest AS(
    SELECT 
        "UserId", 
        MIN(CAST("CreatedOn" AS date)) AS "EarliestPayDate"
    FROM 
        tx
    WHERE 
    "AcceptorCode" IN (
    '000000014578478','000000900030378','000000005227597' )
    GROUP BY 
        "UserId"
),

并将AND (t."AcceptorCode" IS NULL OR t."AcceptorCode" = '')放在WHERE之后

elcex8rz

elcex8rz2#

在子查询earliest中,每个userId都有3条记录-每个AcceptorCode都有一条记录,因为你是按这两个字段分组的。
我想你只需要按userId分组:

earliest AS(
    SELECT 
        "UserId", 
        MIN(CAST("CreatedOn" AS date)) AS "EarliestPayDate"
    FROM 
        tx
    WHERE 
    "AcceptorCode" IN (
    '000000014578478','000000900030378','000000005227597' )
    GROUP BY 
        "UserId"
)

相关问题