mysql 对GROUP_CONCAT列进行计数以检查重复项

mtb9vblg  于 2023-04-19  发布在  Mysql
关注(0)|答案(3)|浏览(184)

我有以下查询:

SELECT
    cpe.entity_id AS configurable_product_id,
    cpe.sku AS configurable_product_sku,
    GROUP_CONCAT(DISTINCT cpsa.attribute_id ORDER BY cpsa.attribute_id SEPARATOR ',') AS configurable_super_attribute_ids,
    cp.entity_id AS simple_product_id,
    cp.sku AS simple_product_sku,
    GROUP_CONCAT(cpei.attribute_id, '_', cpei.value ORDER BY cpei.attribute_id SEPARATOR ',') AS simple_product_super_attribute_values
FROM
    catalog_product_entity AS cpe
INNER JOIN catalog_product_super_link AS cpsl ON cpsl.parent_id = cpe.entity_id
INNER JOIN catalog_product_entity AS cp ON cp.entity_id = cpsl.product_id
INNER JOIN catalog_product_super_attribute AS cpsa ON cpsa.product_id = cpe.entity_id
INNER JOIN catalog_product_entity_int AS cpei ON cpei.attribute_id = cpsa.attribute_id AND cpei.`entity_id` = cp.`entity_id`
GROUP BY cpe.entity_id, cp.entity_id
LIMIT 100;

结果是这样的:

configurable_product_id configurable_product_sku    configurable_super_attribute_ids    simple_product_id   simple_product_sku  simple_product_super_attribute_values
90086   Iphone  93,165  119 093 93_5730,165_7070
90086   Iphone  93,165  124 104 93_5730,165_7071
90086   Iphone  93,165  128 114 93_5707,165_7072
90086   Iphone  93,165  169 156 93_5727,165_7072
90086   Iphone  93,165  181 163 93_5730,165_7070
90086   Iphone  93,165  186 194 93_5727,165_7071
90086   Iphone  93,165  146 023 93_5730,165_7071

我希望查询只显示simple_product_super_attributes_values列中重复的记录。
因此,在这种情况下,丢弃第3、第4和第6行,并保留其余部分:

configurable_product_id configurable_product_sku    configurable_super_attribute_ids    simple_product_id   simple_product_sku  simple_product_super_attribute_values
90086   Iphone  93,165  119 093 93_5730,165_7070
90086   Iphone  93,165  124 104 93_5730,165_7071
90086   Iphone  93,165  181 163 93_5730,165_7070
90086   Iphone  93,165  146 023 93_5730,165_7071

如何修改我的查询以获得我的结果?我尝试对group_concat进行计数,但这是不允许的。
我尝试通过Having语句为该列添加一个计数:

HAVING COUNT(*) > 1 
AND simple_product_super_attribute_values IN (
    SELECT simple_product_super_attribute_values
    FROM (
        SELECT
            cpe.entity_id AS configurable_product_id,
            GROUP_CONCAT(cpei.attribute_id, '_', cpei.value ORDER BY cpei.attribute_id SEPARATOR ',') AS simple_product_super_attribute_values
        FROM
            catalog_product_entity AS cp
        INNER JOIN catalog_product_super_link AS cpsl ON cpsl.product_id = cp.entity_id
        INNER JOIN catalog_product_entity AS cpe ON cpe.entity_id = cpsl.parent_id
        INNER JOIN catalog_product_super_attribute AS cpsa ON cpsa.product_id = cpe.entity_id
        INNER JOIN catalog_product_entity_int AS cpei ON cpei.attribute_id = cpsa.attribute_id AND cpei.`entity_id` = cp.`entity_id`
        GROUP BY cp.entity_id, cpe.entity_id
        HAVING COUNT(*) > 1
    ) AS subquery
    WHERE subquery.configurable_product_id = cpe.entity_id
)

但这说不通。

ftf50wuq

ftf50wuq1#

HAVING子句用于执行您需要的操作:

HAVING COUNT(DISTINCT whateverid) > 1

这是在HAVING子句中可以做的几件很酷的事情之一。

mo49yndu

mo49yndu2#

您可以在CTE中使用查询,然后在第二个CTE中使用count() over simple_product_super_attribute_values的窗口函数来检查重复项:

WITH cte as (
  SELECT
    cpe.entity_id AS configurable_product_id,
    cpe.sku AS configurable_product_sku,
    GROUP_CONCAT(DISTINCT cpsa.attribute_id ORDER BY cpsa.attribute_id SEPARATOR ',') AS configurable_super_attribute_ids,
    cp.entity_id AS simple_product_id,
    cp.sku AS simple_product_sku,
    GROUP_CONCAT(cpei.attribute_id, '_', cpei.value ORDER BY cpei.attribute_id SEPARATOR ',') AS simple_product_super_attribute_values
  FROM
    catalog_product_entity AS cpe
  INNER JOIN catalog_product_super_link AS cpsl ON cpsl.parent_id = cpe.entity_id
  INNER JOIN catalog_product_entity AS cp ON cp.entity_id = cpsl.product_id
  INNER JOIN catalog_product_super_attribute AS cpsa ON cpsa.product_id = cpe.entity_id
  INNER JOIN catalog_product_entity_int AS cpei ON cpei.attribute_id = cpsa.attribute_id AND cpei.`entity_id` = cp.`entity_id`
  GROUP BY cpe.entity_id, cp.entity_id
  LIMIT 100
),
cte2 as (
  select configurable_product_id,
         configurable_product_sku,
         configurable_super_attribute_ids,
         simple_product_super_attribute_values,
         count(1) over( partition by simple_product_super_attribute_values) as count_dups
         from mytable
)
select *
from cte2
where count_dups > 1
kpbwa7wx

kpbwa7wx3#

HAVING是一个类似于WHERE的子句,但是出现在WHERE和group by之后。但是测试count(*)没有帮助。
你想使用计数窗口函数,通过simple_product_super_attribute_values进行分区。假设之前,你会做:

select
    configurable_product_id,
    configurable_product_sku,
    configurable_super_attribute_ids,
    simple_product_id,
    simple_product_sku,
    simple_product_super_attribute_values
from (
    SELECT
        cpe.entity_id AS configurable_product_id,
        cpe.sku AS configurable_product_sku,
        GROUP_CONCAT(DISTINCT cpsa.attribute_id ORDER BY cpsa.attribute_id SEPARATOR ',') AS configurable_super_attribute_ids,
        cp.entity_id AS simple_product_id,
        cp.sku AS simple_product_sku,
        GROUP_CONCAT(cpei.attribute_id, '_', cpei.value ORDER BY cpei.attribute_id SEPARATOR ',') AS simple_product_super_attribute_values,
        count(*) over (partition by GROUP_CONCAT(cpei.attribute_id, '_', cpei.value ORDER BY cpei.attribute_id SEPARATOR ',')) > 1 duplicate_simple_product_super_attribute_values
    FROM
        catalog_product_entity AS cpe
    INNER JOIN catalog_product_super_link AS cpsl ON cpsl.parent_id = cpe.entity_id
    INNER JOIN catalog_product_entity AS cp ON cp.entity_id = cpsl.product_id
    INNER JOIN catalog_product_super_attribute AS cpsa ON cpsa.product_id = cpe.entity_id
    INNER JOIN catalog_product_entity_int AS cpei ON cpei.attribute_id = cpsa.attribute_id AND cpei.`entity_id` = cp.`entity_id`
    GROUP BY cpe.entity_id, cp.entity_id
) as products
where duplicate_simple_product_super_attributes_values
limit 100

相关问题