mysql 计数按枚举分组,包括计数为0的可能枚举值

fnx2tebb  于 2023-01-25  发布在  Mysql
关注(0)|答案(6)|浏览(128)

我有一个项目表。其中一个字段是类别(用枚举表示)。有些类别没有项目。
所以我就这么做了:

select category, count(*) as total from items group by category;
+------------+-------+
| category   | total |
+------------+-------+
| one        |  6675 |
+------------+-------+

我想生成一个如下所示的表(其中two是另一个可能的枚举值):

+------------+-------+
| category   | total |
+------------+-------+
| one        |  6675 |
+------------+-------+
| two        |  0    |
+------------+-------+

我如何用mysql SQL查询来实现这一点?

8qgya5xd

8qgya5xd1#

枚举数据类型 * 通常 * 适用于可能的选项(值)不太多(首选〈= 10)的情况,并且您不会在将来添加新选项(至少不会经常添加)。因此,性别是枚举的一个很好的用例:(m, f, n)。在您的情况下,拥有一个包含所有可能类别的Master表(而不是对它们使用Enum)通常会更好。这样,从Master表执行LEFT JOIN就更容易了。
但是,正如你所问:
解决方案使用枚举类型生成表,并包含0项

    • 适用于所有MySQL/MariaDB版本**:

我们需要从INFORMATION_SCHEMA.COLUMNS中获取所有可能的枚举值列表:

SELECT
   SUBSTRING(COLUMN_TYPE, 6, CHAR_LENGTH(COLUMN_TYPE) - 6) AS enum_values
FROM
    information_schema.COLUMNS
WHERE
    TABLE_NAME = 'items'        -- your table name
AND
    COLUMN_NAME = 'category'    -- name of the column
AND 
    TABLE_SCHEMA = 'your_db'    -- name of the database (schema)

但是,这个查询将以逗号分隔的字符串形式提供所有枚举值,如下所示:
'one','two','three','four'
现在,我们需要将这个字符串转换成多个行。为了实现这一点,我们可以使用Sequence(Number series)表。您可以在数据库中定义一个永久表,存储从1到100的整数(您可能会发现这个表在许多其他情况下也很有用)(或者,另一种方法是使用Derived Table-检查这个来获得一个想法:https://stackoverflow.com/a/58052199/2469308)。

CREATE TABLE seq (n tinyint(3) UNSIGNED NOT NULL, PRIMARY KEY(n));
INSERT INTO seq (n) VALUES (1), (2), ...... , (99), (100);

现在,我们将根据逗号的位置在"枚举值字符串"和seq表之间执行一个JOIN,以便将枚举值提取到不同的行中。(逗号)要提取枚举值,我们将使用','(以避免值字符串中可能有逗号的情况)。字符串操作使用Substring_Index()Trim()``Char_Length()等函数可以用来提取枚举值,你可以查看answer来了解这个技术:

CREATE TABLE items 
(id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
 category ENUM('one','two','three','four'), 
 item_id INT UNSIGNED) ENGINE=InnoDB;

INSERT INTO items (category, item_id) 
VALUES ('one', 1), 
       ('two', 2), 
       ('one', 2), 
       ('one', 3);

CREATE TABLE seq (n tinyint(3) UNSIGNED NOT NULL, 
                  PRIMARY KEY(n));
INSERT INTO seq (n) VALUES (1),(2),(3),(4),(5);
    • 查询#1**
SELECT Trim(BOTH '\'' FROM Substring_index(Substring_index(e.enum_values,
                                           '\',\'',
                                                  seq.n),
                                  '\',\'', -1)) AS cat
FROM   (SELECT Substring(column_type, 6, Char_length(column_type) - 6) AS
               enum_values
        FROM   information_schema.columns
        WHERE  table_name = 'items'
               AND column_name = 'category'
               AND table_schema = 'test') AS e
       JOIN seq
         ON ( Char_length(e.enum_values) - Char_length(REPLACE(e.enum_values,
                                                       '\',\'',
                                                       ''))
            ) / 3 >= seq.n - 1

| cat   |
| ----- |
| one   |
| two   |
| three |
| four  |

现在,困难的部分已经完成,我们需要做的就是从这个子查询(包含所有类别枚举值)到items表执行一个LEFT JOIN,以获取每个类别的Count。
最终查询如下所示(View on DB Fiddle):

SELECT all_cat.cat                   AS category,
       Count(i.item_id) AS total
FROM   (SELECT Trim(BOTH '\'' FROM Substring_index(
                                   Substring_index(e.enum_values,
                                          '\',\'',
                                                  seq.n),
                                                  '\',\'', -1)) AS cat
        FROM   (SELECT Substring(column_type, 6, Char_length(column_type) - 6)
                       AS
                       enum_values
                FROM   information_schema.columns
                WHERE  table_name = 'items'
                       AND column_name = 'category'
                       AND table_schema = 'test') AS e
               JOIN seq
                 ON ( Char_length(e.enum_values) - Char_length(
                                                   REPLACE(e.enum_values,
                                                   '\',\'',
                                                   ''))
                    ) / 3 >= seq.n - 1) AS all_cat
       LEFT JOIN items AS i
              ON i.category = all_cat.cat
GROUP  BY all_cat.cat
ORDER  BY total DESC;
    • 结果**
| category | total |
| -------- | ----- |
| one      | 3     |
| two      | 1     |
| three    | 0     |
| four     | 0     |
jw5wzhpr

jw5wzhpr2#

下面是使用MySQL 8.0和JSON_TABLE()的一些乐趣:

select c.category, count(i.category) as total
from information_schema.COLUMNS s
join json_table(
  replace(replace(replace(trim('enum' from s.COLUMN_TYPE),'(','['),')',']'),'''','"'),
  '$[*]' columns (category varchar(50) path '$')
) c
left join items i on i.category = c.category
where s.TABLE_SCHEMA = 'test' -- replace with your db/schema name
  and s.TABLE_NAME   = 'items'
  and s.COLUMN_NAME  = 'category'
group by c.category

它将ENUM类型定义从information_schema转换为JSON数组,然后由JSON_TABLE()转换为一个表,然后可以将该表用于LEFT JOIN。
参见db-fiddle上的演示
注意:类别中不应包含任何()[]'"字符。
但是说真的--只需要创建categories表就行了。这样做的原因还有很多。例如,您可能希望呈现一个包含所有可能类别的下拉菜单。

select category from categories
sczxawaw

sczxawaw3#

我认为将枚举编码到脚本中基本上是一种糟糕的做法,因此,创建一个包含枚举(及其相对键)的表,然后就可以简单地对左连接查询进行分组...

SELECT 
    cat.enum_name, 
    COUNT(data.id) AS total 
FROM
    category_table cat
    LEFT JOIN
    data_table data
        ON  cat.cate_id = data.cat_id 
 GROUP BY 
    cat.enum_name
uinbv5nw

uinbv5nw4#

使用in-select子查询

select  cat.categoryname
   (
      select count(*)  -- count total
      from items as i
      where i.category = cat.category  -- connect

   ) as totalcount
from cat
order by cat.categoryname
jhdbpxl9

jhdbpxl95#

您可以创建一个不同类别的虚构数据集,并与原始表进行左连接,如下所示。

SELECT A.category, count(*) total FROM
 (SELECT 'one' as Category 
 UNION ALL
 SELECT 'two' as Category) A
 LEFT JOIN items B
 ON A.Category=B.Category
 GROUP BY B.Category;

如果您希望动态地获得所有类别的列表,则将它们保存在另一个表中(例如All_category_table),然后执行如下所示的连接:

SELECT A.category, count(*) total FROM
 (SELECT Category FROM All_category_table) A
 LEFT JOIN items B
 ON A.Category=B.Category
 GROUP BY B.Category;
j2datikz

j2datikz6#

此答案适用于没有其他表保存可能的类别值的情况。

假设您有一个名为real_table的表,其中包含一个 * not null & value constrained * 列category,在该列中,理论上可以遇到5个不同的值:'CATEGORY_0', 'CATEGORY_1', 'CATEGORY_2', 'CATEGORY_3', 'CATEGORY_4'

CREATE TABLE real_table
(
    id       VARCHAR(255) NOT NULL
        PRIMARY KEY,
    category VARCHAR(255) NOT NULL
        CONSTRAINT category_in CHECK (
                category in ('CATEGORY_0',
                             'CATEGORY_1',
                             'CATEGORY_2',
                             'CATEGORY_3',
                             'CATEGORY_4')
            )
);

但表中的实际数据集不包含任何值为'CATEGORY_0'的行。因此,当您运行以下查询时:

SELECT real_table.category AS category, COUNT(*) AS cnt
FROM real_table
GROUP BY real_table.category;

你会看到,你得到这样的结果:
| 范畴|碳纳米管|
| - ------|- ------|
| 类别_1|一百五十|
| 类别_2|二十个|
| 类别_3|十二|
| 类别_4|1个|

  • 省略了'CATEGORY_0',不太好 *

由于您的类别没有其他表支持,因此您必须创建一个可能类别的人工数据集,如下所示:

SELECT 'CATEGORY_0' AS category_entry
UNION ALL
SELECT 'CATEGORY_1' AS category_entry
UNION ALL
SELECT 'CATEGORY_2' AS category_entry
UNION ALL
SELECT 'CATEGORY_3' AS category_entry
UNION ALL
SELECT 'CATEGORY_4' AS category_entry;

您可以在原始查询中将其用作表,以便对以下各项进行右联接:

SELECT all_categories.category_entry AS category,
       COUNT(real_table.id)          AS cnt -- important to count some non-null value, such as PK of the real_table
FROM real_table
         RIGHT JOIN
     (SELECT 'CATEGORY_0' AS category_entry -- not present in any row in table 'all_categories'
      UNION ALL
      SELECT 'CATEGORY_1' AS category_entry
      UNION ALL
      SELECT 'CATEGORY_2' AS category_entry
      UNION ALL
      SELECT 'CATEGORY_3' AS category_entry
      UNION ALL
      SELECT 'CATEGORY_4' AS category_entry) all_categories
     ON real_table.category = all_categories.category_entry
GROUP BY all_categories.category_entry;

现在,当您运行查询时,应该会得到所需的输出:
| 范畴|碳纳米管|
| - ------|- ------|
| * * 类别_0**|* * 0**|
| 类别_1|一百五十|
| 类别_2|二十个|
| 类别_3|十二|
| 类别_4|1个|

  • 'CATEGORY_0'现在包含零美分。不错。*

现在让我们假设category列是not*not null * 约束的,并且还可能包括一些其他意外的category值(例如'CATEGORY_66'):

CREATE TABLE real_table
(
    id       VARCHAR(255) NOT NULL
        PRIMARY KEY,
    category VARCHAR(255) -- nullable and no constraint for valid values
);

我们希望在结果集中也包含这些 * null * 和 * unexpected * category计数,然后我们必须以不同的方式准备可能类别的人工数据集:

SELECT DISTINCT all_categories.category_entry
FROM (SELECT 'CATEGORY_0' AS category_entry -- not present in any row in table 'all_categories'
      UNION ALL
      SELECT 'CATEGORY_1' AS category_entry
      UNION ALL
      SELECT 'CATEGORY_2' AS category_entry
      UNION ALL
      SELECT 'CATEGORY_3' AS category_entry
      UNION ALL
      SELECT 'CATEGORY_4' AS category_entry
      UNION ALL
      SELECT DISTINCT category
      FROM real_table AS category_entry) all_categories;

并像以前一样使用它:

SELECT distinct_categories.category_entry AS category,
       COUNT(real_table.id)               AS cnt -- important to count some non-null value, such as PK of the real_table
FROM real_table
         RIGHT JOIN
     (SELECT DISTINCT all_categories.category_entry
      FROM (SELECT 'CATEGORY_0' AS category_entry -- not present in any row in table 'all_categories'
            UNION ALL
            SELECT 'CATEGORY_1' AS category_entry
            UNION ALL
            SELECT 'CATEGORY_2' AS category_entry
            UNION ALL
            SELECT 'CATEGORY_3' AS category_entry
            UNION ALL
            SELECT 'CATEGORY_4' AS category_entry
            UNION ALL
            SELECT DISTINCT category
            FROM real_table AS category_entry) all_categories) distinct_categories
     ON real_table.category = distinct_categories.category_entry
GROUP BY distinct_categories.category_entry;

现在,当您运行查询时,输出还应该包括其他类别和 * null * 类别的计数
| 范畴|碳纳米管|
| - ------|- ------|
| 类别_0|无|
| 类别_1|一百五十|
| 类别_2|二十个|
| 类别_3|十二|
| 类别_4|1个|
| * * 类别_66**|* * 十三**|
| | * * 10个**|

  • 结果集中现在包括 * unexpected * 'CATEGORY_66'(包含13个条目)和 * null * category(包含10个条目)*

我不能保证提供的查询的性能-更有经验的人可能会权衡这一点?

相关问题