SQL Server Counting records based on conditions over potential null values

h6my8fg2  于 2023-06-04  发布在  其他
关注(0)|答案(1)|浏览(151)

I have this tables:

CREATE TABLE customer_orders (
  "order_id" INTEGER,
  "customer_id" INTEGER,
  "pizza_id" INTEGER,
  "exclusions" VARCHAR(4),
  "extras" VARCHAR(4),
  "order_time" TIMESTAMP
);
INSERT INTO customer_orders
  ("order_id", "customer_id", "pizza_id", "exclusions", "extras", "order_time")
VALUES
  ('1', '101', '1', '', '', '2020-01-01 18:05:02'),
  ('2', '101', '1', '', '', '2020-01-01 19:00:52'),
  ('3', '102', '1', '', '', '2020-01-02 23:51:23'),
  ('3', '102', '2', '', NULL, '2020-01-02 23:51:23'),
  ('4', '103', '1', '4', '', '2020-01-04 13:23:46'),
  ('4', '103', '1', '4', '', '2020-01-04 13:23:46'),
  ('4', '103', '2', '4', '', '2020-01-04 13:23:46'),
  ('5', '104', '1', NULL, '1', '2020-01-08 21:00:29'),
  ('6', '101', '2', NULL, NULL, '2020-01-08 21:03:13'),
  ('7', '105', '2', NULL, '1', '2020-01-08 21:20:29'),
  ('8', '102', '1', NULL, NULL, '2020-01-09 23:54:33'),
  ('9', '103', '1', '4', '1, 5', '2020-01-10 11:22:59'),
  ('10', '104', '1', NULL, NULL, '2020-01-11 18:34:49'),
  ('10', '104', '1', '2, 6', '1, 4', '2020-01-11 18:34:49');

CREATE TABLE runner_orders (
  "order_id" INTEGER,
  "runner_id" INTEGER,
  "pickup_time" VARCHAR(19),
  "distance" DECIMAL(5,2) NULL,
  "duration" VARCHAR(10),
  "cancellation" VARCHAR(23)
);
INSERT INTO runner_orders
  ("order_id", "runner_id", "pickup_time", "distance", "duration", "cancellation")
VALUES
  ('1', '1', '2020-01-01 18:15:34', '20', '32', ''),
  ('2', '1', '2020-01-01 19:10:54', '20', '27', ''),
  ('3', '1', '2020-01-03 00:12:37', '13.4', '20', NULL),
  ('4', '2', '2020-01-04 13:53:03', '23.4', '40', NULL),
  ('5', '3', '2020-01-08 21:10:57', '10', '15', NULL),
  ('6', '3', NULL, NULL, NULL, 'Restaurant Cancellation'),
  ('7', '2', '2020-01-08 21:30:45', '25', '25mins', NULL),
  ('8', '2', '2020-01-10 00:15:02', '23.4', '15', NULL),
  ('9', '2', NULL, NULL, NULL, 'Customer Cancellation'),
  ('10', '1', '2020-01-11 18:50:20', '10', '10', NULL);

And I they gave me the question: For each customer, how many delivered pizzas had at least 1 change and how many had no changes?

Now, I tried to do this:

WITH cte_1
AS (
    SELECT co.customer_id
        , co.order_id
        , co.exclusions
        , isnull(co.exclusions,'') AS exc
        , isnull(co.extras,'') AS ext
    FROM customer_orders as co
    INNER JOIN runner_orders as ru 
        ON co.order_id = ru.order_id
    WHERE NOT ru.cancellation IN ('Restaurant Cancellation', 'Customer Cancellation')    
)    
SELECT customer_id
    , COUNT(order_id) AS no_changes 
FROM cte_1
WHERE exc = '' AND ext = ''
GROUP BY customer_id

As you can see this so long but at least works and the numbers are correct. But then when I try to do the same to get the result for pizzas with changes like this:

WITH cte_1
AS (
    SELECT co.customer_id
        , co.order_id
        , co.exclusions
        , isnull(co.exclusions,'') AS exc
        , isnull(co.extras,'') AS ext
    FROM customer_orders as co
    INNER JOIN runner_orders as ru 
        ON co.order_id = ru.order_id
    WHERE NOT ru.cancellation IN ('Restaurant Cancellation', 'Customer Cancellation')    
)    
SELECT customer_id
    , COUNT(order_id) AS with_changes 
FROM cte_1
WHERE not exc = '' 
GROUP BY customer_id
UNION
SELECT customer_id
    , COUNT(order_id) AS with_changes 
FROM cte_1
WHERE not ext = '' 
GROUP BY customer_id

It doesn't work, and I'm not sure if it's because of the Nulls and blanks or the approach in general. Please if there is a better and shorter way to do this.

EXPECTED RESULT would be something like:
| customer_id | orders_with_changes | orders_with_NO_changes |
| ------------ | ------------ | ------------ |
| 101 | --- | 2 |
| 102 | --- | 3 |
| 103 | 3 | --- |
| 104 | 2 | 1 |
| 105 | 1 | --- |

41ik7eoe

41ik7eoe1#

You can try approaching this problem by:

  • filtering on the runners table with the NOT EXISTS operator, to check and discard the orders that were subject to alteration by third-party, before any aggregation is carried out
  • applying conditional aggregation with COUNT + CASE expression, by counting records that satisfy conditions (orders without changes are the ones that have both extras and exclusions null, while order with changes are identified by the opposite condition)
SELECT customer_id, 
       COUNT(CASE WHEN NOT COALESCE(extras, '') = '' 
                    OR NOT COALESCE(exclusions, '') = '' THEN 1 END) AS num_with_changes,
       COUNT(CASE WHEN COALESCE(extras, '') = ''
                   AND COALESCE(exclusions, '') = '' THEN 1 END) AS num_with_no_changes
FROM customer_orders co
WHERE NOT EXISTS(SELECT 1 
                 FROM runner_orders ro 
                 WHERE co.order_id = ro.order_id 
                   AND NOT COALESCE(cancellation, '') = '')
GROUP BY customer_id

Output:

customer_idnum_with_changesnum_with_no_changes
10102
10203
10330
10421
10510

Check the demo here .

相关问题