SQL Server Count distinct with two conditions

bjg7j2ky  于 2023-04-04  发布在  其他
关注(0)|答案(2)|浏览(128)

I want to count how many distinct ids have values of 'A' AND 'B' in column Code.
| ID | Code |
| ------------ | ------------ |
| 1 | A |
| 1 | B |
| 1 | C |
| 1 | D |
| 2 | A |
| 2 | B |
| 2 | C |

Data as fiddle

I tried using

SELECT COUNT(DISTINCT ID)
FROM tab
WHERE CODE = 'A' AND CODE = 'B';

and

SELECT COUNT(DISTINCT ID) 
FROM tab AS t1 
WHERE EXISTS (
        SELECT 1
        FROM tab 
        WHERE ID = t1.ID AND CODE = 'A' AND CODE = 'B'
        );

Both methods count 0

Why is that?

Thank you

6rvt4ljy

6rvt4ljy1#

If you want count for different codes:

SELECT COUNT(DISTINCT ID)
FROM tab
WHERE CODE IN ('A', 'B')
GROUP BY CODE ;

If you want common count of distinct ID paired with any of letters A or B use this:

SELECT COUNT(DISTINCT ID)
FROM tab
WHERE CODE IN ('A', 'B');
cwtwac6a

cwtwac6a2#

I want to count how many distinct id s have values 'A' and 'B' in column code .

I would recommend group by and having ; this gives you the list of ids that have both codes (assuming no duplicate id/code).

select id
from tab
where code in ('A', 'B')  -- rows that have any of the two codes
group by id
having count(*) = 2       -- both codes are present in the group

Now we can just count how many rows the above query returns to get the output you wanted:

select count(*) as cnt
from (
    select id
    from tab
    where code in ('A', 'B')
    group by id
    having count(*) = 2
) t

I would expect this approachto perform better than the solution using intersect , because it scans the table only once.

Note that it is also possible to use exists , as you intended in yuour second attempt (although that's one more table scan); we would just need to fix the filtering:

select count(*) 
from tab as t1 
where t1.code = 'A' and exists (
    select 1
    from tab t2
    where t2.id = t1.id and t2.code = 'B'
)

This selects rows of code A and ensure that the given id has another row with code B.

相关问题