sql:在sql中相交(连接概率)

iq3niunx  于 2021-08-01  发布在  Java
关注(0)|答案(1)|浏览(338)

我正在努力解决一个sql问题,我就是搞不懂。我得数一下喜欢两种食物的人数。
我有一个表显示用户的id,还有一个列指示他们喜欢什么类型的食物。ID可能是重复的,因为一个人可能喜欢不止一种食物。我还有一个显示用户ID的表格和一个列,显示他们喜欢什么类型的饮料。同样,用户id可能是重复的。
有了这两个表,我必须创建一个表来计算喜欢某一类型食物的人数,计算喜欢某一类型饮料的人数,并计算喜欢这一对的人数。下面是我期望得到的一个例子:

foods .  --------- drinks --------- count_food ------- count_drink -------count_combination
hamburger          coke .              17 .              67                  21
pizza              coke .              45                67                  8
chicken            coke                21                67                  25

到目前为止,我已经创建了foods,drinks,count\u food和count\u drink列。我使用交叉连接函数创建foods and drinks列,并使用count函数填充count\u food and count drinks列。然而,我被困在计数组合列。我只是不知道怎么做。这可以通过交叉验证连接来完成吗?
感谢您的帮助:)

prdp8dxp

prdp8dxp1#

既然你在这个问题上提供了你想要的结果,我就用示例数据重新创建了你的案例。
为了达到最终的结果,我已经完成了你提到的所有步骤:数一数只喜欢一种饮料的人,数一数只喜欢一种食物的人,然后数一数喜欢这种饮料和食物的人。
下面是示例数据和我采取的步骤,


# sample data for food

WITH food AS(
SELECT 1 AS user_ids, "hamburguer" AS foods UNION ALL
SELECT 1 AS user_ids, "hamburguer" AS foods UNION ALL
SELECT 2 AS user_ids, "hamburguer" AS foods UNION ALL
SELECT 2 AS user_ids, "pizza" AS foods UNION ALL
SELECT 2 AS user_ids, "pizza" AS foods UNION ALL
SELECT 3 AS user_ids, "chicken" AS foods 
),

# sample data for drink

drink AS(
SELECT 1 AS user_ids, "coke" AS drinks UNION ALL
SELECT 2 AS user_ids, "coke" AS drinks UNION ALL
SELECT 2 AS user_ids, "coke" AS drinks UNION ALL
SELECT 4 AS user_ids, "coke" AS drinks UNION ALL
SELECT 5 AS user_ids, "coke" AS drinks 
),

# count how many people like each type of food

count_foods AS (
SELECT COUNT(foods) AS count_foods, foods FROM food GROUP BY foods
),

# count how many people like each type of drink

count_drinks AS(
SELECT COUNT(drinks) AS count_drinks, drinks FROM drink GROUP BY drinks
),

# making all the possible combinations between foods and drinks with CROSS JOIN

food_drink_only AS (
SELECT foods, drinks, count_foods, count_drinks FROM count_foods a CROSS JOIN count_drinks b
),

# people who like one food and a drink, for ex.: user_ids = 1 likes hamburguer and coke

like_both AS (
SELECT  COUNT(user_ids) AS count_both, foods, drinks FROM (SELECT DISTINCT user_ids, foods FROM food) 
INNER JOIN (SELECT DISTINCT user_ids, drinks FROM drink) USING(user_ids) GROUP BY  2,3
)

# Using left join with foods and drinks as primary keys because all the combinations (fodds, drinks) came from the CROSS JOIN

# and are in the left table

SELECT a.foods,a.drinks,a.count_foods,a.count_drinks, b.count_both FROM food_drink_only a
LEFT JOIN like_both b ON a.foods = b.foods AND a.drinks=b.drinks

以及输出,

Row foods   drinks  count_foods count_drinks    count_both
1   hamburguer  coke    3   5   2
2   pizza   coke    2   5   1
3   chicken coke    1   5   null

首先,请注意交叉连接为我们提供了 foods 以及 drinks . 因此,在创建最终输出时使用左连接。然后,注意 like_both 表中有2个用户喜欢可口可乐和汉堡,1个用户喜欢可口可乐和比萨饼,0个用户喜欢可口可乐和鸡肉。因此,田地 foods 以及 drinks 用作将此表与连接的主键 food_drink_only table。因此,数据被放置在正确的食物和饮料组合中。

相关问题