PostgreSQL: find the maximum difference between columns of different rows

Posted by s8vozzvw on 2023-03-01 in PostgreSQL

Given the following data:
| Course_ID | Teacher_ID | Minimum_grade | Maximum_grade |
| --------- | ---------- | ------------- | ------------- |
| 1 | 1 | 7 | 8 |
| 2 | 2 | 5 | 6 |
| 3 | 2 | 5 | 7 |
| 4 | 3 | 5 | 8 |
| 5 | 3 | 6 | 7 |
| 6 | 4 | 5 | 6 |
| 7 | 4 | 4 | 6 |
| 8 | 4 | 5 | 7 |
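Using PostgreSQL, I need to write a query that does the following:
Select all teachers who teach two separate courses X and Y, where the difference between the minimum grade of course X and the maximum grade of course Y is greater than 2.
Therefore, only teacher 4 should be selected.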
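
To make the expected result concrete, here is a minimal Python sketch that brute-forces the condition over the sample table. The `rows` list and the helper name `teachers_with_cross_course_gap` are illustrative assumptions, not part of the original question.

```python
from itertools import permutations

# Sample data from the question:
# (Course_ID, Teacher_ID, Minimum_grade, Maximum_grade)
rows = [
    (1, 1, 7, 8), (2, 2, 5, 6), (3, 2, 5, 7), (4, 3, 5, 8),
    (5, 3, 6, 7), (6, 4, 5, 6), (7, 4, 4, 6), (8, 4, 5, 7),
]

def teachers_with_cross_course_gap(rows, threshold=2):
    """Map each qualifying teacher to the largest (max of Y - min of X), X != Y."""
    best = {}
    # permutations(rows, 2) yields every ordered pair of two different rows
    for x, y in permutations(rows, 2):
        if x[1] != y[1] or x[0] == y[0]:
            continue  # skip pairs with different teachers or the same course
        gap = y[3] - x[2]  # Maximum_grade of Y minus Minimum_grade of X
        best[x[1]] = max(best.get(x[1], gap), gap)
    return {teacher: gap for teacher, gap in best.items() if gap > threshold}

result = teachers_with_cross_course_gap(rows)
print(result)  # {4: 3}
```

Teachers 2 and 3 both reach a cross-course difference of exactly 2, so they fall just short of the threshold.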
I tried the following:

SELECT Teacher_ID, MAX(Maximum_grade), MIN(Minimum_grade)
FROM dataset
GROUP BY Teacher_ID
HAVING count(Teacher_ID) > 1 AND (MAX(Maximum_grade) - Min(Minimum_grade)) > 2;

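This selects teachers 3 and 4, which is not the intended result.

I think this is because my code compares the minimum grade of course 4 with the maximum grade of that same course, which should not happen.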
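
The suspicion can be checked with a small sketch. It assumes an in-memory SQLite database as a stand-in for PostgreSQL (the attempted query uses only standard aggregates, so the behavior should match); the table setup is reconstructed from the sample data.

```python
import sqlite3

# Assumption: SQLite stands in for PostgreSQL to keep the sketch self-contained.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE dataset "
    "(Course_ID INT, Teacher_ID INT, Minimum_grade INT, Maximum_grade INT)"
)
conn.executemany(
    "INSERT INTO dataset VALUES (?, ?, ?, ?)",
    [(1, 1, 7, 8), (2, 2, 5, 6), (3, 2, 5, 7), (4, 3, 5, 8),
     (5, 3, 6, 7), (6, 4, 5, 6), (7, 4, 4, 6), (8, 4, 5, 7)],
)

# The attempted query: MAX and MIN range over all of a teacher's rows,
# so both extremes may come from the same course.
attempt = conn.execute(
    """
    SELECT Teacher_ID, MAX(Maximum_grade), MIN(Minimum_grade)
    FROM dataset
    GROUP BY Teacher_ID
    HAVING COUNT(Teacher_ID) > 1
       AND MAX(Maximum_grade) - MIN(Minimum_grade) > 2
    ORDER BY Teacher_ID
    """
).fetchall()
print(attempt)  # [(3, 8, 5), (4, 7, 4)]

# For teacher 3, a single course supplies both the maximum 8 and the minimum 5.
same_course = conn.execute(
    "SELECT Course_ID FROM dataset "
    "WHERE Teacher_ID = 3 AND Maximum_grade = 8 AND Minimum_grade = 5"
).fetchall()
print(same_course)  # [(4,)]
```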

I don't know how to improve my code to get the desired result. Any help would be greatly appreciated.

h4cxqtbf #1

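I would solve this by selecting from your table twice: once to get the minimum grades and once to get the maximum grades. When joining the two together, we match rows that share a teacher ID while making sure the courses have different IDs: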

SELECT
    a.Teacher_ID,
    MAX(b.Maximum_grade - a.Minimum_grade) AS diff
FROM
    dataset as a JOIN
    dataset as b ON
    (
       a.Teacher_ID = b.Teacher_ID AND
       a.Course_ID != b.Course_ID
    )
GROUP BY
    a.Teacher_ID
HAVING
    MAX(b.Maximum_grade - a.Minimum_grade) > 2

You can try it here, though I don't know how long this site keeps fiddles around: http://sqlfiddle.com/#!9/1c4a72/1
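
In case the fiddle is no longer available, the self-join can also be checked locally. This is only a sketch: it assumes an in-memory SQLite database in place of PostgreSQL and rebuilds the `dataset` table from the question's sample data.

```python
import sqlite3

# Assumption: SQLite stands in for PostgreSQL; the join uses only standard SQL.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE dataset "
    "(Course_ID INT, Teacher_ID INT, Minimum_grade INT, Maximum_grade INT)"
)
conn.executemany(
    "INSERT INTO dataset VALUES (?, ?, ?, ?)",
    [(1, 1, 7, 8), (2, 2, 5, 6), (3, 2, 5, 7), (4, 3, 5, 8),
     (5, 3, 6, 7), (6, 4, 5, 6), (7, 4, 4, 6), (8, 4, 5, 7)],
)

join_sql = """
    SELECT a.Teacher_ID, MAX(b.Maximum_grade - a.Minimum_grade) AS diff
    FROM dataset AS a
    JOIN dataset AS b
      ON a.Teacher_ID = b.Teacher_ID AND a.Course_ID != b.Course_ID
    GROUP BY a.Teacher_ID
"""

# Largest cross-course difference per teacher, before filtering
all_diffs = conn.execute(join_sql + " ORDER BY a.Teacher_ID").fetchall()
print(all_diffs)  # [(2, 2), (3, 2), (4, 3)]

# Same query with the HAVING filter from the answer
filtered = conn.execute(
    join_sql + " HAVING MAX(b.Maximum_grade - a.Minimum_grade) > 2"
).fetchall()
print(filtered)  # [(4, 3)]
```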

xqk2d5yq #2

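This query is off the top of my head and could still be optimized or converted into a subquery. It has two parts. First, we need to satisfy two conditions:
1. The teacher teaches two separate courses X and Y.
2. The difference between the minimum grade of course X and the maximum grade of course Y is greater than 2.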

select teacher_id, max(max_g) as max_g, min(min_g) as min_g from dataset 
group by teacher_id 
having count(distinct course_id) >= 2 and (max(max_g)- min(min_g)) > 2

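The query above returns the records that meet those conditions, with one inconsistency: sometimes the difference is computed within the same course, i.e.
max(Maximum_grade of course X) - min(Minimum_grade of course X)
To correct this, I only select records where max(max_grade) and min(min_grade) come from different rows (so the courses are different as well).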

with diff_gt_two as
(   
   select 
        teacher_id, 
        max(max_g) as max_g, min(min_g) as min_g 
   from dataset     
   group by teacher_id 
   having count(distinct course_id) >= 2 and 
   (max(max_g)- min(min_g)) > 2
)
select dataset.teacher_id from dataset,diff_gt_two
where 
     dataset.teacher_id = diff_gt_two.teacher_id 
     and 
     ( dataset.max_g = diff_gt_two.max_g or     dataset.min_g = 
       diff_gt_two.min_g) 
group by dataset.teacher_id
having count(*) > 1

Edit: @JonSG has converted the CTE into a subquery. Fiddle link: http://sqlfiddle.com/#!9/1c4a72/9. Thank you, JonSG.
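
The idea behind the CTE query above can be paraphrased in plain Python: take each teacher's overall maximum and minimum, then keep the teacher only if more than one row holds one of those extremes. This is a sketch on the question's sample data; the variable names are illustrative assumptions, and `len(courses)` plays the role of `count(distinct course_id)` because the sample has no duplicate courses.

```python
from collections import defaultdict

# Sample data: (Course_ID, Teacher_ID, Minimum_grade, Maximum_grade)
rows = [
    (1, 1, 7, 8), (2, 2, 5, 6), (3, 2, 5, 7), (4, 3, 5, 8),
    (5, 3, 6, 7), (6, 4, 5, 6), (7, 4, 4, 6), (8, 4, 5, 7),
]

by_teacher = defaultdict(list)
for course, teacher, lo, hi in rows:
    by_teacher[teacher].append((course, lo, hi))

selected = []
for teacher, courses in sorted(by_teacher.items()):
    overall_min = min(lo for _, lo, _ in courses)
    overall_max = max(hi for _, _, hi in courses)
    # First part: at least two courses and an overall difference above 2
    if len(courses) < 2 or overall_max - overall_min <= 2:
        continue
    # Second part: rows holding the overall maximum or the overall minimum
    matching = [c for c, lo, hi in courses if hi == overall_max or lo == overall_min]
    if len(matching) > 1:
        selected.append(teacher)

print(selected)  # [4]
```

Teacher 3 passes the first part but is dropped by the second, because course 4 alone holds both the maximum 8 and the minimum 5.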

i86rm4rw #3

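You can join the dataset to itself on the same teacher and a different course, and take the maximum difference per teacher from that Cartesian product: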

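SELECT A.Teacher_ID, max(A.Maximum_grade - B.Minimum_grade) AS Diff
FROM dataset A, dataset B
WHERE A.Teacher_ID = B.Teacher_ID AND A.Course_ID <> B.Course_ID
GROUP BY A.Teacher_ID
-- PostgreSQL does not accept a SELECT alias in HAVING, so the expression is repeated
HAVING max(A.Maximum_grade - B.Minimum_grade) > 2;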
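Because the join condition requires different courses, teachers with only one course do not appear in the result, and grade differences within the same course are excluded.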
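
A short sketch can confirm both effects of the join condition by counting the joined pairs per teacher. It assumes an in-memory SQLite database in place of PostgreSQL and rebuilds `dataset` from the question's sample data.

```python
import sqlite3

# Assumption: SQLite stands in for PostgreSQL for this check.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE dataset "
    "(Course_ID INT, Teacher_ID INT, Minimum_grade INT, Maximum_grade INT)"
)
conn.executemany(
    "INSERT INTO dataset VALUES (?, ?, ?, ?)",
    [(1, 1, 7, 8), (2, 2, 5, 6), (3, 2, 5, 7), (4, 3, 5, 8),
     (5, 3, 6, 7), (6, 4, 5, 6), (7, 4, 4, 6), (8, 4, 5, 7)],
)

# Number of (A, B) pairs that survive the join condition, per teacher
pair_counts = conn.execute(
    """
    SELECT A.Teacher_ID, COUNT(*)
    FROM dataset A, dataset B
    WHERE A.Teacher_ID = B.Teacher_ID AND A.Course_ID <> B.Course_ID
    GROUP BY A.Teacher_ID
    ORDER BY A.Teacher_ID
    """
).fetchall()
print(pair_counts)  # [(2, 2), (3, 2), (4, 6)]

# No surviving pair combines a course with itself
self_pairs = conn.execute(
    """
    SELECT COUNT(*)
    FROM dataset A, dataset B
    WHERE A.Teacher_ID = B.Teacher_ID AND A.Course_ID <> B.Course_ID
      AND A.Course_ID = B.Course_ID
    """
).fetchone()[0]
print(self_pairs)  # 0
```

Teacher 1 teaches a single course, so no pair exists for that teacher and the row is absent from `pair_counts`.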

hfyxw5xn #4

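SELECT *
FROM (
    SELECT Teacher_ID, max(diff) AS DIFF, COUNT(Course_ID) AS NUMBER_OF_COURSE
    FROM (
        SELECT *, (Maximum_grade - Minimum_grade) AS diff
        FROM dataset
    ) AS per_course  -- PostgreSQL before version 16 requires an alias on subqueries in FROM
    GROUP BY Teacher_ID
) AS per_teacher
WHERE DIFF > 2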

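Result: Teacher_ID 3, DIFF 3, NUMBER_OF_COURSE 2 (the difference here is taken within a single course row, so the different-course condition from the question is not applied).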

5f0d552i #5

Using Python

DataFrame:

import pandas as pd
df = pd.DataFrame({'CID': [1,2,3,4,5,6,7,8,8],
                   'TID': [1,2,2,3,3,4,4,4,4],
                   'Min': [7,5,5,5,6,5,4,5,6],
                   'Max': [8,6,7,8,7,6,6,7,7]})

Code:

ans = []  # results: one {teacher_id: difference} dict per matching teacher

# Teachers who teach two or more courses
course_counts = df.groupby('TID')['CID'].count()
TID = [k for k, v in course_counts.items() if v >= 2]

for T in TID:  # loop over teachers
    rows = df[df['TID'] == T]

    # Row with this teacher's lowest Min (the first one on ties)
    d = rows.loc[rows['Min'].idxmin()]

    # Highest Max among this teacher's other courses (different CID)
    other_max = rows.loc[rows['CID'] != d['CID'], 'Max'].max()

    if other_max - d['Min'] > 2:
        ans.append({T: int(other_max - d['Min'])})

ans

Output:

[{4: 3}]  # TID: difference
