Django Postgres ArrayField: counting the number of overlapping elements?

2uluyalo  posted on 2023-04-20  in PostgreSQL

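I have an Article model with a keywords field, an ArrayField holding a list of keywords sorted in ascending order. I want to write a query that finds all articles sharing at least a given minimum number of keywords with a given article.
Example: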

article_a = Article(keywords=["tag1", "tag2", "tag3"])
article_b = Article(keywords=["tag1", "tag2"])
article_c = Article(keywords=["tag1"])

article_a.find_similar_articles(min_overlap=2)
  # Returns [article_b, ] since it overlaps with at least 2 elements.

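There is a similar question here, but it is about Postgres in general rather than the Django ORM.
Does anyone know how I can query an array field this way? Or can you suggest another approach that achieves the same result by structuring the data differently?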

fcg9iug3 #1

Given your model:

from django.contrib.postgres.fields import ArrayField
from django.db import models

class Article(models.Model):

    keywords = ArrayField(models.CharField(max_length=50), default=list)

You can use a RawSQL expression to compute the overlapping keywords, then annotate the length of that overlap:

from django.db.models.expressions import RawSQL
from django.contrib.postgres.fields import ArrayField
from django.db.models import CharField, Func, F, IntegerField

article = Article.objects.first()

similar_articles = (
    Article.objects.annotate(
        overlap=RawSQL(
            sql="ARRAY(select UNNEST(%s) INTERSECT select UNNEST(keywords))",
            params=(article.keywords,),
            output_field=ArrayField(CharField(max_length=50)),
        )
    ).annotate(
        overlap_count=Func(F("overlap"), function="CARDINALITY", output_field=IntegerField())
    ).filter(overlap_count__gte=2)  # min_overlap of 2
    .exclude(pk=article.pk)  # skip the reference article, which always overlaps with itself
)

My answer is largely inspired by this SO answer, which explains how to implement this in PostgreSQL.
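
For checking the query's results, it may help to have a plain-Python reference of what the SQL computes. Since INTERSECT removes duplicates, the overlap is effectively a set intersection and CARDINALITY is its size. The sketch below is only an illustration: the helper names overlap_count and find_similar are made up for this example, and the dict of keyword lists stands in for rows of the Article table.

```python
def overlap_count(reference, candidate):
    """Count the distinct keywords present in both lists.

    Mirrors CARDINALITY(ARRAY(... INTERSECT ...)): duplicates are ignored.
    """
    return len(set(reference) & set(candidate))


def find_similar(reference, candidates, min_overlap):
    """Return the names of candidates sharing at least min_overlap keywords.

    `candidates` is a hypothetical stand-in for the table: name -> keyword list.
    """
    return [
        name
        for name, keywords in candidates.items()
        if overlap_count(reference, keywords) >= min_overlap
    ]


# The example from the question, with article_a as the reference article.
articles = {
    "article_b": ["tag1", "tag2"],
    "article_c": ["tag1"],
}
print(find_similar(["tag1", "tag2", "tag3"], articles, min_overlap=2))
```

With the data from the question, only article_b is returned, which matches what the annotated queryset should give for min_overlap=2.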
