Update multiple objects at once in Django?

Posted by 2izufjch on 2023-05-27 in Python


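I am using Django 1.9. I have a Django table that represents the value of a particular measure, by organisation and by month, with a raw value and a percentile: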

class MeasureValue(models.Model):
    org = models.ForeignKey(Org, null=True, blank=True)
    month = models.DateField()
    calc_value = models.FloatField(null=True, blank=True)
    percentile = models.FloatField(null=True, blank=True)

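There are typically around 10,000 rows per month. My question is whether I can speed up the process of setting values on the models.
Currently, I calculate percentiles by retrieving all the measure values for a month with a Django filter query, converting them to a pandas DataFrame, and then using scipy's rankdata to set ranks and percentiles. I do this because pandas and rankdata are efficient, can ignore null values, and handle repeated values the way I want, so I am happy with this method: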

records = MeasureValue.objects.filter(month=month).values()
df = pd.DataFrame.from_records(records)
# use calc_value to set percentile on each row, using scipy's rankdata

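However, I then need to retrieve each percentile value from the DataFrame and set it back on the model instances. Right now I do this by iterating over the DataFrame's rows and updating each instance: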

for i, row in df.iterrows():
    mv = MeasureValue.objects.get(org=row.org, month=month)
    if (row.percentile is None) or np.isnan(row.percentile):
        row.percentile = None
    mv.percentile = row.percentile
    mv.save()

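Unsurprisingly, this is quite slow. Is there an efficient Django way to speed it up, by making a single database write rather than thousands? I have checked the documentation, but cannot see one.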

python

Source: https://stackoverflow.com/questions/36751395/update-multiple-objects-at-once-in-django

4 Answers


a9wyjsp71#

An atomic transaction can reduce the time spent in the loop:

from django.db import transaction

with transaction.atomic():
    for i, row in df.iterrows():
        mv = MeasureValue.objects.get(org=row.org, month=month)

        if (row.percentile is None) or np.isnan(row.percentile): 
            # if it's already None, why set it to None?
            row.percentile = None

        mv.percentile = row.percentile
        mv.save()

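Django's default behavior is to run in autocommit mode. Each query is committed to the database immediately, unless a transaction is active.
By using with transaction.atomic(), all the writes are grouped into a single transaction. The time needed to commit the transaction is amortized over all the enclosed statements (UPDATEs in this case), so the time per statement is greatly reduced.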
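The mechanism described above can be illustrated outside Django. The following is only a sketch using the standard-library sqlite3 module as a stand-in for the database; the table name `measure` and the helper names are made up for illustration and are not part of the question's code. It contrasts one commit per statement with one commit for the whole loop.

```python
import sqlite3


def make_db():
    # isolation_level=None puts the sqlite3 module in autocommit mode:
    # every statement is committed on its own unless we issue BEGIN.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute(
        "CREATE TABLE measure (org INTEGER PRIMARY KEY, percentile REAL)"
    )
    conn.executemany(
        "INSERT INTO measure (org) VALUES (?)", [(i,) for i in range(1, 6)]
    )
    return conn


def update_autocommit(conn, percentiles):
    # One implicit commit per UPDATE (what the unwrapped loop does).
    for org, pct in percentiles.items():
        conn.execute(
            "UPDATE measure SET percentile = ? WHERE org = ?", (pct, org)
        )


def update_in_one_transaction(conn, percentiles):
    # A single explicit transaction around the whole loop, which is
    # roughly what transaction.atomic() arranges in Django.
    conn.execute("BEGIN")
    try:
        for org, pct in percentiles.items():
            conn.execute(
                "UPDATE measure SET percentile = ? WHERE org = ?", (pct, org)
            )
    except Exception:
        conn.execute("ROLLBACK")
        raise
    else:
        conn.execute("COMMIT")
```

Both functions leave the table in the same state; only the number of commits differs. The number of UPDATE statements is unchanged, so this reduces commit overhead but not the per-row round trips.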


okxuctiv2#

As of Django 2.2, you can use the bulk_update() queryset method to efficiently update the given fields on the provided model instances, generally with one query:

objs = [
    Entry.objects.create(headline='Entry 1'),
    Entry.objects.create(headline='Entry 2'),
]
objs[0].headline = 'This is entry 1'
objs[1].headline = 'This is entry 2'
Entry.objects.bulk_update(objs, ['headline'])
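Applied to the model in the question, this might look roughly like the sketch below. It assumes Django 2.2 or later, a hypothetical import path `myapp.models`, and that the percentiles have already been collected into a dict keyed by `org_id`; the helper names are illustrative only. The Django import is kept inside the function so the pure-Python part can be exercised on its own.

```python
import math


def nan_to_none(value):
    """Map NaN (and None) to None so a nullable FloatField stores NULL."""
    if value is None or (isinstance(value, float) and math.isnan(value)):
        return None
    return value


def assign_percentiles(instances, percentile_by_org):
    """Set .percentile in memory; return only the objects that changed."""
    changed = []
    for obj in instances:
        if obj.org_id not in percentile_by_org:
            continue
        new_value = nan_to_none(percentile_by_org[obj.org_id])
        if obj.percentile != new_value:
            obj.percentile = new_value
            changed.append(obj)
    return changed


def save_percentiles(month, percentile_by_org):
    # Hypothetical import path; adjust to wherever MeasureValue lives.
    from myapp.models import MeasureValue

    # One SELECT for the whole month instead of one get() per row.
    instances = list(MeasureValue.objects.filter(month=month))
    changed = assign_percentiles(instances, percentile_by_org)
    # batch_size caps how many rows go into each UPDATE statement.
    MeasureValue.objects.bulk_update(changed, ["percentile"], batch_size=1000)
    return len(changed)
```

Skipping unchanged rows is a design choice rather than a requirement: it keeps the generated UPDATE statements smaller when most percentiles stay the same between runs.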

In older versions of Django, you could use update() with Case/When, e.g.:

from django.db.models import Case, When

Entry.objects.filter(
    pk__in=headlines  # `headlines` is a pk -> headline mapping
).update(
    headline=Case(*[When(pk=entry_pk, then=headline)
                    for entry_pk, headline in headlines.items()]))


ewm0tg9j3#

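Actually, when I tried @Eugene Yarmash's answer I got this error:
FieldError: Joined field references are not permitted in this query
But I believe that iterating over update() is still quicker than multiple saves, and I expect that using a transaction should also speed things up.
So, for versions of Django that do not offer bulk_update, and assuming the same data as in Eugene's answer, where headlines is a pk -> headline mapping: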

from django.db import transaction

with transaction.atomic():
    for entry_pk, headline in headlines.items():
        Entry.objects.filter(pk=entry_pk).update(headline=headline)


inn6fuwd4#

In my case, we needed Value.
headlines is a pk -> headline mapping, e.g. {1: 'some_val1', 2: 'some_val2', ...}:

from django.db.models import Case, When, Value

Entry.objects.filter(
    pk__in=headlines
).update(
    headline=Case(*[When(pk=entry_pk, then=Value(headline))
                    for entry_pk, headline in headlines.items()]))
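With thousands of rows, a single Case expression carries one When clause per row, and some database backends limit the number of query parameters per statement. One possible way around that, sketched here under assumptions, is to split the mapping into batches and run one update() per batch. The `chunked` helper, the batch size of 500, the `output_field=CharField()` choice and the `myapp.models` import path are all illustrative, not taken from the answers above.

```python
from itertools import islice


def chunked(mapping, size):
    """Yield successive dicts of at most `size` items from `mapping`."""
    items = iter(mapping.items())
    while True:
        batch = dict(islice(items, size))
        if not batch:
            return
        yield batch


def update_headlines(headlines, batch_size=500):
    # Imports are local so the helper above works without Django installed.
    from django.db import transaction
    from django.db.models import Case, CharField, Value, When
    from myapp.models import Entry  # hypothetical import path

    updated = 0
    with transaction.atomic():
        for batch in chunked(headlines, batch_size):
            whens = [
                When(pk=pk, then=Value(text)) for pk, text in batch.items()
            ]
            # update() returns the number of rows matched by the filter.
            updated += Entry.objects.filter(pk__in=list(batch)).update(
                headline=Case(*whens, output_field=CharField())
            )
    return updated
```

For example, `list(chunked({1: 'a', 2: 'b', 3: 'c'}, 2))` yields two batches, the first with two items and the second with one.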
