How to use bulk_create for a ManyToMany field in Django?

Posted by mjqavswn on 2023-03-24 in Go

I have this code for populating a table.

from random import randint

def add_tags(count):
    print("Add tags")
    insert_list = []
    photo_pk_lower_bound = Photo.objects.all().order_by("id")[0].pk
    photo_pk_upper_bound = Photo.objects.all().order_by("-id")[0].pk
    for i in range(count):
        t = Tag( tag = 'tag' + str(i) )
        insert_list.append(t)
    Tag.objects.bulk_create(insert_list)
    for i in range(count):
        random_photo_pk = randint(photo_pk_lower_bound, photo_pk_upper_bound)
        p = Photo.objects.get( pk = random_photo_pk )
        t = Tag.objects.get( tag = 'tag' + str(i) )
        t.photos.add(p)

This is the model:

class Tag(models.Model):
    tag = models.CharField(max_length=20,unique=True)
    photos = models.ManyToManyField(Photo)

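I understand from this answer, "Django: invalid keyword argument for this function", that I must first save the tag objects (because of the ManyToMany field) and then attach photos to them via add(). But for a large count this process takes too long. Is there any way to refactor this code to make it faster?
In general, I want to populate the Tag model with random dummy data.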

Edit 1 (Photo model)

class Photo(models.Model):
    photo = models.ImageField(upload_to="images")
    created_date = models.DateTimeField(auto_now=True)
    user = models.ForeignKey(User)

    def __unicode__(self):
        return self.photo.name
Answer 1 (qxsslcnc)

TL;DR: Use the **"through"** model that Django generates automatically to bulk insert the m2m relationships.

# Tag.photos.through is the model Django generates, with 3 fields: id, photo, tag
photo_tag_1 = Tag.photos.through(photo_id=1, tag_id=1)
photo_tag_2 = Tag.photos.through(photo_id=1, tag_id=2)
# The manager method is bulk_create; there is no bulk_insert.
Tag.photos.through.objects.bulk_create([photo_tag_1, photo_tag_2, ...])

This is the fastest way I know of, and I use it all the time to create test data. I can generate millions of records in a few minutes.
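At that scale it may help to avoid holding every link object in memory at once. Below is a minimal sketch of chunked insertion. The helper names `chunked` and `bulk_create_in_chunks` are hypothetical (they are not part of Django), and `manager` is assumed to be any object exposing `bulk_create`, such as `Tag.photos.through.objects`.

```python
from itertools import islice


def chunked(iterable, size):
    """Yield successive lists of at most `size` items from `iterable`."""
    iterator = iter(iterable)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def bulk_create_in_chunks(manager, objs, chunk_size=5000):
    """Insert `objs` chunk by chunk and return how many were sent.

    `manager` is assumed to be a Django manager such as
    Tag.photos.through.objects; this helper itself is hypothetical.
    """
    created = 0
    for chunk in chunked(objs, chunk_size):
        manager.bulk_create(chunk)
        created += len(chunk)
    return created
```

With the models from the question, `objs` could be a generator expression such as `(Tag.photos.through(tag_id=t, photo_id=p) for t, p in pairs)`, where `pairs` is any iterable of ID pairs, so only one chunk is materialized at a time.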

Edit from Georgy:

from random import randint, shuffle

def add_tags(count):
    Tag.objects.bulk_create([Tag(tag='tag%s' % t) for t in range(count)])

    tag_ids = list(Tag.objects.values_list('id', flat=True))
    photo_ids = Photo.objects.values_list('id', flat=True)
    tag_count = len(tag_ids)
       
    for photo_id in photo_ids:
        tag_to_photo_links = []
        shuffle(tag_ids)

        rand_num_tags = randint(0, tag_count)
        photo_tags = tag_ids[:rand_num_tags]

        for tag_id in photo_tags:
            # through is the model generated by django to link m2m between tag and photo
            photo_tag = Tag.photos.through(tag_id=tag_id, photo_id=photo_id)
            tag_to_photo_links.append(photo_tag)

        Tag.photos.through.objects.bulk_create(tag_to_photo_links, batch_size=7000)

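I did not create the models to test this, but the structure is there; you may need to tweak a few things to get it working. Let me know if you run into any problems.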
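The per-photo tag selection above can also be separated from the database work, which makes it easy to check in isolation. The following is only a sketch: the function name `random_tag_photo_pairs` is hypothetical, and it uses `random.sample` in place of `shuffle` plus slicing.

```python
import random


def random_tag_photo_pairs(tag_ids, photo_ids, rng=None):
    """Return (tag_id, photo_id) pairs, giving every photo a random
    subset of the tags. No pair is repeated as long as the input IDs
    are themselves unique."""
    if rng is None:
        rng = random.Random()
    tag_ids = list(tag_ids)
    pairs = []
    for photo_id in photo_ids:
        # Pick between 0 and len(tag_ids) distinct tags for this photo.
        num_tags = rng.randint(0, len(tag_ids))
        for tag_id in rng.sample(tag_ids, num_tags):
            pairs.append((tag_id, photo_id))
    return pairs
```

Each pair can then be turned into `Tag.photos.through(tag_id=tag_id, photo_id=photo_id)` and handed to `bulk_create` in a single call for all photos, instead of one call per photo.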

Answer 2 (iecba09b)

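As Du D's answer shows, a Django ManyToMany field uses a table called through that contains three columns: the ID of the relation, the ID of the object being linked to, and the ID of the object being linked from. You can use bulk_create on through to create ManyToMany relations in bulk.
As a simple example, you can bulk create tag-to-photo relations like so: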

tag1 = Tag.objects.get(id=1)
tag2 = Tag.objects.get(id=2)
photo1 = Photo.objects.get(id=1)
photo2 = Photo.objects.get(id=2)

through_objs = [
    Tag.photos.through(
        photo_id=photo1.id,
        tag_id=tag1.id,
    ),
    Tag.photos.through(
        photo_id=photo1.id,
        tag_id=tag2.id,
    ),
    Tag.photos.through(
        photo_id=photo2.id,
        tag_id=tag2.id,
    ),
]
Tag.photos.through.objects.bulk_create(through_objs)
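One thing to watch for: the auto-generated through table enforces uniqueness on the (tag, photo) pair, so inserting a link that already exists raises an IntegrityError. Here is a sketch of one way to guard against that, assuming Django 2.2 or later (where `bulk_create` accepts `ignore_conflicts`). The helper name `link_tags_to_photos` is hypothetical.

```python
def link_tags_to_photos(through_model, pairs, batch_size=1000):
    """Create one through row per distinct (tag_id, photo_id) pair.

    `through_model` is assumed to be Tag.photos.through from the
    models above; this helper itself is hypothetical.
    """
    # dict.fromkeys drops repeated pairs while keeping first-seen order.
    unique_pairs = list(dict.fromkeys(pairs))
    links = [
        through_model(tag_id=tag_id, photo_id=photo_id)
        for tag_id, photo_id in unique_pairs
    ]
    # ignore_conflicts=True skips rows that already exist in the table.
    through_model.objects.bulk_create(
        links, batch_size=batch_size, ignore_conflicts=True
    )
    return unique_pairs
```

A call such as `link_tags_to_photos(Tag.photos.through, [(1, 1), (2, 1), (1, 1)])` would then attempt two inserts rather than three.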

General solution

Here is a general solution that you can run to set up ManyToMany relations between any list of object pairs.

from typing import Iterable
from collections import namedtuple

ManyToManySpec = namedtuple(
    "ManyToManySpec", ["from_object", "to_object"]
)

def bulk_create_manytomany_relations(
    model_from,
    field_name: str,
    model_from_name: str,
    model_to_name: str,
    specs: Iterable[ManyToManySpec]
):
    through_objs = []
    for spec in specs:
        through_objs.append(
            getattr(model_from, field_name).through(
                **{
                    f"{model_from_name.lower()}_id": spec.from_object.id,
                    f"{model_to_name.lower()}_id": spec.to_object.id,
                }
            )
        )
    getattr(model_from, field_name).through.objects.bulk_create(through_objs)

Usage example

tag1 = Tag.objects.get(id=1)
tag2 = Tag.objects.get(id=2)
photo1 = Photo.objects.get(id=1)
photo2 = Photo.objects.get(id=2)

bulk_create_manytomany_relations(
    model_from=Tag,
    field_name="photos",
    model_from_name="tag",
    model_to_name="photo",
    specs=[
        ManyToManySpec(from_object=tag1, to_object=photo1),
        ManyToManySpec(from_object=tag1, to_object=photo2),
        ManyToManySpec(from_object=tag2, to_object=photo2),
    ]
)
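To see which keyword arguments the general helper ends up passing to the through model, the dictionary construction can be reproduced on its own. This is a small sketch; the function name `through_kwargs` is hypothetical and exists only for illustration.

```python
def through_kwargs(model_from_name, model_to_name, from_id, to_id):
    """Build the keyword arguments for one through row, mirroring the
    f-string keys used in bulk_create_manytomany_relations."""
    return {
        f"{model_from_name.lower()}_id": from_id,
        f"{model_to_name.lower()}_id": to_id,
    }


print(through_kwargs("Tag", "Photo", from_id=1, to_id=2))
# {'tag_id': 1, 'photo_id': 2}
```

The pattern relies on the default `<model>_id` naming of the through columns. A relation from a model to itself, for example, gets differently named columns, so the helper would need adjusting there.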
