How to cache a paginated Django queryset

lqfhib0f  posted on 2023-10-21  in  Go

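How do I cache a paginated Django queryset, specifically in a ListView?
I noticed that one query was taking a long time to run, so I am trying to cache it. The queryset is huge (over 100k records), so I want to cache only its paginated subsections. I can't cache the entire view or template, because some parts are user/session specific and need to change constantly.
ListView has two standard methods for retrieving the queryset: get_queryset(), which returns the non-paginated data, and paginate_queryset(), which filters it down to the current page.
I first tried caching the query in get_queryset(), but quickly realized that calling cache.set(my_query_key, super(MyView, self).get_queryset()) causes the entire query to be serialized.
So I then tried overriding paginate_queryset(), like this: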

import time
from functools import partial
from django.core.cache import cache
from django.views.generic import ListView

class MyView(ListView):

    ...

    def paginate_queryset(self, queryset, page_size):
        cache_key = 'myview-queryset-%s-%s' % (self.page, page_size)
        print 'paginate_queryset.cache_key:',cache_key
        t0 = time.time()
        ret = cache.get(cache_key)
        if ret is None:
            print 're-caching'
            ret = super(MyView, self).paginate_queryset(queryset, page_size)
            cache.set(cache_key, ret, 60*60)
        td = time.time() - t0
        print 'paginate_queryset.time.seconds:',td
        (paginator, page, object_list, other_pages) = ret
        print 'total objects:',len(object_list)
        return ret

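However, this takes almost a minute to run even though only 10 objects are retrieved, and every request prints "re-caching", which means nothing is being saved to the cache.
My settings.CACHES looks like: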

CACHES = {
    'default': {
        'BACKEND': 'django.core.cache.backends.memcached.MemcachedCache',
        'LOCATION': '127.0.0.1:11211',
    }
}

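service memcached status shows that memcached is running, and tail -f /var/log/memcached.log shows absolutely nothing.
What am I doing wrong? What is the proper way to cache a paginated query so that the entire queryset isn't retrieved?
Edit: I think this may be a bug in either memcached or the Python wrapper. Django appears to support two different memcached backends, one using python-memcached and one using pylibmc. python-memcached seems to silently hide the error raised when caching the paginate_queryset() value. When I switched to the pylibmc backend, I got an explicit error message, "error 10 from memcached_set: SERVER ERROR", traced back to django/core/cache/backends/memcached.py, in set, line 78.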


pcrecxhr1#

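You can extend Paginator to support caching by a provided cache_key.
A blog post about the usage and implementation of such a CachedPaginator can be found here. The source code is posted at djangosnippets.org (here is a web-archive link, because the original one is not working).
However, I will post a slightly modified version of the original, which caches not only the objects per page but also the total count (sometimes even the count can be an expensive operation).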

from django.core.cache import cache
from django.utils.functional import cached_property
from django.core.paginator import Paginator, Page, PageNotAnInteger

class CachedPaginator(Paginator):
    """A paginator that caches the results on a page by page basis."""
    def __init__(self, object_list, per_page, orphans=0, allow_empty_first_page=True, cache_key=None, cache_timeout=300):
        super(CachedPaginator, self).__init__(object_list, per_page, orphans, allow_empty_first_page)
        self.cache_key = cache_key
        self.cache_timeout = cache_timeout

    @cached_property
    def count(self):
        """
            The original django.core.paginator.count attribute in Django1.8
            is not writable and can't be set manually, but we would like
            to override it when loading data from the cache (instead of recalculating it).
            So we make it writable via @cached_property.
        """
        return super(CachedPaginator, self).count

    def set_count(self, count):
        """
            Override the paginator.count value (to prevent recalculation)
            and clear num_pages and page_range which values depend on it.
        """
        self.count = count
        # if somehow we have stored .num_pages or .page_range (which are cached properties)
        # this can lead to wrong page calculations (because they depend on paginator.count value)
        # so we clear their values to force recalculations on next calls
        try:
            del self.num_pages
        except AttributeError:
            pass
        try:
            del self.page_range
        except AttributeError:
            pass

    @cached_property
    def num_pages(self):
        """This is not writable in Django1.8. We want to make it writable"""
        return super(CachedPaginator, self).num_pages

    @cached_property
    def page_range(self):
        """This is not writable in Django1.8. We want to make it writable"""
        return super(CachedPaginator, self).page_range

    def page(self, number):
        """
        Returns a Page object for the given 1-based page number.

        This will attempt to pull the results out of the cache first, based on
        the requested page number. If not found in the cache,
        it will pull a fresh list and then cache that result + the total result count.
        """
        if self.cache_key is None:
            return super(CachedPaginator, self).page(number)

        # To avoid counting the queryset, we only validate that the
        # provided number is an integer.
        # The rest of the validation happens when we fetch fresh data,
        # so if the number is invalid, nothing will be cached.
        # number = self.validate_number(number)
        try:
            number = int(number)
        except (TypeError, ValueError):
            raise PageNotAnInteger('That page number is not an integer')

        page_cache_key = "%s:%s:%s" % (self.cache_key, self.per_page, number)
        page_data = cache.get(page_cache_key)

        if page_data is None:
            page = super(CachedPaginator, self).page(number)
            #cache not only the objects, but the total count too.
            page_data = (page.object_list, self.count)
            cache.set(page_cache_key, page_data, self.cache_timeout)
        else:
            cached_object_list, cached_total_count = page_data
            self.set_count(cached_total_count)
            page = Page(cached_object_list, number, self)

        return page

p3rjfoxz2#

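The problem turned out to be a combination of factors. Mainly, the result returned by paginate_queryset() contains a reference to the unlimited queryset, which makes it essentially uncacheable. When I called cache.set(mykey, (paginator, page, object_list, other_pages)), it tried to serialize thousands of records instead of just the page_size records I was expecting, so the cached item exceeded memcached's limits and the set failed.
The other factor was the horrible default error reporting in memcached/python-memcached, which silently hides all errors and turns cache.set() into a no-op if anything goes wrong, making the problem very time-consuming to track down.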
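Both failure modes described above (an oversized payload and a silently dropped set) can be checked for explicitly. The following is only a rough sketch: DictCache is a hypothetical stand-in for django.core.cache.cache so that the code runs without Django or memcached, the helper names pickled_size and set_and_verify are made up for illustration, and the 1 MB figure is memcached's default item limit, which your server may override. The real backend may also compress values, so the pickled size is an estimate only.

```python
import pickle

# memcached's default per-item limit is 1 MB (the server can be started
# with a different limit, so treat this as an assumption).
MEMCACHED_DEFAULT_ITEM_LIMIT = 1024 * 1024


def pickled_size(value):
    """Return the number of bytes the value occupies once pickled."""
    return len(pickle.dumps(value, protocol=pickle.HIGHEST_PROTOCOL))


class DictCache:
    """Hypothetical stand-in for django.core.cache.cache (get/set only)."""

    def __init__(self, max_item_bytes=MEMCACHED_DEFAULT_ITEM_LIMIT):
        self._data = {}
        self._max_item_bytes = max_item_bytes

    def get(self, key, default=None):
        return self._data.get(key, default)

    def set(self, key, value, timeout=None):
        # Mimic the silent failure: oversized items are dropped without error.
        if pickled_size(value) <= self._max_item_bytes:
            self._data[key] = value


def set_and_verify(cache, key, value, timeout=300):
    """Store a value, read it back, and raise if the backend dropped it."""
    cache.set(key, value, timeout)
    sentinel = object()
    if cache.get(key, sentinel) is sentinel:
        raise RuntimeError(
            "cache.set(%r) was silently dropped; pickled size is %d bytes"
            % (key, pickled_size(value))
        )
```

Calling set_and_verify() in place of cache.set() while debugging turns a silent no-op into an exception that reports the payload size, which makes it quicker to see that the full result set, rather than a single page, is being serialized.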
I fixed this by rewriting paginate_queryset() to ditch Django's built-in paginator functionality altogether and slicing the queryset myself with:

object_list = queryset[page_size*(page-1):page_size*(page-1)+page_size]

and then caching that object_list.
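A minimal sketch of this slice-then-cache approach might look like the following. It is not the answerer's actual code: get_cached_page, key_prefix and the DictCache stand-in for django.core.cache.cache are hypothetical names introduced so the example runs on its own. With a real queryset, the slice is turned into a LIMIT/OFFSET query and list() evaluates only those rows.

```python
class DictCache:
    """Hypothetical stand-in for django.core.cache.cache (get/set only)."""

    def __init__(self):
        self._data = {}

    def get(self, key, default=None):
        return self._data.get(key, default)

    def set(self, key, value, timeout=None):
        self._data[key] = value


def get_cached_page(cache, items, page, page_size,
                    key_prefix="myview", timeout=3600):
    """Return one 1-based page of `items` as a plain list, caching only that page.

    `items` can be any sliceable sequence, for example a Django queryset.
    """
    cache_key = "%s-page-%d-%d" % (key_prefix, page, page_size)
    object_list = cache.get(cache_key)
    if object_list is None:
        start = page_size * (page - 1)
        # list() forces evaluation here, so only the rows of this page are
        # stored rather than a lazy object tied to the full result set.
        object_list = list(items[start:start + page_size])
        cache.set(cache_key, object_list, timeout)
    return object_list
```

Because only a plain list of at most page_size objects is stored, the cached item stays small. The trade-off is that the paginator metadata (total count, number of pages) is no longer available from the cache and has to be computed or cached separately.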


laximzn53#

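I wanted to paginate the infinite-scrolling view on my home page, and this is the solution I came up with. It's a mix of Django CCBV and the author's initial solution.
The response time, however, didn't improve as much as I had hoped, but that's probably because I am testing it locally with only 6 posts and 2 users, haha.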

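# Imports
from django.core.cache import cache
from django.core.paginator import InvalidPage
from django.http import Http404
from django.utils.translation import gettext as _
from django.views.generic.list import ListView


class MyListView(ListView):
    template_name = 'MY TEMPLATE NAME'  # placeholder: replace with your template
    model = MY POST MODEL  # placeholder: replace with your post model class
    paginate_by = 10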


    def paginate_queryset(self, queryset, page_size):

        """Paginate the queryset"""
        paginator = self.get_paginator(
            queryset, page_size, orphans=self.get_paginate_orphans(),
            allow_empty_first_page=self.get_allow_empty())

        page_kwarg = self.page_kwarg

        page = self.kwargs.get(page_kwarg) or self.request.GET.get(page_kwarg) or 1

        try:
            page_number = int(page)

        except ValueError:
            if page == 'last':
                page_number = paginator.num_pages

            else:
                raise Http404(_("Page is not 'last', nor can it be converted to an int."))
        try:
            page = paginator.page(page_number)
            cache_key = 'mylistview-%s-%s' % (page_number, page_size)
            cached_result = cache.get(cache_key)

            if cached_result is None:
                print('re-caching')
                cached_result = super(MyListView, self).paginate_queryset(queryset, page_size)

                # Caching for 1 day
                cache.set(cache_key, cached_result, 86400)

            return cached_result
        except InvalidPage as e:
            raise Http404(_('Invalid page (%(page_number)s): %(message)s') % {
                'page_number': page_number,
                'message': str(e)
            })

bcs8qyzn4#

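Here is an explanation of how to use Todor's answer to cache pagination in a ListView. Suppose you have multiple ListViews in your app. Each of them needs its own distinct cache_key. Add paginator_class = CachedPaginator and override the get_paginator method inherited from the parent class.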

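from django.views.generic import ListView

from myapp.models import ModelA  # assumed location of the model
from myapp.utils import CachedPaginator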

class ModelAView(ListView):
    model = ModelA
    template_name = "model_a.html"
    paginator_class = CachedPaginator  # instead of default Paginator
    paginate_by = 20

    def get_paginator(
        self, queryset, per_page, orphans=0, allow_empty_first_page=True, **kwargs
    ):
        paginator_cache_key = "model_a_" + str(self.kwargs["model_a_pk"])
        return self.paginator_class(
            queryset,
            per_page,
            orphans=orphans,
            allow_empty_first_page=allow_empty_first_page,
            cache_key=paginator_cache_key,
            **kwargs,
        )
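If the queryset of a view also depends on filters or sort options, those need to be part of the key as well, otherwise different result sets would share cached pages. One possible way to derive such a key is sketched below. The helper name build_paginator_cache_key and the example parameters are hypothetical. Hashing is used here because memcached keys are limited to 250 characters and may not contain whitespace.

```python
import hashlib
from urllib.parse import urlencode


def build_paginator_cache_key(view_name, params):
    """Build a deterministic cache key from a view name and its parameters."""
    # Sort the items so the same parameters always produce the same key,
    # regardless of the order in which they were supplied.
    encoded = urlencode(sorted((str(k), str(v)) for k, v in params.items()))
    digest = hashlib.sha256(encoded.encode("utf-8")).hexdigest()
    return "%s:%s" % (view_name, digest)
```

Inside get_paginator, the result of something like build_paginator_cache_key("model_a", self.kwargs) could then be passed as cache_key. CachedPaginator appends the page size and page number on its own.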
