Obviously you need to pickle the whole of the generated data, not just the generator objects.
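To see why, here's a minimal sketch outside of Django: a generator object itself can't be pickled, but the fully generated content can.

```python
import pickle


def worker():
    yield "Hello"
    yield " "
    yield "world"


gen = worker()

# A generator object cannot be pickled...
try:
    pickle.dumps(gen)
except TypeError as exc:
    print("cannot pickle: %s" % exc)

# ...but once the data has been fully generated, the resulting
# string pickles fine.
content = "".join(worker())
data = pickle.dumps(content)
print(pickle.loads(data))  # prints "Hello world"
```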
1] But I'd use the response.content property, which returns the string no matter how the response was created.

2] Subclassing the cache middleware, or creating your own decorator for that, seems like a cleaner way than patching the original Django sources :)

Regards,
A

------------------------------------------------------
Ales Zoulek
Jabber: ales.zou...@gmail.com
------------------------------------------------------


On Tue, Dec 15, 2009 at 2:00 PM, Tom Evans <tevans...@googlemail.com> wrote:
> Hi all
>
> (This is all running django 1.1.1)
>
> On our site, we have a lot of pages that take a long time to generate,
> mainly because they make a lot of expensive SOAP-like calls to other
> servers. Some of the pages take exceptionally long periods of time
> (> 30 seconds) to generate a full web page. In order to output these
> responses reliably, we use iterator objects to output the content of
> the page bit by bit, eg:
>
>     def iterator_resp(request):
>         def worker():
>             yield "Hello"
>             yield " "
>             yield "world"
>         return HttpResponse(worker())
>
> Since these pages take such a long time to generate, ideally we would
> like to cache the generated content for the next time it is requested;
> however, it seems that any attempt to cache that view with the
> @cache_page decorator is doomed to fail.
>
> The first problem occurs when the UpdateCacheMiddleware runs its
> process_response() phase: it calls
> django.utils.cache.patch_response_headers(), which consumes the
> generator while trying to calculate an MD5 sum of the contents for
> an ETag. This leaves the generator exhausted when we come to
> output the response, and we get an empty response.
>
> This can be avoided by setting a manual ETag on the response, but in
> this case python refuses to pickle the response object, since
> generator objects are not picklable.
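Roughly what's going on, sketched in plain Python without Django (simplified; the real patch_response_headers() works on the whole response object, but the effect on the generator is the same):

```python
import hashlib


def worker():
    yield "Hello"
    yield " "
    yield "world"


content_iter = worker()

# The middleware effectively joins the content to hash it for an
# ETag, which consumes the generator:
etag = hashlib.md5("".join(content_iter).encode("utf-8")).hexdigest()

# By the time the response is written out, the generator is
# exhausted, so the client receives nothing:
remaining = "".join(content_iter)
print(repr(remaining))  # ''
```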
> I figure that this sort of caching should definitely be possible, so I
> wrote a small patch to UpdateCacheMiddleware to notice when we are
> supplying an iterable response, and then create a duplicate
> HttpResponse with a proxy generator that reads from the original
> response, writing each chunk to a buffer and also yield'ing it to the
> response. Once the original response is drained, I create another
> response with the buffered content, and this response can be put into
> the cache.
>
> Obviously, there are trade-offs here - if you want to cache a 50MB
> page, you've got to be prepared to buffer 50MB - but is this the right
> sort of approach to take? Is there a better way of achieving similar
> results? I feel quite uncomfortable poking at the insides of
> HttpResponse - looking at HttpResponse::_is_string in particular!
>
> Cheers
>
> Tom
>
> --
> You received this message because you are subscribed to the Google Groups
> "Django users" group.
> To post to this group, send email to django-us...@googlegroups.com.
> To unsubscribe from this group, send email to
> django-users+unsubscr...@googlegroups.com.
> For more options, visit this group at
> http://groups.google.com/group/django-users?hl=en.
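The buffering approach Tom describes can be sketched without any Django internals. The names here (the cache dict, tee_to_cache, slow_view) are illustrative stand-ins, not real Django API: a proxy generator streams each chunk to the caller while keeping a copy, and only once the original iterator is fully drained is the joined content stored for the next request.

```python
cache = {}  # stand-in for Django's cache backend


def tee_to_cache(key, chunks):
    """Yield each chunk to the caller, buffering a copy for the cache."""
    buffered = []
    for chunk in chunks:
        buffered.append(chunk)
        yield chunk
    # This runs only after the response has been fully drained, so a
    # partially consumed response never poisons the cache.
    cache[key] = "".join(buffered)


def slow_view():
    # Pretend each yield is an expensive SOAP call.
    yield "Hello"
    yield " "
    yield "world"


# First request: stream the chunks and buffer them as a side effect.
streamed = "".join(tee_to_cache("/page/", slow_view()))
print(streamed)         # Hello world
print(cache["/page/"])  # Hello world - servable as a plain string next time
```

The trade-off Tom mentions is visible here: the buffer holds the entire page in memory, so a 50MB page costs 50MB of buffering before anything reaches the cache.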