Obviously you need to pickle the whole of the generated data, not just the generator objects.
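To see why, here's a minimal sketch outside of Django: a generator object itself can't be pickled, but the fully generated content can.

```python
import pickle


def worker():
    yield "Hello"
    yield " "
    yield "world"


gen = worker()

# A generator object cannot be pickled...
try:
    pickle.dumps(gen)
except TypeError as exc:
    print("cannot pickle: %s" % exc)

# ...but once the data has been fully generated, the resulting
# string pickles fine.
content = "".join(worker())
data = pickle.dumps(content)
print(pickle.loads(data))  # prints "Hello world"
```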
1] But I'd use the response.content property, which returns the string no matter how the response was created.

2] Subclassing the cache middleware, or creating your own decorator for that, seems like a cleaner way than patching the original Django sources :)

Regards,
A

------------------------------------------------------
Ales Zoulek
Jabber: ales.zou...@gmail.com
------------------------------------------------------


On Tue, Dec 15, 2009 at 2:00 PM, Tom Evans <tevans...@googlemail.com> wrote:
> Hi all
>
> (This is all running django 1.1.1)
>
> On our site, we have a lot of pages that take a long time to generate,
> mainly because they make a lot of expensive SOAP-like calls to other
> servers. Some of the pages take exceptionally long periods of time
> (> 30 seconds) to generate a full web page. In order to output these
> responses reliably, we use iterator objects to output the content of
> the page bit by bit, eg:
>
>     def iterator_resp(request):
>         def worker():
>             yield "Hello"
>             yield " "
>             yield "world"
>         return HttpResponse(worker())
>
> Since these pages take such a long time to generate, ideally we would
> like to cache the generated content for the next time it is requested;
> however, it seems that any attempt to cache that view with the
> @cache_page decorator is doomed to fail.
>
> The first problem occurs when the UpdateCacheMiddleware runs its
> process_response() phase: it calls
> django.utils.cache.patch_response_headers(), which consumes the
> generator while trying to calculate an MD5 sum of the contents for
> an ETag. This leaves the generator exhausted when we come to
> output the response, and we get an empty response.
>
> This can be avoided by setting a manual ETag on the response, but in
> this case python refuses to pickle the response object, since
> generator objects are not picklable.
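Roughly what's going on, sketched in plain Python without Django (simplified; the real patch_response_headers() works on the whole response object, but the effect on the generator is the same):

```python
import hashlib


def worker():
    yield "Hello"
    yield " "
    yield "world"


content_iter = worker()

# The middleware effectively joins the content to hash it for an
# ETag, which consumes the generator:
etag = hashlib.md5("".join(content_iter).encode("utf-8")).hexdigest()

# By the time the response is written out, the generator is
# exhausted, so the client receives nothing:
remaining = "".join(content_iter)
print(repr(remaining))  # ''
```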
> I figure that this sort of caching should definitely be possible, so I
> wrote a small patch to UpdateCacheMiddleware to notice when we are
> supplying an iterable response, and then create a duplicate
> HttpResponse with a proxy generator that reads from the original
> response, writing each chunk to a buffer and also yield'ing it to the
> response. Once the original response is drained, I create another
> response with the buffered content, and this response can be put into
> the cache.
>
> Obviously, there are trade-offs here - if you want to cache a 50MB
> page, you've got to be prepared to buffer 50MB - but is this the right
> sort of approach to take? Is there a better way of achieving similar
> results? I feel quite uncomfortable poking at the insides of
> HttpResponse - looking at HttpResponse::_is_string in particular!
>
> Cheers
>
> Tom
>
> --
> You received this message because you are subscribed to the Google Groups
> "Django users" group.
> To post to this group, send email to django-us...@googlegroups.com.
> To unsubscribe from this group, send email to
> django-users+unsubscr...@googlegroups.com.
> For more options, visit this group at
> http://groups.google.com/group/django-users?hl=en.
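The buffering approach Tom describes can be sketched without any Django internals. The names here (the cache dict, tee_to_cache, slow_view) are illustrative stand-ins, not real Django API: a proxy generator streams each chunk to the caller while keeping a copy, and only once the original iterator is fully drained is the joined content stored for the next request.

```python
cache = {}  # stand-in for Django's cache backend


def tee_to_cache(key, chunks):
    """Yield each chunk to the caller, buffering a copy for the cache."""
    buffered = []
    for chunk in chunks:
        buffered.append(chunk)
        yield chunk
    # This runs only after the response has been fully drained, so a
    # partially consumed response never poisons the cache.
    cache[key] = "".join(buffered)


def slow_view():
    # Pretend each yield is an expensive SOAP call.
    yield "Hello"
    yield " "
    yield "world"


# First request: stream the chunks and buffer them as a side effect.
streamed = "".join(tee_to_cache("/page/", slow_view()))
print(streamed)         # Hello world
print(cache["/page/"])  # Hello world - servable as a plain string next time
```

The trade-off Tom mentions is visible here: the buffer holds the entire page in memory, so a 50MB page costs 50MB of buffering before anything reaches the cache.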