On Mon, Aug 20, 2012 at 12:19 PM, Anssi Kääriäinen <[email protected]>wrote:
> > > On 20 elo, 18:39, Tai Lee <[email protected]> wrote: > > I'd like to re-visit the discussion surrounding #7581 [1], a ticket about > > streaming responses that is getting quite long in the tooth now, which > > Jacob finally "accepted" 11 months ago (after a long time as DDN) and > said > > that it is clear we have to do *something*, but *what* remains to be > seen. > > > > I'd like to try provide a little refresher and summarise the options that > > have been suggested, and ask any core devs to please weigh in with their > > preference so that I can work up a new patch that will be more likely to > > gain acceptance. > > > > THE PROBLEM: > > > > 1. There are bugs and surprising behaviour that arise when creating an > > HttpResponse with a generator as its content, as a result of "quantum > > state" of `HttpResponse.content` (measuring it changes it). > > > > > > > > >>> from django.http import HttpResponse > > >>> r = HttpResponse(iter(['a', 'b'])) > > >>> r.content > > 'ab' > > >>> r.content > > '' > > >>> r2 = HttpResponse(iter(['a', 'b'])) > > >>> print r2.content > > ab > > >>> print r2.content > > >>> r3 = HttpResponse(iter(['a', 'b'])) > > >>> r3.content == r3.content > > > > False > > > > 2. Some middleware prematurely consume generator content by accessing > > `HttpResponse.content`, which can use a lot of memory and cause browser > > timeouts when attempting to stream large amounts of data or > > slow-to-generate data. > > > > There have been several tickets [2] [3] [4] and django-developers > > discussions [5] [6] [7] [8] about these issues. > > > > SOME USE CASES FOR STREAMING RESPONSES: > > > > A. Generating and exporting CSV data directly from the database. > > > > B. Restricting file access to authenticated users for files that may be > > hosted on external servers. > > > > C. Drip-feeding chunks of content to prevent timeout when requesting a > page > > that takes a long time to generate. > > > > OPTION 1: > > > > Remove support for "streaming" responses. If an iterator is passed in as > > content to `HttpResponse`, consume it in `HttpResponse.__init__()` to > > eliminate buggy behaviour. Middleware won't have to worry about what type > > of content a response has. > > > > Now that Jacob has accepted #7581 and said that it is clear we need to do > > *something*, I hope we can rule out this option. > > > > OPTION 2: > > > > Make `HttpResponse.__init__()` consume any iterator content, and add an > > `HttpResponseStreaming` class or an `HttpResponse(streaming=False)` > > argument. Allow middleware to check via `hasattr()` or `isinstance()` > > whether or not the response has generator content, and conditionally skip > > code that is incompatible with streaming responses. > > > > Some middleware will have to be updated for compatibility with streaming > > responses, and any 3rd party middleware that prematurely consumes > generator > > content will continue to work, only without the bugs (and potentially > with > > increased memory usage and browser timeouts). > > > > OPTION 3: > > > > Build a capabilities API for `HttpResponse` objects, and have middleware > > inspect responses to determine "can I read content?", "can I replace > > content?", "can I change etag?", etc. This will likely become a bigger > and > > more complicated design decision as we work out what capabilities we want > > to support. Some have argued that it should be sufficient to know if we > > have generator content or not, for all the cases that people have > reported > > so far. > > > > OPTION 4: > > > > Provide a way for developers to specify on an `HttpResponse` object or > > subclass that specific middleware should be skipped for that response. > This > > would be problematic because 3rd party views won't know what other > > middleware is installed in a project in order to name them for exclusion. > > > > OPTION 5: > > > > Add Yet Another Setting that would allow developers to define > > `CONDITIONAL_MIDDLEWARE_CLASSES` at a project level. At the project > level, > > developers would know which middleware classes they are using, and when > > they should be executed or skipped. This would give very fine grained > > control at a project level to match middleware conditionally with > > `HttpResponse` subclasses, without requiring any changes to existing or > 3rd > > party middleware. This could look something like this: > > > > MIDDLEWARE_CLASSES = ( > > 'django.middleware.common', > > ) > > > > CONDITIONAL_MIDDLEWARE_CLASSES = { > > 'exclude': { > > 'django.http.HttpResponseStreaming': ['django.middleware.common', > > 'otherapp.othermiddleware', ...], > > }, > > 'include': { > > 'myapp.MyHttpResponse': ['myapp.mymiddleware', ...], > > }, > > > > } > > My initial feeling is that anything accessing "large content" in > Python is broken and we should not try to make sure these things > continue to work. The problem is how we define if the content is too > large to access. Assuming any generator passed to the __init__ is > large maybe too drastic. > > We should either remove the content attribute altogether from > generator based responses, or at least raise an error if the content > is accessed a second time. The current behaviour is definitely broken, > and we should not silently return corrupt data, instead we should > complain loudly. > > To me option 5 feels like we are trying to work around the symptoms > instead of solving the underlying problem. > > - Anssi > > -- > You received this message because you are subscribed to the Google Groups > "Django developers" group. > To post to this group, send email to [email protected]. > To unsubscribe from this group, send email to > [email protected]. > For more options, visit this group at > http://groups.google.com/group/django-developers?hl=en. > > My inclination is that we should kill .content entirely, middleware that wants to rewrite the response will have two choices: 1) explicitly evaluate the entire response, and return a new HttpResponse object 2) return s anew HttpResponse object that has a generator that iteratively evaluates the previous one and rewrite it incrementally. Alex -- "I disapprove of what you say, but I will defend to the death your right to say it." -- Evelyn Beatrice Hall (summarizing Voltaire) "The people's good is the highest law." -- Cicero -- You received this message because you are subscribed to the Google Groups "Django developers" group. To post to this group, send email to [email protected]. To unsubscribe from this group, send email to [email protected]. For more options, visit this group at http://groups.google.com/group/django-developers?hl=en.
