I'd like to revisit the discussion surrounding #7581 [1], a long-standing 
ticket about streaming responses. Jacob finally "accepted" it 11 months ago 
(after a long time as DDN), saying that it is clear we have to do 
*something*, but *what* remains to be seen.

I'd like to provide a short refresher, summarise the options that have been 
suggested, and ask the core devs to please weigh in with their preference, 
so that I can work up a new patch that is more likely to gain acceptance.


THE PROBLEM:

1. There are bugs and surprising behaviour that arise when creating an 
`HttpResponse` with a generator as its content, as a result of the "quantum 
state" of `HttpResponse.content` (measuring it changes it).

>>> from django.http import HttpResponse
>>> r = HttpResponse(iter(['a', 'b'])) 
>>> r.content 
'ab'
>>> r.content 
''
>>> r2 = HttpResponse(iter(['a', 'b'])) 
>>> print r2.content
ab
>>> print r2.content

>>> r3 = HttpResponse(iter(['a', 'b']))
>>> r3.content == r3.content
False

2. Some middleware prematurely consumes generator content by accessing 
`HttpResponse.content`, which can use a lot of memory and cause browser 
timeouts when attempting to stream large amounts of data or 
slow-to-generate data.
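
To make problem 2 concrete, here is a minimal sketch that does not use Django at all. `FakeResponse`, `slow_chunks` and `length_middleware` are hypothetical stand-ins of my own, assuming only that `content` joins whatever iterator the response was given:

```python
class FakeResponse:
    """Hypothetical stand-in for HttpResponse; not Django's implementation."""

    def __init__(self, content):
        self._container = content  # kept as-is, not consumed

    @property
    def content(self):
        # Joining exhausts a one-shot iterator and buffers everything.
        return ''.join(self._container)

    def __iter__(self):
        return iter(self._container)


def slow_chunks(log):
    """Hypothetical generator; records when each chunk is produced."""
    for chunk in ('a', 'b', 'c'):
        log.append(chunk)
        yield chunk


def length_middleware(response):
    """Hypothetical middleware that measures the body before it is sent."""
    response.length = len(response.content)
    return response


log = []
response = length_middleware(FakeResponse(slow_chunks(log)))
produced_before_streaming = list(log)  # every chunk was already generated
streamed = list(response)              # nothing is left for the client
```

By the time the server iterates over the response, the whole body has been generated and held in memory, and the iterator itself is empty.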

There have been several tickets [2] [3] [4] and django-developers 
discussions [5] [6] [7] [8] about these issues.


SOME USE CASES FOR STREAMING RESPONSES:

A. Generating and exporting CSV data directly from the database.

B. Restricting file access to authenticated users for files that may be 
hosted on external servers.

C. Drip-feeding chunks of content to prevent timeout when requesting a page 
that takes a long time to generate.
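
As a rough illustration of use case A, a generator like the following could be passed as response content. This is only a sketch: `csv_chunks` and the sample rows are hypothetical, and in a real view the rows would come from a database cursor or queryset iterator.

```python
import csv
import io


def csv_chunks(rows):
    """Yield one CSV-encoded line per row, so rows need not all be in memory."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    for row in rows:
        writer.writerow(row)
        yield buffer.getvalue()
        # Reset the buffer so each chunk holds only the current row.
        buffer.seek(0)
        buffer.truncate(0)


# Hypothetical sample data standing in for a database result set.
rows = iter([('id', 'name'), (1, 'Ada'), (2, 'Grace')])
chunks = list(csv_chunks(rows))
```

Each chunk is a single line, so memory use stays flat as long as nothing upstream joins the chunks together.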


OPTION 1:

Remove support for "streaming" responses. If an iterator is passed in as 
content to `HttpResponse`, consume it in `HttpResponse.__init__()` to 
eliminate buggy behaviour. Middleware won't have to worry about what type 
of content a response has.

Now that Jacob has accepted #7581 and said that it is clear we need to do 
*something*, I hope we can rule out this option.
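
For comparison, option 1 amounts to something like the sketch below. `EagerResponse` is a hypothetical stand-in, not Django code; it shows why the bugs disappear and why streaming disappears with them.

```python
class EagerResponse:
    """Hypothetical stand-in; shows the idea only, not Django's code."""

    def __init__(self, content=''):
        if isinstance(content, str):
            self._content = content
        else:
            # Consume any iterable up front so later reads are repeatable.
            self._content = ''.join(content)

    @property
    def content(self):
        return self._content


r = EagerResponse(iter(['a', 'b']))
first_read = r.content
second_read = r.content
```

Both reads return the same string, but the full body is always built in memory before anything can be sent.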


OPTION 2:

Make `HttpResponse.__init__()` consume any iterator content, and add an 
`HttpResponseStreaming` class or an `HttpResponse(streaming=False)` 
argument. Allow middleware to check via `hasattr()` or `isinstance()` 
whether or not the response has generator content, and conditionally skip 
code that is incompatible with streaming responses.

Some middleware will have to be updated for compatibility with streaming 
responses, and any 3rd party middleware that prematurely consumes generator 
content will continue to work, only without the bugs (and potentially with 
increased memory usage and browser timeouts).
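
A minimal sketch of how option 2 might look from the middleware's side. All names here are hypothetical stand-ins (the real classes would live in `django.http`), and I am assuming the streaming class is checked with `isinstance()` and simply does not expose `content`:

```python
class Response:
    """Hypothetical stand-in for HttpResponse: consumes iterators on creation."""

    def __init__(self, content=''):
        self.content = content if isinstance(content, str) else ''.join(content)
        self.headers = {}

    def __iter__(self):
        return iter([self.content])


class StreamingResponse:
    """Hypothetical stand-in for the proposed HttpResponseStreaming."""

    def __init__(self, content):
        self.streaming_content = iter(content)  # left unconsumed
        self.headers = {}

    def __iter__(self):
        return self.streaming_content


def content_length_middleware(response):
    """Hypothetical middleware that skips work it cannot do on a stream."""
    if isinstance(response, StreamingResponse):
        return response  # length is unknown without consuming the body
    response.headers['Content-Length'] = str(len(response.content))
    return response


plain = content_length_middleware(Response(iter(['a', 'b'])))
stream = content_length_middleware(StreamingResponse(iter(['a', 'b'])))
streamed = list(stream)
```

The plain response behaves predictably even though it was given an iterator, and the streaming response reaches the server untouched.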


OPTION 3:

Build a capabilities API for `HttpResponse` objects, and have middleware 
inspect responses to determine "can I read content?", "can I replace 
content?", "can I change etag?", etc. This will likely become a bigger and 
more complicated design decision as we work out what capabilities we want 
to support. Some have argued that it should be sufficient to know if we 
have generator content or not, for all the cases that people have reported 
so far.
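
To give a feel for option 3, here is one possible shape for a capabilities API. The attribute names are my own invention based on the questions above, and the classes are hypothetical stand-ins rather than anything in Django:

```python
import hashlib


class Response:
    """Hypothetical buffered response that allows everything."""
    can_read_content = True
    can_replace_content = True
    can_change_etag = True

    def __init__(self, content):
        self.content = ''.join(content)
        self.headers = {}


class StreamingResponse:
    """Hypothetical streaming response that forbids reading the body."""
    can_read_content = False
    can_replace_content = False
    can_change_etag = True

    def __init__(self, content):
        self.streaming_content = iter(content)
        self.headers = {}


def etag_middleware(response):
    """Hypothetical middleware that asks before touching the content."""
    if response.can_read_content and response.can_change_etag:
        digest = hashlib.md5(response.content.encode('utf-8')).hexdigest()
        response.headers['ETag'] = '"%s"' % digest
    return response


plain = etag_middleware(Response(['a', 'b']))
stream = etag_middleware(StreamingResponse(['a', 'b']))
```

Every new flag is another thing that each middleware has to know about, which is where the design effort would go.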


OPTION 4:

Provide a way for developers to specify on an `HttpResponse` object or 
subclass that specific middleware should be skipped for that response. This 
would be problematic because 3rd party views won't know what other 
middleware is installed in a project in order to name them for exclusion.
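
The weakness of option 4 can be sketched as follows. The `skip_middleware` attribute, the handler function and the middleware names are all hypothetical:

```python
def apply_response_middleware(response, installed):
    """Run each (dotted_name, callable) pair unless the response opts out."""
    skip = getattr(response, 'skip_middleware', ())
    for name, process in installed:
        if name not in skip:
            response = process(response)
    return response


class StreamingResponse:
    """Hypothetical response that opts out of one middleware by name."""
    skip_middleware = ('thirdparty.middleware.Compress',)

    def __init__(self, content):
        self.streaming_content = iter(content)
        self.touched_by = []


def make_middleware(name):
    """Build a hypothetical middleware that only records that it ran."""
    def process(response):
        response.touched_by.append(name)
        return response
    return name, process


# The view's author guessed one name; the project also installed another.
installed = [
    make_middleware('otherapp.middleware.Minify'),
    make_middleware('thirdparty.middleware.Compress'),
]
response = apply_response_middleware(StreamingResponse(['a', 'b']), installed)
```

The middleware the view named is skipped, but the one it did not know about still runs.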


OPTION 5:

Add Yet Another Setting that would allow developers to define 
`CONDITIONAL_MIDDLEWARE_CLASSES` at a project level. At the project level, 
developers would know which middleware classes they are using, and when 
they should be executed or skipped. This would give very fine-grained 
control at a project level to match middleware conditionally with 
`HttpResponse` subclasses, without requiring any changes to existing or 3rd 
party middleware. This could look something like this:

MIDDLEWARE_CLASSES = (
    'django.middleware.common',
)

CONDITIONAL_MIDDLEWARE_CLASSES = {
    'exclude': {
        'django.http.HttpResponseStreaming': [
            'django.middleware.common', 'otherapp.othermiddleware', ...
        ],
    },
    'include': {
        'myapp.MyHttpResponse': ['myapp.mymiddleware', ...],
    },
}
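
A rough sketch of how a handler might resolve that setting for a given response class. The function name and the exact semantics (exclusions applied first, inclusions appended, lookup by dotted path only) are assumptions on my part, and a real implementation would probably also need to match subclasses:

```python
def middleware_for(response_path, middleware_classes, conditional):
    """Return the middleware paths to run for one response class path."""
    excluded = conditional.get('exclude', {}).get(response_path, [])
    included = conditional.get('include', {}).get(response_path, [])
    selected = [m for m in middleware_classes if m not in excluded]
    # Append response-specific middleware that is not already present.
    selected += [m for m in included if m not in selected]
    return selected


# Hypothetical project settings mirroring the example above.
MIDDLEWARE_CLASSES = ('django.middleware.common', 'otherapp.othermiddleware')
CONDITIONAL_MIDDLEWARE_CLASSES = {
    'exclude': {
        'django.http.HttpResponseStreaming': [
            'django.middleware.common', 'otherapp.othermiddleware',
        ],
    },
    'include': {
        'myapp.MyHttpResponse': ['myapp.mymiddleware'],
    },
}

for_streaming = middleware_for(
    'django.http.HttpResponseStreaming',
    MIDDLEWARE_CLASSES, CONDITIONAL_MIDDLEWARE_CLASSES)
for_custom = middleware_for(
    'myapp.MyHttpResponse',
    MIDDLEWARE_CLASSES, CONDITIONAL_MIDDLEWARE_CLASSES)
for_plain = middleware_for(
    'django.http.HttpResponse',
    MIDDLEWARE_CLASSES, CONDITIONAL_MIDDLEWARE_CLASSES)
```

Responses that are not mentioned in the setting get the normal `MIDDLEWARE_CLASSES`, so existing projects would be unaffected.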


MY TAKE:

I think that option 1 and option 4 are non-starters.

I think option 3 is perhaps a little overkill and will be more difficult to 
get committed once we start thinking about what capabilities we want to 
support.

I think option 2 is probably going to be the easiest solution. It's 
practically implemented and up-to-date already (missing docs and tests).

Although it does involve Yet Another Setting, I think option 5 provides the 
most flexibility, where it is most needed. It gives developers working at 
the project level a way to override and conditionally skip or execute 3rd 
party middleware without having to make any changes to 3rd party middleware.

I would be happy with either option 2 or 5, or a variation.


NEXT STEPS:

I'd really like to see this and the related tickets closed (preferably 
marked "fixed"!) :)

I'm specifically looking for opinions and direction from any of the core 
devs, especially those who have previously commented on the ticket or in 
the discussions, even if it is just to permanently reject some of the 
options.

I'd also like to hear from anyone who has a new use case or a new solution 
to suggest.

I will happily work up a new patch if required (including docs and tests), 
even if just a proof of concept. I just need to know which way the core 
devs would like me to go.

Thanks.
Tai.


[1] https://code.djangoproject.com/ticket/7581

[2] https://code.djangoproject.com/ticket/6027
[3] https://code.djangoproject.com/ticket/6527
[4] https://code.djangoproject.com/ticket/12214

[5] https://groups.google.com/d/topic/django-developers/Pg14uYSYwVk/discussion
[6] https://groups.google.com/d/topic/django-developers/UGwIoJUgWTw/discussion
[7] https://groups.google.com/d/topic/django-developers/9bpt8EAKnFc/discussion
[8] https://groups.google.com/d/topic/django-developers/RihIecNxpKE/discussion
