I'd like to revisit the discussion surrounding #7581 [1], a ticket about
streaming responses that is getting quite long in the tooth now. Jacob
finally "accepted" it 11 months ago (after a long time as DDN), saying
that it is clear we have to do *something*, but *what* remains to be seen.
I'd like to try to provide a little refresher and summarise the options that
have been suggested, and ask any core devs to please weigh in with their
preference so that I can work up a new patch that will be more likely to
gain acceptance.
THE PROBLEM:
1. There are bugs and surprising behaviour that arise when creating an
HttpResponse with a generator as its content, as a result of "quantum
state" of `HttpResponse.content` (measuring it changes it).
>>> from django.http import HttpResponse
>>> r = HttpResponse(iter(['a', 'b']))
>>> r.content
'ab'
>>> r.content
''
>>> r2 = HttpResponse(iter(['a', 'b']))
>>> print r2.content
ab
>>> print r2.content

>>> r3 = HttpResponse(iter(['a', 'b']))
>>> r3.content == r3.content
False
2. Some middleware prematurely consume generator content by accessing
`HttpResponse.content`, which can use a lot of memory and cause browser
timeouts when attempting to stream large amounts of data or
slow-to-generate data.
There have been several tickets [2] [3] [4] and django-developers
discussions [5] [6] [7] [8] about these issues.
SOME USE CASES FOR STREAMING RESPONSES:
A. Generating and exporting CSV data directly from the database.
B. Restricting file access to authenticated users for files that may be
hosted on external servers.
C. Drip-feeding chunks of content to prevent timeout when requesting a page
that takes a long time to generate.
OPTION 1:
Remove support for "streaming" responses. If an iterator is passed in as
content to `HttpResponse`, consume it in `HttpResponse.__init__()` to
eliminate buggy behaviour. Middleware won't have to worry about what type
of content a response has.
Now that Jacob has accepted #7581 and said that it is clear we need to do
*something*, I hope we can rule out this option.
OPTION 2:
Make `HttpResponse.__init__()` consume any iterator content, and add an
`HttpResponseStreaming` class or an `HttpResponse(streaming=False)`
argument. Allow middleware to check via `hasattr()` or `isinstance()`
whether or not the response has generator content, and conditionally skip
code that is incompatible with streaming responses.
Some middleware will have to be updated for compatibility with streaming
responses, and any 3rd party middleware that prematurely consumes generator
content will continue to work, only without the bugs (and potentially with
increased memory usage and browser timeouts).
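To make option 2 concrete, here is a minimal sketch in plain Python. It
does not use Django at all: the two response classes are simplified
stand-ins, and `content_length_middleware` is a hypothetical middleware
invented for illustration. It only shows the shape of the idea, not the
patch on the ticket.

```python
class HttpResponse(object):
    """Stand-in for a regular response (not Django's real class)."""

    def __init__(self, content=''):
        self.headers = {}
        if isinstance(content, str):
            self.content = content
        else:
            # Consume the iterator once, up front, so that repeated
            # reads of `content` always return the same value.
            self.content = ''.join(content)


class HttpResponseStreaming(object):
    """Hypothetical streaming response: iterable, with no `content`."""

    def __init__(self, chunks):
        self.headers = {}
        self._chunks = iter(chunks)

    def __iter__(self):
        return self._chunks


def content_length_middleware(response):
    """Hypothetical middleware that needs the whole body in memory."""
    if isinstance(response, HttpResponseStreaming):
        # Incompatible with streaming, so leave the response untouched.
        return response
    response.headers['Content-Length'] = str(len(response.content))
    return response
```

With this shape, `HttpResponse(iter(['a', 'b'])).content` is stable across
reads, and only code that explicitly asks for `HttpResponseStreaming` gets
lazy behaviour.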
OPTION 3:
Build a capabilities API for `HttpResponse` objects, and have middleware
inspect responses to determine "can I read content?", "can I replace
content?", "can I change etag?", etc. This will likely become a bigger and
more complicated design decision as we work out what capabilities we want
to support. Some have argued that it should be sufficient to know if we
have generator content or not, for all the cases that people have reported
so far.
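For comparison, a capabilities API might look roughly like the sketch
below. Everything here is an assumption made for illustration: the
capability names, the `can()` method, and both classes are hypothetical
and do not exist in Django.

```python
import hashlib


class Response(object):
    """Hypothetical response that advertises what middleware may do."""

    capabilities = frozenset(
        ['read_content', 'replace_content', 'change_etag'])

    def __init__(self, content):
        self.content = content
        self.headers = {}

    def can(self, capability):
        return capability in self.capabilities


class StreamingResponse(Response):
    # A streamed body cannot be read or replaced up front, and an ETag
    # cannot be computed without reading it.
    capabilities = frozenset()


def etag_middleware(response):
    """Hypothetical middleware that asks before touching the body."""
    if response.can('read_content') and response.can('change_etag'):
        digest = hashlib.md5(response.content.encode('utf-8')).hexdigest()
        response.headers['ETag'] = '"%s"' % digest
    return response
```

Even in this toy form, every middleware has to know which capabilities
it needs, which is where the design work would go.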
OPTION 4:
Provide a way for developers to specify on an `HttpResponse` object or
subclass that specific middleware should be skipped for that response. This
would be problematic because 3rd party views won't know what other
middleware is installed in a project in order to name them for exclusion.
OPTION 5:
Add Yet Another Setting that would allow developers to define
`CONDITIONAL_MIDDLEWARE_CLASSES` at a project level. At the project level,
developers would know which middleware classes they are using, and when
they should be executed or skipped. This would give very fine-grained
control at a project level to match middleware conditionally with
`HttpResponse` subclasses, without requiring any changes to existing or 3rd
party middleware. This could look something like this:
MIDDLEWARE_CLASSES = (
    'django.middleware.common',
)
CONDITIONAL_MIDDLEWARE_CLASSES = {
    'exclude': {
        'django.http.HttpResponseStreaming': ['django.middleware.common',
            'otherapp.othermiddleware', ...],
    },
    'include': {
        'myapp.MyHttpResponse': ['myapp.mymiddleware', ...],
    },
}
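A rough sketch of how such a setting could be resolved for a given
response follows. The semantics are my assumption, since they are not
pinned down above: 'exclude' drops the listed middleware for that response
class, and 'include' appends the listed middleware for that response
class. The `middleware_for` helper is hypothetical, and it matches on the
exact dotted path only (no subclass lookup).

```python
MIDDLEWARE_CLASSES = (
    'django.middleware.common',
    'otherapp.othermiddleware',
)
CONDITIONAL_MIDDLEWARE_CLASSES = {
    'exclude': {
        'django.http.HttpResponseStreaming': [
            'django.middleware.common',
            'otherapp.othermiddleware',
        ],
    },
    'include': {
        'myapp.MyHttpResponse': ['myapp.mymiddleware'],
    },
}


def middleware_for(response_path, middleware=MIDDLEWARE_CLASSES,
                   conditional=CONDITIONAL_MIDDLEWARE_CLASSES):
    """Return the middleware paths to run for one response class path."""
    excluded = conditional.get('exclude', {}).get(response_path, [])
    included = conditional.get('include', {}).get(response_path, [])
    # Keep the global order, minus anything excluded for this class.
    selected = [m for m in middleware if m not in excluded]
    # Then append class-specific middleware, avoiding duplicates.
    selected.extend(m for m in included if m not in selected)
    return selected
```

A real implementation would need to decide how subclasses are matched and
where included middleware sits in the ordering.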
MY TAKE:
I think that option 1 and option 4 are non-starters.
I think option 3 is perhaps a little overkill and will be more difficult to
get committed once we start thinking about what capabilities we want to
support.
I think option 2 is probably going to be the easiest solution. It's
practically implemented and up-to-date already (missing docs and tests).
Although it does involve Yet Another Setting, I think option 5 provides the
most flexibility, where it is most needed. It gives developers working at
the project level a way to override and conditionally skip or execute 3rd
party middleware without having to make any changes to 3rd party middleware.
I would be happy with either option 2 or 5, or a variation.
NEXT STEPS:
I'd really like to see this and the related tickets closed (preferably
marked "fixed"!) :)
I'm specifically looking for opinions and direction from any of the core
devs, especially those who have previously commented on the ticket or in
the discussions, even if it is just to permanently reject some of the
options.
I'd also like to hear from anyone who has a new use case or a new solution
to suggest.
I will happily work up a new patch if required (including docs and tests),
even if just a proof of concept. I just need to know which way the core
devs would like me to go.
Thanks.
Tai.
[1] https://code.djangoproject.com/ticket/7581
[2] https://code.djangoproject.com/ticket/6027
[3] https://code.djangoproject.com/ticket/6527
[4] https://code.djangoproject.com/ticket/12214
[5] https://groups.google.com/d/topic/django-developers/Pg14uYSYwVk/discussion
[6] https://groups.google.com/d/topic/django-developers/UGwIoJUgWTw/discussion
[7] https://groups.google.com/d/topic/django-developers/9bpt8EAKnFc/discussion
[8] https://groups.google.com/d/topic/django-developers/RihIecNxpKE/discussion