On Oct 21, 9:17 pm, Jim Dalton <[email protected]> wrote:
> On Oct 21, 2011, at 8:04 AM, Kääriäinen Anssi wrote:
>
> > I do not know nearly enough about caching to participate fully in this
> > discussion. But it strikes me that the attempt to have CSRF protected
> > anonymous page cached is not that smart. If you have an anonymous
> > submittable form, why bother with CSRF protection? I mean, what is it
> > protecting against? Making complex arrangements in the caching layer for
> > this use case seems like wasted effort. Or am I missing something obvious?
>
> First issue is that CSRF can matter for anonymous users. From
> here http://www.squarefree.com/securitytips/web-developers.html#CSRF:
>
> Attacks can also be based on the victim's IP address rather than cookies:
>
> • Post an anonymous comment that is shown as coming from the victim's
> IP address.
> ...
> • Perform a distributed password-guessing attack without a botnet.
> (This assumes they have a way to tell whether the login succeeded, perhaps by
> submitting a second form that isn't protected against CSRF.)
>
> So two very common use cases for anonymous forms are log in forms and
> anonymous comment forms, both of which are potentially vulnerable. I guess I
> feel like it's quite common to have forms on a page these days even for
> anonymous users.
>
> Second is -- and I don't know about this -- but I don't know how well CSRF
> handles authentication conditionally. Like if I have a page and let's say
> that page has forms in it for logged in users but nothing for anonymous user,
> can I conditionally exempt the formless page from CSRF? I have no idea, but
> by default I presume it's on and I presume the cache is varying on it.
>
> So, yes, you could probably optimize a lot of this to sort of skip around the
> CSRF issue and it's not a deal breaker. But my main argument has been the
> ubiquity of CSRF + user authentication in Django projects to me means a
> solution to both of these is a requirement for page caching to become easy
> and applicable in most scenarios.
I can see how the above-mentioned cases are useful, and as you say,
they are probably common in real-world usage.
I took a different approach to phased template rendering in
[https://github.com/akaariai/django/tree/rewritable_content]. I hope it
will give some insight into rewriting already rendered content that
contains a csrf_token.
The idea is that template.render(context) returns a subclass of
SafeUnicode instead of just SafeUnicode. The subclass knows the
positions of rewritable parts of the content (csrf token values, for
example), and also how to rewrite those parts of the content. So, from
a template {% csrf_token %} you could get something like this back:
>>> rendered = tmpl.render(Context({'csrf_token': 'CSRF_TOKEN_VALUE'}))
>>> str(rendered)
<input type="hidden" value="CSRF_TOKEN_VALUE" /> # (pseudoish...)
>>> rendered.rewritable_parts
{'csrf_token': [(27, 42)]}
# (a dictionary of rewritable name -> list of str positions where that
# block exists)
>>> rendered.rewrite({'csrf_token': 'NEW_VALUE'})
<input type="hidden" value="NEW_VALUE" />
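To make the idea concrete outside Django, here is a minimal standalone sketch of such a string subclass. It is not the code from the branch: the class name RewritableStr is hypothetical, it subclasses plain str rather than SafeUnicode, and it assumes positions are half-open (start, end) index pairs.

```python
class RewritableStr(str):
    """A str that remembers where named, replaceable spans live.

    Hypothetical illustration only; not the class from the branch.
    """

    def __new__(cls, value, rewritable_parts=None):
        obj = super().__new__(cls, value)
        # name -> list of (start, end) half-open index pairs into the string
        obj.rewritable_parts = dict(rewritable_parts or {})
        return obj

    def rewrite(self, new_values):
        """Return a new RewritableStr with the named spans replaced."""
        spans = sorted(
            (start, end, name)
            for name, positions in self.rewritable_parts.items()
            for start, end in positions
        )
        out = []
        new_parts = {}
        cursor = 0  # how far we have copied from the original string
        length = 0  # length of the output built so far
        for start, end, name in spans:
            out.append(self[cursor:start])
            length += start - cursor
            # Names without a new value keep their current text.
            replacement = new_values.get(name, self[start:end])
            out.append(replacement)
            # Record where the span ends up, so the result can be rewritten again.
            new_parts.setdefault(name, []).append(
                (length, length + len(replacement))
            )
            length += len(replacement)
            cursor = end
        out.append(self[cursor:])
        return RewritableStr("".join(out), new_parts)


html = '<input type="hidden" value="CSRF_TOKEN_VALUE" />'
start = html.index("CSRF_TOKEN_VALUE")
rendered = RewritableStr(
    html, {"csrf_token": [(start, start + len("CSRF_TOKEN_VALUE"))]}
)
print(rendered.rewrite({"csrf_token": "NEW_VALUE"}))
```

Because rewrite() recomputes the positions, the result can be rewritten again even when the new value has a different length than the old one.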
There are some tests in the github branch. Those tests are the best
documentation currently available.
Rewritable rendered templates should be usable in automatic handling
of csrf_token when solving the caching problem. If you do no caching,
the user will get a normal response. If you do caching, then you will
need a hook to do the response.rewrite for the csrf_token in cache
fetching. This has been discussed already, and seems to be solvable.
The actual rewrite of the content would be easy: it is just
response.rewrite({'csrf_token': 'new_csrf_token_value'}). This way it
could be possible to cache pages containing csrf_token transparently
to the user.
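As a rough illustration of that cache-fetch hook, the sketch below stores the rendered page together with the token positions and splices in the current token on every fetch. Everything here is hypothetical (RewritingPageCache, splice, the dict-backed store); it does not use Django's cache framework or middleware, and it assumes half-open (start, end) positions.

```python
def splice(content, spans, replacement):
    """Replace each (start, end) span of content with replacement."""
    out, cursor = [], 0
    for start, end in sorted(spans):
        out.append(content[cursor:start])
        out.append(replacement)
        cursor = end
    out.append(content[cursor:])
    return "".join(out)


class RewritingPageCache:
    """Toy in-memory page cache that rewrites the CSRF token on fetch."""

    def __init__(self):
        self._store = {}

    def set(self, key, content, csrf_spans):
        # Keep the positions next to the content; the cached token value
        # itself is irrelevant because it is replaced on every fetch.
        self._store[key] = (content, list(csrf_spans))

    def get(self, key, csrf_token):
        entry = self._store.get(key)
        if entry is None:
            return None  # cache miss: the caller renders the page normally
        content, csrf_spans = entry
        return splice(content, csrf_spans, csrf_token)


cache = RewritingPageCache()
page = '<form><input type="hidden" value="TOKEN_A" /></form>'
pos = page.index("TOKEN_A")
cache.set("/comments/", page, [(pos, pos + len("TOKEN_A"))])
print(cache.get("/comments/", "TOKEN_B"))
```

The cached entry is never modified, so every request is rewritten from the same stored positions.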
The github branch also implements a {% rewritable some_name %}
{% endrewritable %} tag, but as it stands it is not very usable. For example,
rewriting the login/logout part of the page would be much easier using
a real two-phase rendering implementation. The already mentioned
Jannis Leidel's django-phased seems to fit this task much better than
my hack.
As far as I can tell there isn't any large performance hit (actually,
using djangobench, I could not measure any difference). This might be
just a failure on my part, as that result is a bit surprising.
The biggest problem with the approach is that the csrf_token tag must
be rendered as part of a nodelist. If it isn't, the tracking of the
(start_pos, end_pos) of the rewritable content will get out of sync. This
alone might be a show-stopper. I would not be surprised if there are
other non-solvable problems with the approach. All I know is that it
seems to work with include and block tags in simple templates.
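The out-of-sync problem can be shown with a small standalone sketch. It assumes each node renders to a (text, positions) pair and that the nodelist shifts the positions by the running output length; render_nodelist and this pair format are hypothetical stand-ins, not the branch's actual code.

```python
def render_nodelist(fragments):
    """Concatenate (text, spans) fragments, shifting spans by the offset.

    spans maps a rewritable name to a list of (start, end) positions that
    are relative to the fragment's own text.
    """
    pieces, parts, offset = [], {}, 0
    for text, spans in fragments:
        for name, positions in spans.items():
            for start, end in positions:
                parts.setdefault(name, []).append(
                    (start + offset, end + offset)
                )
        pieces.append(text)
        offset += len(text)
    return "".join(pieces), parts


token_node = ('<input value="TOKEN" />', {"csrf_token": [(14, 19)]})

# Rendered as part of the nodelist: the positions stay correct.
html, parts = render_nodelist([("<form>", {}), token_node, ("</form>", {})])
(start, end), = parts["csrf_token"]
print(html[start:end])

# A tag that renders the token itself and wraps the output, without
# adjusting the positions, leaves them pointing at the wrong characters.
wrapped = ("<div>" + token_node[0] + "</div>", token_node[1])
html2, parts2 = render_nodelist([("<form>", {}), wrapped, ("</form>", {})])
(start2, end2), = parts2["csrf_token"]
print(html2[start2:end2])
```

In the second case the recorded span no longer covers the token, so a later rewrite would corrupt the markup instead of replacing the token.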
The current implementation is just a quick hack. As said above, it is
possible, if not likely, that this approach is a dead-end.
- Anssi