On Sun, 2007-11-18 at 12:04 +0300, Ivan Sagalaev wrote:
> Malcolm, first of all, I should apologize. I actually intended my letter 
> to be 'funny' but after your answer I understand that it was just harsh 
> :-(. I'm sorry.

Fair enough. I misunderstood your intent. No hard feelings. :-)

[...]
> > So Django
> > notes that the input was a SafeData and the function is marked is_safe
> > and, thus, it calls mark_safe() on the result so that you don't have to
> > in your filter
> 
> This is my first misunderstanding. mark_safe seems trivial enough, why 
> not just use it instead of .is_safe=True on a filter?
> 
> Looking at this from template/__init__.py:
> 
>      if getattr(func, 'is_safe', False) and isinstance(obj, SafeData):
>          obj = mark_safe(new_obj)
> 
> they're essentially equivalent modulo type checking. Why doesn't 
> mark_safe do type checking itself?

I'm not sure I understand what you're suggesting. Here's another
explanation of what's going on in those two lines:

        - 'obj' is initially the data we pass to the filter. The thing
        that is being filtered.
        
        - 'new_obj' is what the filter returns.
        
        - now if 'obj' (the original input) was safe *and* the filter
        says that safe input will generate safe output (func.is_safe ==
        True), we can mark the output as safe.
        
It's not really easy to collapse all this into mark_safe() because
here mark_safe() is acting on the new result based on the state of the
original object. So you'd end up having to pass two things to
mark_safe() in this isolated case.
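
As a rough sketch of the logic in those two lines (this is not Django's
actual code: SafeString, mark_safe() and apply_filter() below are
simplified stand-ins for illustration, and add_x is a made-up filter):

```python
class SafeString(str):
    """Simplified stand-in for Django's SafeData: a str known to be safe."""


def mark_safe(value):
    # Simplified stand-in for Django's mark_safe().
    return SafeString(value)


def apply_filter(func, obj):
    # 'obj' is the data passed to the filter; 'new_obj' is what it returns.
    new_obj = func(obj)
    # The output is marked safe only if the *input* was safe and the
    # filter promises that safe input produces safe output.
    if getattr(func, "is_safe", False) and isinstance(obj, SafeString):
        new_obj = mark_safe(new_obj)
    return new_obj


def add_x(value):
    # Ordinary string manipulation: the result is a plain str, even when
    # 'value' was a SafeString.
    return value + "x"


add_x.is_safe = True
```

With this, apply_filter(add_x, SafeString("a")) comes back marked safe,
while apply_filter(add_x, "a") stays an ordinary string. The decision
depends on the original input, which the filter itself never has to
inspect.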

If we didn't have is_safe, every filter that did some kind of string
manipulation such as input = input + 'x' would need to end with lines
like

        if isinstance(orig_input, SafeData):
            result = mark_safe(result)
        return result
        
and they would have to remember to save the original input (or test its
type very early). So the 'is_safe' attribute is a way for filter authors
to say "I don't want to worry about marking this result safe if the
input is safe. I know I'm not introducing unsafe characters, so Django
can take care of that".

> 
> > Because you wouldn't be able to write a filter that worked correctly in
> > both auto-escaping and non-auto-escaping environments, which is a
> > compulsory requirement in most cases. You don't want to escape inside
> > the filter if the current context doesn't have auto-escaping in effect.
> 
> Uhm.. This is the second thing I'm missing. I thought that {% autoescape 
> off %} is a backward-compat measure. So .needs_autoescape exists only 
> for filters that used to do non-safe output and should behave as such in 
> a non-autoescaped environment. And I thought that in a new era all new 
> filters and tags actually should *always* return safe values. No?

Not really. Okay, three cases to think about here. Firstly, there are
some people who have deep objections to auto-escaping for various
reasons. They want to be able to turn it off and never use it.
Apparently their code never contains any bugs and we're just slowing
them down. :-)

More seriously (case #2), the conversion from Django 0.96 to
accommodating auto-escaping is not entirely trivial. Jeremy Dunck has
estimated about a week's worth of time for him to port some of the stuff
he maintains (which I'm guessing includes the Ellington instance he
works with). For large projects, they might be running with
auto-escaping off for quite a while yet. This might not affect your
use-cases, but there are going to be some people writing applications
intended to be used by the general unknown public and, in those cases,
writing to be able to work in both situations will be good practice.

Finally, auto-escaping is only appropriate for HTML text. You don't want
it on in templates that generate email, or text documents, or even
Javascript fragments (you'd be amazed at how poorly "if (2 &lt; 3)"
works in Javascript). So there will be quite legitimate cases when you
want to wrap entire blocks of output in "{% autoescape off %}...{%
endautoescape %}" sections. However, some of your filters might still be
useful in those sorts of sections. Imagine, for example, a filter that
always replaced the word "and" by "&". It will need to behave
differently in different auto-escaping contexts (use "&amp;" in HTML
templates, and "&" in email). If you let Django handle the autoescaping
by doing nothing to your output, that's fine. It'll work. But if you
also need to add raw HTML, as in your examples, you need to know when to
escape things yourself. Hence "needs_autoescape".
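
A minimal sketch of such a filter, assuming it does its own escaping
when auto-escaping is in effect (SafeString and mark_safe() are
simplified stand-ins rather than Django's classes, the filter name
"ampersand" is made up, and html.escape() from the standard library
stands in for Django's escape()):

```python
import html
import re


class SafeString(str):
    """Simplified stand-in for Django's SafeData."""


def mark_safe(value):
    # Simplified stand-in for Django's mark_safe().
    return SafeString(value)


def ampersand(value, autoescape=None):
    if autoescape:
        # HTML context: escape input that isn't already safe, then emit
        # the entity form and mark the whole result as safe.
        if not isinstance(value, SafeString):
            value = html.escape(value)
        return mark_safe(re.sub(r"\band\b", "&amp;", value))
    # Email, plain text, etc.: no escaping at all, use the bare character.
    return re.sub(r"\band\b", "&", value)


# Ask to be told whether auto-escaping is currently in effect.
ampersand.needs_autoescape = True
```

So "salt and pepper" becomes "salt &amp; pepper" when autoescape is
True and "salt & pepper" otherwise.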

> > I'll have one more pass at it and after that I look forwards to reading
> > your patch to improve things.
> 
> I will certainly try to do this.

I've rewritten most of the filtering and auto-escaping section (in
[6692]). Have a read of it and see if it makes more sense from the point
of view of where you were 24 hours ago. I've tried to approach it from a
different direction, hopefully motivating things a bit more without
getting us bogged down in unimportant details.

> >> For example. I'm writing a filter that gets a string and wraps its 
> >> first letter in a <b>...</b>. I'm going to split the first letter, 
> >> conditional_escape the letter and the rest, wrap a letter in <b>...</b>, 
> >> concatenate and mark_safe. Now, should I stick .is_safe?
> > 
> > If you're always returning a safe string, then adding is_safe is a
> > no-op.
> 
> Yes, but *should* I always return a safe string? I believe in my case I 
> really should because I'm returning some HTML and nothing after my 
> filter could magically decipher it and escape parts of the string that I 
> didn't escape. Right?

That's correct.

> If yes, does it mean that I should use .needs_autoescape to know if my 
> input was already escaped manually (if autoescape is None) or I should 
> do it myself (if autoescape == True)?

Well if autoescape == False in such a method, you should do *no*
escaping of your output, since it's being used in, e.g., an email or
Javascript or something. As an aside, the reason I chose autoescape=None
as the default there was so that filters could be written that worked
with Django 0.96 (if autoescape is None, you are using a pre-autoescape
version of Django and can conditionally import mark_safe() and friends
only if autoescape == True). That was a subtle trick 15 months ago that
I possibly should have removed in the final version, but it does no real
harm.

If autoescape == True, you should escape all data that isn't already
marked as safe. There is a function
django.utils.html.conditional_escape() that makes this easier. It's like
escape() except it doesn't do anything on SafeData instances. I forgot
to document conditional_escape() earlier, but it's in the new version.
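
To make those rules concrete, here is a hedged sketch of the
first-letter filter discussed above. SafeString, mark_safe() and
conditional_escape() are simplified stand-ins written for this example
(the real ones live in Django), and initial_letter is a hypothetical
filter name:

```python
import html


class SafeString(str):
    """Simplified stand-in for Django's SafeData."""


def mark_safe(value):
    # Simplified stand-in for Django's mark_safe().
    return SafeString(value)


def conditional_escape(value):
    # Like escape(), except it leaves already-safe strings untouched.
    if isinstance(value, SafeString):
        return value
    return mark_safe(html.escape(value))


def initial_letter(text, autoescape=None):
    # Decide about escaping *before* slicing: slices of a str subclass
    # are plain str objects, so the "safe" marker would be lost.
    if autoescape and not isinstance(text, SafeString):
        esc = conditional_escape
    else:
        # Auto-escaping is off (or the input is already safe): no escaping.
        esc = lambda s: s
    first, rest = text[:1], text[1:]
    # We introduce raw HTML ourselves, so the result is marked safe.
    return mark_safe("<b>%s</b>%s" % (esc(first), esc(rest)))


initial_letter.needs_autoescape = True
```

Note how the safety test is made on the original input, very early,
which is the same bookkeeping that is_safe saves you from in the
simpler string-manipulation case.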

For an example of how all this pulls together, see either the new
example in the docs (which looks a lot like your example) or see, say,
the linebreaks filter in django.template.defaultfilters, which is a
perfect example of something that is introducing HTML into safe or
unsafe input data. Under no circumstances look at urlize for an example
of how to handle mixed content. It gives *me* nose bleeds
(unsurprisingly, it's the one I've screwed up the most so far).

Hopefully this clears up some of your questions. As I said, I've tried
again with the documentation. I'm going to leave it alone for a while
now and let the madding crowds file patches for a bit (and let Adrian
sharpen his blue pencil and go to work editing it).

Regards,
Malcolm

-- 
The sooner you fall behind, the more time you'll have to catch up. 
http://www.pointy-stick.com/blog/


--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups 
"Django users" group.
To post to this group, send email to django-users@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/django-users?hl=en
-~----------~----~----~----~------~----~------~--~---
