Nice find - I did pretty much the same thing, but using lxml.Cleaner.
This seems more configurable; I'm probably going to change mine over
to this instead.

Generally the rule with script injection is to scrub and filter on
output, because that's the last line of defense. However, for
situations like this, if you know the cleaned HTML content will not be
tampered with elsewhere or on output, I think it should be acceptable.

You can of course scrub and filter only on output instead, at the
expense of some performance. That would be the safest option.

Personally, I took the first approach - scrub on input, then just
retrieve the cleaned string and mark it as safe on output.
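
Since with that approach the HTML is only scrubbed once on input, the
scrubber carries all the weight, so use a proper library like bleach for
it. Purely to illustrate the whitelist idea, here's a minimal
stdlib-only sketch (the ALLOWED set and the drop-all-attributes policy
are my own simplifications - bleach lets you whitelist attributes per
tag instead of dropping them all):

```python
# Minimal sketch of whitelist-based scrubbing on input, stdlib only.
# Not a substitute for bleach - just shows the idea.
from html.parser import HTMLParser
from html import escape

ALLOWED = {"b", "i", "em", "strong", "p", "a"}  # example whitelist

class Scrubber(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag in ALLOWED:
            # Drop all attributes, so onclick=, href="javascript:..."
            # and friends disappear along with them.
            self.out.append(f"<{tag}>")

    def handle_endtag(self, tag):
        if tag in ALLOWED:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        # Re-escape text content so stray < > & can't smuggle markup in.
        self.out.append(escape(data))

def scrub(html_text):
    parser = Scrubber()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)
```

On output you'd then just wrap the stored, already-cleaned string with
Django's mark_safe() (or the |safe template filter) rather than
re-cleaning it every time.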

On 30 June 2010 19:38, Tor Nordam <tor.nor...@gmail.com> wrote:
> I'm developing a blog application in django, and I've been looking
> into ways to clean the input which will allow safe html tags, while
> removing all the evil stuff. I came across the tool bleach (
> http://github.com/jsocol/bleach ), which seems to be easy to use.
>
> I was just wondering if anyone has any experience or advice to offer.
> Also, it seems to me that the way to go about this is to filter the
> text with bleach upon input, and then store the cleaned text in the
> database, marking it as safe upon output. Is that the correct way to
> do this?
>
> --
> You received this message because you are subscribed to the Google Groups 
> "Django users" group.
> To post to this group, send email to django-us...@googlegroups.com.
> To unsubscribe from this group, send email to 
> django-users+unsubscr...@googlegroups.com.
> For more options, visit this group at 
> http://groups.google.com/group/django-users?hl=en.
>
>
