Michael Repucci wrote:

> But there are thousands of sites (e.g., message boards, blog
> sites, etc.) where users can post messages that contain HTML - as I'd
> like to do with my application - so I suspect that there must be
> a relatively simple solution.

That depends on what you mean by "simple". The only
way to be sure that you're only allowing safe HTML is
to fully parse it as HTML and only allow through tags
of your choice. If you happen to have an HTML parsing
library available, then it might not be too complicated
to do this.
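
As a rough illustration of the parse-and-allow approach, here is
a minimal sketch using Python's standard html.parser module. The
allowlist contents and the names AllowlistSanitizer and sanitize
are assumptions made up for this example, not something from an
existing library. It drops every attribute, silently discards
tags that are not on the list, and re-escapes all text content.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical allowlist; choose the tags your application needs.
ALLOWED_TAGS = {"b", "i", "em", "strong", "p"}


class AllowlistSanitizer(HTMLParser):
    """Re-emit only allowed tags (without attributes) and escaped text."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def handle_starttag(self, tag, attrs):
        # Attributes are dropped entirely, so onclick=, style=, href=
        # and similar never reach the output.
        if tag in ALLOWED_TAGS:
            self.out.append("<%s>" % tag)

    def handle_endtag(self, tag):
        if tag in ALLOWED_TAGS:
            self.out.append("</%s>" % tag)

    def handle_data(self, data):
        # Text is escaped again, so stray "<" or "&" cannot form markup.
        self.out.append(escape(data))


def sanitize(text):
    parser = AllowlistSanitizer()
    parser.feed(text)
    parser.close()
    return "".join(parser.out)


result = sanitize('<b onclick="x()">hi</b><script>alert(1)</script>')
```

Note that this sketch does not check that the allowed tags are
properly nested or closed; a real application would probably
want to handle that as well.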

Another approach I've seen suggested is to first
HTML-escape everything, and then search for the
escaped versions of the tags you want to allow and
replace them by unescaped ones.
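
A minimal sketch of that escape-then-restore idea, assuming a
small allowlist of attribute-free tags. The function name
escape_then_restore and the tag list are hypothetical. Only the
exact escaped form of a bare tag is turned back into markup, so
anything carrying attributes stays escaped.

```python
import re
from html import escape

# Hypothetical allowlist of tags that may appear without attributes.
ALLOWED_TAGS = ("b", "i", "em", "strong")

# Matches only the escaped form of a bare allowed tag,
# e.g. "&lt;b&gt;" or "&lt;/b&gt;".
_ESCAPED_TAG = re.compile(
    r"&lt;(/?)(%s)&gt;" % "|".join(ALLOWED_TAGS), re.IGNORECASE
)


def escape_then_restore(text):
    # Step 1: escape everything, so no user-supplied markup survives.
    escaped = escape(text)
    # Step 2: un-escape only the exact tags on the allowlist.
    return _ESCAPED_TAG.sub(
        lambda m: "<%s%s>" % (m.group(1), m.group(2).lower()), escaped
    )


restored = escape_then_restore("<b>hi</b> <script>x</script>")
with_attr = escape_then_restore('<b onclick="x()">hi</b>')
```

Here the search runs over text that is already safe, and the
pattern describes what to allow rather than what to forbid. The
sketch does not guarantee that the restored tags are balanced.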

What you should NOT do is attempt to sanitize things
by using regular expressions to find "bad" things and
remove them. There are too many ways that such
simple-minded approaches can be circumvented, and you'll
never be sure you've thought of them all.
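
To make the risk concrete, here is one well-known way a
remove-the-bad-things filter can be defeated. The function
naive_strip_script is a hypothetical example of such a filter,
written only to show the failure; it is not a recommendation.

```python
import re


def naive_strip_script(text):
    # A denylist-style filter: delete anything that looks like an
    # opening or closing <script> tag. Do NOT rely on this.
    return re.sub(r"</?script[^>]*>", "", text, flags=re.IGNORECASE)


# The attacker nests a script tag inside a broken script tag.
# Removing the inner tags splices the outer pieces back together.
payload = "<scr<script>ipt>alert(1)</scr</script>ipt>"
filtered = naive_strip_script(payload)
# filtered is now "<script>alert(1)</script>"
```

Running the filter repeatedly until nothing changes would close
this particular hole, but it is only one of many tricks, which
is exactly the problem with the approach.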

It's very easy to get this stuff wrong if you don't
approach it from the right direction. It's quite likely
that many of those "thousands of sites" are not as
secure as they think they are in this area.

-- 
Greg
