Yes, you can escape both a and b such that it works in either context.

Reference rules #1 and #2 at
http://www.owasp.org/index.php/XSS_(Cross_Site_Scripting)_Prevention_Cheat_Sheet#RULE_.231_-_HTML_Escape_Before_Inserting_Untrusted_Data_into_HTML_Element_Content

Rule #1 states that data inserted into HTML element content (variable b in your
example) should escape these 6 characters:

& --> &amp;
< --> &lt;
> --> &gt;
" --> &quot;
' --> &#x27;   (&apos; is not recommended)
/ --> &#x2F;   (forward slash is included as it helps end an HTML entity)

Many implementations leave off the forward slash, though. Escaping these
characters is enough to make variable b safe in the <div> body context.
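
A minimal sketch of such a routine, assuming Python 3. The name escape_html
and the lookup table are hypothetical and only illustrate rule #1; this is
not web2py's actual implementation.

```python
# Hypothetical helper illustrating rule #1; not part of web2py.
_HTML_ESCAPES = {
    "&": "&amp;",
    "<": "&lt;",
    ">": "&gt;",
    '"': "&quot;",
    "'": "&#x27;",
    "/": "&#x2F;",
}

def escape_html(text):
    # Map one character at a time so that the "&" produced by an
    # earlier replacement is never escaped a second time.
    return "".join(_HTML_ESCAPES.get(ch, ch) for ch in text)

print(escape_html(' x"y '))  # prints:  x&quot;y
```

Working character by character avoids the ordering problem of chained
str.replace calls, where "&" has to be handled first.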

Variable a is being inserted into an HTML attribute. As per rule #2...

"Except for alphanumeric characters, escape all characters with ASCII values
less than 256 with the &#xHH; format (or a named entity if available) to
prevent switching out of the attribute. The reason this rule is so broad is
that developers frequently leave attributes unquoted. Properly quoted
attributes can only be escaped with the corresponding quote. Unquoted
attributes can be broken out of with many characters, including [space] % *
+ , - / ; < = > ^ and |."

I have spoken to the author of this text, and he indicates this is
*overencoding*. The overencoding is necessary because developers often leave
attributes unquoted. If the attribute is quoted, the only way to break out of
the quoted context is with the corresponding quote. For a few reasons
dealing with the sequence of parsers, we highly advise encoding all 6 of the
characters above. This is the same routine as above for rule #1. It makes
data safe for inclusion in *quoted* HTML attributes but NOT in *unquoted* HTML
attributes. Thus, there are warnings on the pythonsecurity.org wiki, which we
would also like to see in template engine documentation, telling developers
to *always* quote HTML attributes. Or, if you wish to be super-safe and
do the best thing possible, follow the quoted advice above and encode all
non-alphanumerics below 256.
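
The super-safe variant from rule #2 could be sketched as follows, assuming
Python 3. The function name escape_html_attr is hypothetical, and treating
only ASCII letters and digits as "alphanumeric" is an assumption of this
sketch rather than something the rule spells out.

```python
import string

# Assumption: only ASCII letters and digits count as alphanumeric.
_SAFE_CHARS = set(string.ascii_letters + string.digits)

def escape_html_attr(text):
    # Hypothetical helper illustrating rule #2; not part of web2py.
    out = []
    for ch in text:
        code = ord(ch)
        if code < 256 and ch not in _SAFE_CHARS:
            # Encode as &#xHH; so the character cannot end the attribute.
            out.append("&#x%02X;" % code)
        else:
            out.append(ch)
    return "".join(out)

print(escape_html_attr(' x"y '))  # prints: &#x20;x&#x22;y&#x20;
```

With this encoding, spaces, quotes, and the other characters listed in the
quoted rule are all replaced, so the value stays inside the attribute even
when the attribute is left unquoted.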

I hope this sufficiently answers your question. In practice, an escaping
routine escaping all 6 of the characters above should be created and used
for any variables handled by the template engine unless marked as safe.
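
As a rough illustration of "escape unless marked as safe", assuming Python 3:
the SafeString class and render_value function below are hypothetical names
for this sketch, not web2py's API. html.escape with quote=True covers
& < > " and ', and the forward slash is added by hand.

```python
import html

class SafeString(str):
    """Hypothetical marker for markup the developer has explicitly trusted."""

def render_value(value):
    # Values marked as safe are emitted unchanged.
    if isinstance(value, SafeString):
        return str(value)
    # Everything else gets all 6 characters escaped. Replacing "/" last
    # is harmless because none of the generated entities contain a slash.
    escaped = html.escape(str(value), quote=True)
    return escaped.replace("/", "&#x2F;")

data = '" onload="alert(1);" bad="'
print('<body class="%s"></body>' % render_value(data))
```

With the quotes escaped, the value from the earlier example can no longer
close the class attribute and add an onload handler.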

Best,
Craig

On Wed, Jul 14, 2010 at 12:25 PM, mdipierro <mdipie...@cs.depaul.edu> wrote:

> here is the problem as I see it
>
> #controller
> def index(): return dict(a=' x"y ', b=' x"y ')
>
> #view
> <div onclick="{{=a}}">{{=b}}</div>
>
> Notice that a and b have the same value. a should be escaped as x\"y
> while this escaping would be wrong for b.
> Are you telling me there is a way to escape both a and b that works in
> both way whatever the context?
> If there is I do not know about it.
>
> Massimo
>
>
> On 14 Lug, 09:52, Craig Younkins <cyounk...@gmail.com> wrote:
> > I want to re-raise this issue because I feel it is important.
> >
> > > > * Do not use cgi.escape for HTML escaping because it does not escape
> > > > single quotes and may lead to XSS - See
> >
> > http://www.pythonsecurity.org/wiki/web2py/#cross-site-scripting-xss
> > <http://www.pythonsecurity.org/wiki/web2py/#cross-site-scripting-xss>
> >
> > > > and  http://www.pythonsecurity.org/wiki/cgi/<
> http://www.google.com/url?sa=D&q=http://www.pythonsecurity.org/wiki/c...>
> > > I assume you refer to attribute escaping. When using helpers like
> >
> >  > {{=A(link,_href=url)}} then link is escaped using cgi.escape but url
> >
> > > is escaped differently (quotes are escaped). The problem is that the
> > > escape function does not know whether a variable is to be inserted in
> > > html, css, js, attribute, a string in js, etc. etc. and therefore if
> > > the function does know the context it is in it can never always escape
> > > correcly. I do not believe there is a general solution to this
> > > problem. web2py assumes {{=....}} is escaping HTML/XML. If you need to
> > > scape attributes we suggest using helpers.  If you need to scape js
> > > code or strings in js code, you may have to do it manually.
> >
> > That's not quite what I was getting at. You're right about needing the
> > context in order to escape correctly though. I think the default escaping
> > should include single and double quotes. cgi.escape escapes double quotes
> > but not single quotes.
> >
> > I thought that the default escaping was going through cgi.escape by way
> of
> > the xmlescape method, but given the below, that appears to not be the
> case.
> > I'm a little confused.
> >
> > Here's an example of something I don't think I should be able to do:
> >
> > Controller:         return dict(data='" onload="alert(1);" bad="')
> > View:               <body class="{{=data}}"></body>
> > Output:            <body class="" onload="alert(1);" bad=""></body>
> >
> > The same attack works with single quoted attributes. While you're right,
> we
> > can't do full proper escaping without knowing the context, I don't think
> > quotes should be permitted in any web context.
> > --
> > Craig Younkins
>



-- 
Craig Younkins
