On 18/09/12 21:06, Pádraic Brady wrote:
> Hi Ángel,
>
> The methods all refer to literal strings, values or digits. We can't
> reasonably escape data while allowing valid markup for the current
> context since that's a contradiction by its very nature. If you needed
> to let user values drive CSS names, Javascript functions or variable
> naming, or HTML markup, you need something completely different. For
> example, HTML markup can be sanitised against a whitelist using
> HTMLPurifier.
>
>> I'm fine with the concept, but I'm not sold on the interface.
>> It should be really clear when each of them should be used.
>>
>> escapeHtml()
>> Ok, this is going to be used to show content inside a html document.
>>
>> escapeHtmlAttr()
>> Use when using unquoted html attributes, otherwise use html escaping.
>> When was the last time I saw an unquoted attribute with user-provided
>> content?
> Hopefully never since that's the ideal ;). However, HTML5 allows
> unquoted attributes which is perfectly valid. We don't make the user's
> choice on this but we could provide the relevant tool for escaping if
> they are completely and irredeemably insane :P.

Someone may be insane enough to try to destroy his planet, but "some
insane soul might want it" is no reason to build such a weapon. :)
As it's a crazy thing to do, we shouldn't provide the means to do it. If
your parameter is not a hardcoded number, just quote it and use the
relevant escapeX function on its content.

>> I think it should be replaced by a quoteHtmlAttr() function which properly
>> escapes the content and adds the quotes for you (or it might skip them
>> if it determines it's not needed in this case).
> The RFC focuses on escaping - not sanitising or reformatting.

As an API client I just want to pass a parameter to the attribute. Doing

    echo '<b style="' . $escaper->escapeHtml("font-weight: normal") . '">';

or

    echo '<b style=' . $escaper->quoteHtmlAttrib("font-weight: normal") . '>';

is equivalent; the only difference is in the function contract. But in the
second case the function removes the ambiguity over whether the attribute
uses double quotes, single quotes or no quotes at all, since it can choose
the one it "prefers".
The goal is to make it easy to write secure code. I think the second way
does it better. If we need to change the name of the RFC, so be it.

>> escapeJs()
>> Escape javascript... but inside <script> tags, I guess? So it's not to
>> be used for dynamically generated javascript. Not so clear.
> Javascript literal strings (as defined by the standard).

Ok. We have the ' or " problem again, though.

>> escapeCss()
>> I'm not even sure in which cases would this be needed. Standalone CSS,
>> inside a <style> tag, as style="" attribute?
> CSS values like a font size or background color. If user data is
> allowed to alter names or any other CSS markup, you would need
> sanitisation rather than escaping.

I was thinking of things like dynamic class names (I have no idea why you
would want them, though :). It may be better named escapeCssValue().

>> escapeUrl()
>> "It is included primarily for consistency". When do I need to use
>> escapeUrl and when escapeHtml? What if it's an url inside a css tag
>> inside a html document?
> Basically any URL inside any attribute.
> It encodes part of a URL - the overall URL would still need to be
> validated separately.

If it encodes *part of a URL*, it's not for *any URL*.
By "any URL inside any attribute", I'd expect a usage like:

    echo '<a href="' . $escaper->escapeHtml( $escaper->escapeUrl( "https://wiki.php.net/rfc/escaper" ) ) . '">See the rfc</a>';

Of course, with the rawurlencode semantics, the resulting
https%3A%2F%2Fwiki.php.net%2Frfc%2Fescaper would be a relative URL :)
(Passing a full URL could be interesting for urlencoding non-ASCII
characters in the URL, although most modern browsers deal fine with the
raw bytes.)

>> It makes things more confusing, so I'd remove it.
> Needs to be included to maintain consistency in having a full set of
> go-to escapers.

It might need renaming, then.

>> It should be clear what you are passing to that function and in which
>> context it expects you to leave the output.
> It might not be obvious but these are very straightforward to link to
> specific contexts. Here's the clearest explanation of where all of
> this fits into templating:
> https://www.owasp.org/index.php/XSS_%28Cross_Site_Scripting%29_Prevention_Cheat_Sheet
>
> I should probably add that as a link to the RFC (Anthony will finally
> get an ESAPI reference out of me ;)).
>
> Paddy

That's a document worth reading by everyone, but I still think the
purpose of each method should be clearer from its name.

Regards

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
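
For readers of the thread, here is a minimal sketch of the two points argued above. It is not the RFC's API: quoteHtmlAttr() is a hypothetical userland helper built on PHP's existing htmlspecialchars(), and it assumes PHP 7 or later and UTF-8 input. The last line only demonstrates what rawurlencode() does to a complete URL.

```php
<?php
// Hypothetical helper (an assumption, not part of the RFC): escapes a value
// for an HTML attribute and always wraps it in double quotes, so the caller
// never has to decide which quoting style is in effect.
function quoteHtmlAttr(string $value): string
{
    // ENT_QUOTES escapes both " and '; ENT_SUBSTITUTE replaces invalid
    // UTF-8 sequences instead of returning an empty string.
    return '"' . htmlspecialchars($value, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8') . '"';
}

// The caller writes no quotes around the attribute value.
echo '<b style=' . quoteHtmlAttr('font-weight: normal') . '>', "\n";

// A value that tries to break out of the attribute stays inside the quotes.
echo '<a title=' . quoteHtmlAttr('x" onmouseover="alert(1)') . '>', "\n";

// rawurlencode() on a full URL also encodes the scheme separator and the
// slashes, so the result is no longer an absolute URL.
echo rawurlencode('https://wiki.php.net/rfc/escaper'), "\n";
```

Always emitting the quotes is the simplest contract. A variant that skips them when the value is provably safe would need a much stricter check, which is the ambiguity the proposal is trying to take away from the caller.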