Wietse Venema wrote:
> M. Sokolewicz:
>> (Wietse Venema) wrote:
>>> laurent jouanneau:
>>>> (Wietse Venema) wrote:
>>>>> To give an idea of the functionality, consider the following program
>>>>> with an obvious HTML injection bug:
>>>>>
>>>>> <?php
>>>>> $username = $_GET['username'];
>>>>> echo "Welcome back, $username\n";
>>>>> ?>
>>>>>
>>>>> With default .ini settings, this program does exactly what the
>>>>> programmer wrote: it echos the contents of the username request
>>>>> attribute, including all the malicious HTML code that an attacker
>>>>> may have supplied along with it.
>>>>>
>>>>> When I change one .ini setting:
>>>>>
>>>>> taint_error_level = E_WARNING
>>>>>
>>>>> the program produces the same output, but it also produces a warning:
>>>>>
>>>>> Warning: echo(): Argument contains data that is not converted
>>>>> with htmlspecialchars() or htmlentities() in /path/to/script
>>>>> on line 3
>>>> A PHP application doesn't always generate HTML : it can generate JSON,
>>>> CSV, PDF etc.. In this case, we don't have to call htmlspecialchars etc..
>>> In that case, I suppose you would not be using echo, so there
>>> is no problem.
>>>
>> You wouldn't? So, when outputting a script-generated pdf file, how would
>> you do that if not using echo? (and thus also not print since that's
>> pretty much the exact same thing)
>
> Never mind.
>
> The code that creates PDF produces data that carries no taint
> labels. The taint labels don't jump into existence spontaneously,
> they have to be added by whoever creates that data.
>
> Thus, because PDF is created without taint labels, the echo operator
> will not object to echoing it.
>
>>>> Is this warning appearing also when you want to output datas other than
>>>> HTML ? If no, how your code guess the output type ? If yes, how can we
>>>> disable this warning in pages which produce JSON etc. ?
>
> Guessing is something that we do in inferior software.
>
> Data carries taint labels that say what can't be done with it. If
> the PDF creator etc. does not label its output, then there are no
> restrictions.
This doesn't make much sense to me. Consider very common (abbreviated)
code like this:

  $user_data = $_REQUEST['data'];
  switch($output_format) {
    case 'html':
      echo "<html>$user_data</html>";
      break;
    case 'xml':
      header('Content-type: text/xml');
      echo "<xml>$user_data</xml>";
      break;
    case 'json':
      header('Content-type: application/json');
      echo json_encode(array($user_data));
      break;
  }

$user_data is going to be tainted, but the untainting rules are very
different for those 3 cases and popping up an error that talks about
html escaping only makes sense in the html case. That's part of what I
was talking about months ago when I talked about the problem with
context-less tainting.

-Rasmus

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
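To make the point about context concrete, here is a minimal sketch of what "different untainting rules" could look like for the three cases in the switch above. It is not part of the taint proposal and not code from this thread: the function name render_user_data() is hypothetical, it returns strings instead of calling header() and echo so it can be run standalone, and it assumes a PHP version with scalar type declarations and the ENT_XML1 flag.

```php
<?php
// Hypothetical helper (an assumption for illustration, not from the thread):
// escape one untrusted value according to the context it will be embedded in.
function render_user_data(string $userData, string $outputFormat): string
{
    switch ($outputFormat) {
        case 'html':
            // HTML context: neutralize markup characters and both quote styles.
            return '<html>' . htmlspecialchars($userData, ENT_QUOTES, 'UTF-8') . '</html>';
        case 'xml':
            // XML context: ENT_XML1 selects the XML 1.0 entity table.
            return '<xml>' . htmlspecialchars($userData, ENT_QUOTES | ENT_XML1, 'UTF-8') . '</xml>';
        case 'json':
            // JSON context: json_encode() does the quoting itself;
            // HTML escaping plays no role here.
            return json_encode(array($userData));
        default:
            throw new InvalidArgumentException("Unknown output format: $outputFormat");
    }
}

// The same untrusted input needs a different treatment in each context.
$input = '<b>"x"</b>';
echo render_user_data($input, 'html'), "\n";
echo render_user_data($input, 'xml'), "\n";
echo render_user_data($input, 'json'), "\n";
```

In this sketch only the 'html' branch corresponds to the htmlspecialchars() advice in the warning text; for the 'json' branch that advice would not apply, which is the mismatch described above.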