Wietse Venema wrote:
> M. Sokolewicz:
>> (Wietse Venema) wrote:
>>> laurent jouanneau:
>>>> (Wietse Venema) wrote:
>>>>> To give an idea of the functionality, consider the following program
>>>>> with an obvious HTML injection bug:
>>>>>
>>>>>     <?php
>>>>>     $username = $_GET['username'];
>>>>>     echo "Welcome back, $username\n";
>>>>>     ?>
>>>>>
>>>>> With default .ini settings, this program does exactly what the
>>>>> programmer wrote: it echoes the contents of the username request
>>>>> attribute, including all the malicious HTML code that an attacker
>>>>> may have supplied along with it.
>>>>>
>>>>> When I change one .ini setting:
>>>>>
>>>>>     taint_error_level = E_WARNING
>>>>>
>>>>> the program produces the same output, but it also produces a warning:
>>>>>
>>>>>     Warning: echo(): Argument contains data that is not converted
>>>>>     with htmlspecialchars() or htmlentities() in /path/to/script
>>>>>     on line 3
>>>> A PHP application doesn't always generate HTML: it can generate JSON,
>>>> CSV, PDF, etc. In those cases, we don't have to call htmlspecialchars() etc.
>>> In that case, I suppose you would not be using echo, so there
>>> is no problem. 
>>>
>> You wouldn't? So, when outputting a script-generated pdf file, how would 
>> you do that if not using echo? (and thus also not print since that's 
>> pretty much the exact same thing)
> 
> Never mind.
> 
> The code that creates PDF produces data that carries no taint
> labels. The taint labels don't jump into existence spontaneously,
> they have to be added by whoever creates that data.
> 
> Thus, because PDF is created without taint labels, the echo operator
> will not object to echoing it.
> 
>>>> Does this warning also appear when you want to output data other than
>>>> HTML? If not, how does your code guess the output type? If so, how can
>>>> we disable this warning in pages that produce JSON etc.?
> 
> Guessing is something that we do in inferior software.
> 
> Data carries taint labels that say what can't be done with it. If
> the PDF creator etc. does not label its output, then there are no
> restrictions.

This doesn't make much sense to me.

Consider very common (abbreviated) code like this:

    // $output_format is assumed to be set earlier (omitted here).
    $user_data = $_REQUEST['data'];
    switch ($output_format) {
      case 'html':
        echo "<html>$user_data</html>";
        break;
      case 'xml':
        header('Content-type: text/xml');
        echo "<xml>$user_data</xml>";
        break;
      case 'json':
        header('Content-type: application/json');
        echo json_encode(array($user_data));
        break;
    }

$user_data is going to be tainted, but the untainting rules are very
different for those three cases, and an error that talks about HTML
escaping only makes sense in the HTML case. That's part of what I
meant months ago when I brought up the problem with context-less
tainting.
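
As a rough sketch of why one rule can't cover all three cases, each
output context would need its own escaping step. The helper name
escape_for_context() below is made up for illustration; it is not part
of PHP or of the taint patch, and the choice of flags is only one
reasonable assumption:

```php
<?php
// Hypothetical helper (not a PHP built-in, not part of the taint patch):
// applies the escaping rule that matches the output context.
function escape_for_context(string $data, string $format): string
{
    switch ($format) {
        case 'html':
            // HTML context: neutralize markup characters and quotes.
            return htmlspecialchars($data, ENT_QUOTES, 'UTF-8');
        case 'xml':
            // XML context: same function, but with XML 1.0 entity handling.
            return htmlspecialchars($data, ENT_QUOTES | ENT_XML1, 'UTF-8');
        case 'json':
            // JSON context: json_encode() does the quoting itself;
            // HTML-escaping first would corrupt the payload.
            return json_encode(array($data));
        default:
            throw new InvalidArgumentException("Unknown format: $format");
    }
}

// Example input containing characters that matter in HTML and XML.
$user_data = '<b>"Tom" & Jerry</b>';

echo '<html>', escape_for_context($user_data, 'html'), "</html>\n";
echo '<xml>', escape_for_context($user_data, 'xml'), "</xml>\n";
echo escape_for_context($user_data, 'json'), "\n";
```

A warning that only mentions htmlspecialchars() would be right for the
first branch and misleading for the other two.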

-Rasmus

-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
