> -----Original Message-----
> From: Andrei Zmievski [mailto:[EMAIL PROTECTED] 
> Sent: 22 June 2006 22:46
> To: PHP Internals
> Cc: PHP I18N
> Subject: [PHP-DEV] RFC: Error handling in HTTP input decoding
> 
> I'd like to solicit opinions on how we should treat conversion failures
> during HTTP input decoding. There are two issues at hand: fallback 
> mechanism and application-driven decoding in case of failure. Let's 
> look at the proposal for the latter one first.
> 
> If the decoding of HTTP input fails (and the failure state would be 
> achieved as soon as even one variable fails), PHP should set an error 
> flag somewhere that is accessible to the user, via either a global 
> variable or a function. It should also keep the original request data 
> around (query string, POST body, and cookie data). The application 
> should be able to access this data, since the encoding can be passed in
> the query string [1]. The application can then check this error flag 
> and call a function -- request_decode() perhaps -- to ask PHP to 
> re-decode the request data based on this specific encoding. For 
> example:
> 
>    if (request_decoding_failed()) {
>       request_decode(request_get_raw('ei'));
>    }
> 
> We might be able to tie this in with the input filter, but that means 
> that the input filter will have to be required by PHP. I am open to 
> other suggestions in this area.
> 
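To make the proposed flow concrete, here is a rough userland sketch of the same check-then-re-decode idea. It is only an approximation under stated assumptions: request_decoding_failed(), request_decode() and request_get_raw() are proposed names, not existing functions, so the sketch_* helpers below are hypothetical stand-ins built on the mbstring extension (assumed to be loaded, PHP 7 or later), and UTF-8 is assumed as the default guess.

```php
<?php
// Userland sketch only. The sketch_* helpers are hypothetical stand-ins
// for the proposed request_get_raw() / request_decode() functions.

// Return one parameter from the raw query string, URL-decoded but with
// no charset conversion applied.
function sketch_get_raw(string $rawQuery, string $name): ?string
{
    foreach (explode('&', $rawQuery) as $pair) {
        $parts = explode('=', $pair, 2);
        if (urldecode($parts[0]) === $name) {
            return urldecode($parts[1] ?? '');
        }
    }
    return null;
}

// Decode every value from $encoding into UTF-8. Returns null as soon as
// one value is invalid in $encoding, mirroring the "even one variable
// fails" rule in the proposal.
function sketch_decode(string $rawQuery, string $encoding): ?array
{
    $decoded = [];
    foreach (explode('&', $rawQuery) as $pair) {
        $parts = explode('=', $pair, 2);
        $key   = urldecode($parts[0]);
        $value = urldecode($parts[1] ?? '');
        if (!mb_check_encoding($value, $encoding)) {
            return null;
        }
        $decoded[$key] = mb_convert_encoding($value, 'UTF-8', $encoding);
    }
    return $decoded;
}

// "%E9" is e-acute in ISO-8859-1 and is not valid UTF-8 on its own.
$raw  = 'ei=ISO-8859-1&p=caf%E9';
$vars = sketch_decode($raw, 'UTF-8');   // first attempt: assumed default
$ei   = sketch_get_raw($raw, 'ei');     // encoding hint from the request
if ($vars === null && $ei !== null) {
    $vars = sketch_decode($raw, $ei);   // application-driven re-decode
}
```

A real application would want to check the 'ei' value against a list of encodings it is willing to accept before passing it to mbstring, since the value comes straight from the client.
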
> As for the first issue, PHP attempts to decode the input using the 
> value of the unicode.output_encoding setting, because that is the most
> logical choice if we assume that the clients send the data back in the
> encoding that the page with the form was in. We could implement a 
> fallback mechanism where PHP looks at the Accept-Charset header sent by
> the client[2]. This header is supposed to indicate what character sets

https://bugzilla.mozilla.org/show_bug.cgi?id=18643

This may be of interest: it covers the kludge for determining form charsets,
adopted after sending the charset in the Content-Type header broke too much.

> are acceptable for the response. While this is not the same as 
> specifying the character set of the request, it might be a good enough
> indicator of it. Or we could simply set the error state and let the
> application figure out what charset it wants to use for decoding.
> 
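For the Accept-Charset fallback, the parsing side might look roughly like the sketch below. It assumes the header format from RFC 2616 section 14.2 (comma-separated charsets with optional ";q=" weights); the function name sketch_accept_charsets() is made up for illustration, and the wildcard "*" is simply skipped because it names no concrete charset to try.

```php
<?php
// Hypothetical helper: turn an Accept-Charset header value into a list
// of charset names ordered from most to least preferred.
function sketch_accept_charsets(string $header): array
{
    $weights = [];
    foreach (explode(',', $header) as $item) {
        $params  = array_map('trim', explode(';', $item));
        $charset = strtolower(array_shift($params));
        if ($charset === '' || $charset === '*') {
            continue;               // nothing concrete to try
        }
        $q = 1.0;                   // default weight when no q is given
        foreach ($params as $param) {
            if (stripos($param, 'q=') === 0) {
                $q = (float) substr($param, 2);
            }
        }
        if ($q > 0) {               // q=0 means "not acceptable"
            $weights[$charset] = $q;
        }
    }
    arsort($weights);               // highest weight first
    return array_keys($weights);
}

$candidates = sketch_accept_charsets('ISO-8859-1, utf-8;q=0.7, *;q=0.5');
```

The candidates could then be tried in order until one decodes the input cleanly, with the error state set if none does.
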
> Thanks for your attention.
> 
> -Andrei
> 
> [1] http://search.yahoo.com/search?ei=UTF-8&p=php
> [2] http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html
> 
> -- 
> PHP Internals - PHP Runtime Development Mailing List
> To unsubscribe, visit: http://www.php.net/unsub.php
> 

