On Tue, Feb 16, 2016 at 9:29 AM, Evgeny Grin <[email protected]> wrote:
> There is a conflict between standards.
> HTML states that "+" must be decoded to space in url-encoding.
> https://www.w3.org/TR/html/forms.html#url-encoded-form-data
> RFC 3986 doesn't assume any special treatment of "+".
> https://tools.ietf.org/html/rfc3986

I found a draft saying "... and the plus sign may be used to represent
space characters.":

https://tools.ietf.org/html/draft-hoehrmann-urlencoded-01

But I don't know if this draft was accepted as a standard. However, I did
some tests with PHP and JS that show MHD_http_unescape() is right:

    <?php
    $s = "Silvio Clécio";
    $enc = urlencode($s);
    $rawenc = rawurlencode($s);
    echo "enc: " . $enc . "<br />";
    echo "rawenc: " . $rawenc . "<br />";
    echo "<br />";
    echo "urldecode(enc): " . urldecode($enc) . "<br />";
    echo "urldecode(rawenc): " . urldecode($rawenc) . "<br />";
    echo "<br />";
    echo "rawurldecode(enc): " . rawurldecode($enc) . "<br />";
    echo "rawurldecode(rawenc): " . rawurldecode($rawenc);
    ?>

Result:

    enc: Silvio+Cl%C3%A9cio
    rawenc: Silvio%20Cl%C3%A9cio

    urldecode(enc): Silvio Clécio
    urldecode(rawenc): Silvio Clécio

    rawurldecode(enc): Silvio+Clécio
    rawurldecode(rawenc): Silvio Clécio

---

    var s = "Silvio Clécio";
    var enc = encodeURI(s);
    var encCmp = encodeURIComponent(s);
    console.log("enc: ", enc);
    console.log("encCmp: ", encCmp);
    console.log();
    console.log("decodeURI(enc): ", decodeURI(enc));
    console.log("decodeURI(encCmp): ", decodeURI(encCmp));
    console.log();
    console.log("decodeURIComponent(enc): ", decodeURIComponent(enc));
    console.log("decodeURIComponent(encCmp): ", decodeURIComponent(encCmp));

Result:

    enc: Silvio%20Cl%EF%BF%BDcio
    encCmp: Silvio%20Cl%EF%BF%BDcio

    decodeURI(enc): Silvio Clécio
    decodeURI(encCmp): Silvio Clécio

    decodeURIComponent(enc): Silvio Clécio
    decodeURIComponent(encCmp): Silvio Clécio

And:

    var s = "Silvio+Cl%C3%A9cio";
    console.log("decodeURI(s): ", decodeURI(s));
    console.log("decodeURIComponent(s): ", decodeURIComponent(s));

Result:

    decodeURI(s): Silvio+Clécio
    decodeURIComponent(s): Silvio+Clécio

So MHD_http_unescape() behaves like rawurldecode() and
decodeURI()/decodeURIComponent(): it leaves "+" untouched. However, I need
to replace the plus characters myself, because I don't know which format
the end user will send. I am going to try MHD_OPTION_UNESCAPE_CALLBACK ...

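For reference, here is a minimal sketch of the decoding step such a callback could perform. It is only an illustration: form_unescape() and hex_value() are hypothetical names, not part of libmicrohttpd, and the code assumes the input is a NUL-terminated string that may be modified in place. It turns "+" into a space and "%XX" into the corresponding byte, which is the urldecode()-style behaviour from the PHP test above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Value of one hex digit, or -1 if c is not a hex digit. */
static int hex_value(char c)
{
  if (c >= '0' && c <= '9') return c - '0';
  if (c >= 'a' && c <= 'f') return c - 'a' + 10;
  if (c >= 'A' && c <= 'F') return c - 'A' + 10;
  return -1;
}

/* Hypothetical helper (not a libmicrohttpd function).
   Decodes s in place with form-style rules: '+' becomes a space and
   "%XX" becomes the byte 0xXX. Malformed escapes are copied unchanged.
   Returns strlen() of the decoded string; note that a decoded "%00"
   therefore ends the string early. */
size_t form_unescape(char *s)
{
  char *rd = s; /* read position */
  char *wr = s; /* write position, never ahead of rd */

  assert(s != NULL);
  while (*rd != '\0') {
    if (*rd == '+') {
      *wr++ = ' ';
      rd++;
    } else if (*rd == '%') {
      int hi = hex_value(rd[1]);
      /* Only look at rd[2] when rd[1] was a hex digit, so we never
         read past the terminating NUL. */
      int lo = (hi >= 0) ? hex_value(rd[2]) : -1;
      if (lo >= 0) {
        *wr++ = (char)(unsigned char)(hi * 16 + lo);
        rd += 3;
      } else {
        *wr++ = *rd++;
      }
    } else {
      *wr++ = *rd++;
    }
  }
  *wr = '\0';
  return strlen(s);
}
```

As far as I understand the documentation, the function registered with MHD_OPTION_UNESCAPE_CALLBACK receives (void *cls, struct MHD_Connection *c, char *s) and must return the new length of s, so a thin wrapper that ignores the first two arguments and returns form_unescape(s) should be enough. I have not tested that part yet. Keep in mind this also converts a literal "+" in the URL path, so it only makes sense if clients are expected to send form-style encoding.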
Thanks for the answers, -CG and -EG! :-)

-- 
Silvio Clécio
