On Tue, Feb 16, 2016 at 9:29 AM, Evgeny Grin <[email protected]> wrote:

> There is a conflict between standards.
> HTML states that "+" must be decoded to space in url-encoding.
> https://www.w3.org/TR/html/forms.html#url-encoded-form-data
> RFC 3986 doesn't assume any special treatment of "+".
> https://tools.ietf.org/html/rfc3986


I found a draft that says "... and the plus sign may be used to represent
space characters.":

https://tools.ietf.org/html/draft-hoehrmann-urlencoded-01

But I don't know whether this draft was ever accepted as a standard.

However, I did some tests with PHP and JS that show MHD_http_unescape()
is right:

<?php
  $s = "Silvio Clécio";
  $enc = urlencode($s);
  $rawenc = rawurlencode($s);
  echo "enc: " . $enc . "<br />";
  echo "rawenc: " . $rawenc. "<br />";
  echo "<br />";
  echo "urldecode(enc): " . urldecode($enc) . "<br />";
  echo "urldecode(rawenc): " . urldecode($rawenc) . "<br />";
  echo "<br />";
  echo "rawurldecode(enc): " . rawurldecode($enc). "<br />";
  echo "rawurldecode(rawenc): " . rawurldecode($rawenc);
?>

Result:

enc: Silvio+Cl%C3%A9cio
rawenc: Silvio%20Cl%C3%A9cio

urldecode(enc): Silvio Clécio
urldecode(rawenc): Silvio Clécio

rawurldecode(enc): Silvio+Clécio
rawurldecode(rawenc): Silvio Clécio

---

var s = "Silvio Clécio";
var enc = encodeURI(s);
var encCmp = encodeURIComponent(s);
console.log("enc: ", enc);
console.log("encCmp: ", encCmp);
console.log();
console.log("decodeURI(enc): ", decodeURI(enc));
console.log("decodeURI(encCmp): ", decodeURI(encCmp));
console.log();
console.log("decodeURIComponent(enc): ", decodeURIComponent(enc));
console.log("decodeURIComponent(encCmp): ", decodeURIComponent(encCmp));

Result:

enc:  Silvio%20Cl%C3%A9cio
encCmp:  Silvio%20Cl%C3%A9cio

decodeURI(enc):  Silvio Clécio
decodeURI(encCmp):  Silvio Clécio

decodeURIComponent(enc):  Silvio Clécio
decodeURIComponent(encCmp):  Silvio Clécio

And:

var s = "Silvio+Cl%C3%A9cio";
console.log("decodeURI(s): ", decodeURI(s));
console.log("decodeURIComponent(s): ", decodeURIComponent(s));

Result:

decodeURI(s):  Silvio+Clécio
decodeURIComponent(s):  Silvio+Clécio

So MHD_http_unescape() behaves like rawurldecode() and decodeURI()/
decodeURIComponent(): it leaves "+" untouched. However, I need to replace
the plus characters myself because I don't know which format the end user
will send. I'm going to try MHD_OPTION_UNESCAPE_CALLBACK ...
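
A minimal sketch of the decoding such a callback could do, assuming the
form-style rule that "+" means a space. form_unescape() and hex_value() are
hypothetical helpers, not part of libmicrohttpd; the sketch decodes in place
and returns the new length, the same way MHD_http_unescape() does:

```c
#include <stddef.h>

/* Return the value of a hex digit, or -1 if c is not one. */
static int hex_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/*
 * Hypothetical helper (not a libmicrohttpd function): decode a
 * form-urlencoded string in place. '+' becomes a space and %XX becomes
 * the byte 0xXX; malformed escapes are copied through unchanged.
 * Returns the length of the decoded string.
 */
size_t form_unescape(char *s)
{
    char *r = s; /* read position */
    char *w = s; /* write position, never ahead of r */

    while (*r != '\0') {
        if (*r == '+') {
            *w++ = ' ';
            r++;
        } else if (*r == '%' && hex_value(r[1]) >= 0 && hex_value(r[2]) >= 0) {
            *w++ = (char)(hex_value(r[1]) * 16 + hex_value(r[2]));
            r += 3;
        } else {
            *w++ = *r++;
        }
    }
    *w = '\0';
    return (size_t)(w - s);
}
```

The function registered with MHD_OPTION_UNESCAPE_CALLBACK could then simply
forward its string argument to this helper (the exact callback signature
should be checked in microhttpd.h). Note that a literal plus sign sent as
"%2B" still decodes to "+", so only unescaped plus signs turn into spaces.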

Thanks for the answers, -CG and -EG! :-)

--
Silvio Clécio
