Re: [OAUTH-WG] Preliminary OAuth Core draft -29

Julian Reschke Thu, 12 Jul 2012 01:32:13 -0700

On 2012-07-09 17:01, Julian Reschke wrote:

On 2012-07-09 16:48, Mike Jones wrote:

HTML5 is not cited because it's a working draft - not an approved
standard.  In what way is "the definition of the media type in HTML4
is known to be insufficient"?  People have been successfully
implementing form-urlencoding with it for quite some time. :-)  Is
there a specific wording change that you'd suggest that we make that
doesn't involve citing a working draft, rather than an approved standard?


For instance, the HTML4 "definition" doesn't even mention what to do
with non-ASCII characters.

I understand that it's not particularly attractive, but citing HTML4
just because it's a "standard" isn't really helpful for people who
actually follow the link and try to understand what needs to be
implemented.
...

Here's an attempt to describe the encoding in terms of HTML4, plus
additional instruction. This would need to be referenced anyway where
the spec currently refers to the HTML4 media type definition:


-- snip --
Appendix X. Use of the application/x-www-form-urlencoded Media Type

At the time of publication of this specification, the
"application/x-www-form-urlencoded" media type was defined in Section
17.13.4 of [HTML4], but not registered in the IANA media types registry
(<http://www.iana.org/assignments/media-types/index.html>). Furthermore,
the definition is incomplete as it does not consider non-US-ASCII
characters.

To address this shortcoming, when generating payloads using this media
type, names and values MUST be encoded using the "UTF-8" character
encoding scheme ([RFC3629]) first; the resulting octet sequence then
needs to be further encoded using the escaping rules defined in [HTML4].

When parsing data from a payload using this media type, the names and
values resulting from reversing the name/value encoding consequently
need to be treated as octet sequences, to be decoded using the "UTF-8"
character encoding scheme.

Example: A value consisting of the six Unicode code points (1) U+0020
(SPACE), (2) U+0025 (PERCENT SIGN), (3) U+0026 (AMPERSAND), (4) U+002B
(PLUS SIGN), (5) U+00A3 (POUND SIGN), and (6) U+20AC (EURO SIGN) would
be encoded into the octet sequence below (using hexadecimal notation):


  20 25 26 2B C2 A3 E2 82 AC

and then represented in the payload as:

  +%25%26%2B%C2%A3%E2%82%AC

-- snip --
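
For anyone who wants to check the example above by machine, here is a
rough Python sketch of the two steps (UTF-8 first, then the escaping).
It is not part of the proposed text. The helper names encode_component
and decode_component are made up for illustration, and it assumes that
urllib.parse.quote_plus is a close enough stand-in for the HTML4
escaping rules: quote_plus leaves "-", "_", "." and "~" unescaped,
which is slightly looser than HTML4's "non-alphanumeric characters are
replaced by %HH".

```python
from urllib.parse import quote_plus, unquote_plus


def encode_component(text: str) -> str:
    """Hypothetical helper: UTF-8 encode, then percent-escape (space -> '+')."""
    # quote_plus encodes the string to octets with the given charset
    # before escaping, which matches the order described in the appendix.
    return quote_plus(text, encoding="utf-8", errors="strict")


def decode_component(escaped: str) -> str:
    """Hypothetical helper: reverse the escaping, then decode the octets as UTF-8."""
    return unquote_plus(escaped, encoding="utf-8", errors="strict")


# The six code points from the example: SPACE, %, &, +, POUND SIGN, EURO SIGN.
value = "\u0020\u0025\u0026\u002B\u00A3\u20AC"

# Step 1: the UTF-8 octet sequence, shown in hexadecimal notation.
utf8_octets = value.encode("utf-8").hex(" ").upper()

# Step 2: the representation in the payload.
encoded = encode_component(value)

print(utf8_octets)
print(encoded)
print(decode_component(encoded) == value)
```

The first two lines printed should match the octet sequence and the
payload representation given in the example; the third checks that
parsing gets the original value back.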

Best regards, Julian
_______________________________________________
OAuth mailing list
OAuth@ietf.org
https://www.ietf.org/mailman/listinfo/oauth
