Ok, sorry about the initial confusion. I thought we emitted %20 and you were proposing to emit +. So I was basically arguing in favor of your change except I didn't know it. :)
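To keep us all on the same page, here is a small sketch of what the functions discussed in this thread emit for a value containing a space and a #. The module QueryEncodingSketch and its function names are hypothetical, made up only for illustration; they are not part of Elixir and not a naming proposal. It assumes that percent-encoding everything outside the RFC3986 "unreserved" set is an acceptable reading of the RFC for query components.

```elixir
# Hypothetical helper module, for illustration only; not part of Elixir.
defmodule QueryEncodingSketch do
  # Percent-encodes every byte outside the RFC3986 "unreserved" set,
  # so a space becomes %20 and "#", "&" and "=" are escaped as well.
  def encode_component(value) do
    URI.encode(to_string(value), &URI.char_unreserved?/1)
  end

  # Joins pairs with "&" and "=", which is the common convention
  # rather than something RFC3986 mandates.
  def encode_query(pairs) do
    Enum.map_join(pairs, "&", fn {key, value} ->
      encode_component(key) <> "=" <> encode_component(value)
    end)
  end
end

IO.puts(URI.encode_query(%{"q" => "a b#c"}))                 # q=a+b%23c
IO.puts(URI.encode_www_form("a b#c"))                        # a+b%23c
IO.puts(URI.encode("a b#c"))                                 # a%20b#c
IO.puts(QueryEncodingSketch.encode_query([{"q", "a b#c"}]))  # q=a%20b%23c
```

The third line is why URI.encode cannot be used as-is for query values: it leaves reserved characters such as # untouched.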
Here is my hopefully correct e-reply:

encode_www_form (which is what encode_query uses) is not specified by RFC3986. So at the very least, we need to clarify in the docs that these functions are not part of RFC3986.

RFC3986 does allow + in query strings, but it says nothing about how to encode or decode it. The interpretation of the query string is ultimately up to the underlying application. Even the common key=value mechanism is hinted at but not mandated. For example, someone could use a query string where the meanings of & and = are swapped and that would be fine.

So while encode_query may not violate RFC3986, it definitely does not follow it either. Encoding spaces as %20 would definitely be the better choice.

To make matters worse, we cannot simply replace URI.encode_www_form with URI.encode, because URI.encode leaves # unescaped, and # MUST be escaped in query strings.

I think the best option at the moment is to deprecate encode_query and introduce encode_www_query. We should also introduce a new function that encodes according to the RFC interpretation of query strings, but I am not sure what to call it. Suggestions?

On Fri, Jan 29, 2021 at 4:27 PM José Valim <[email protected]> wrote:

> Gah, I am so sorry. I have been working on the wrong assumption that
> URI.encode_query was escaping space to %20 but it is encoding it to +,
> which was your point all along. Yes, escaping it to + is not in
> accordance to RFC3986.
>
> I will re-read your original e-mail and address it accordingly now. Once
> again, apologies.
>
>
> On Fri, Jan 29, 2021 at 4:13 PM José Valim <[email protected]> wrote:
>
>> > If I read this correctly, than given what you write, the current
>> `URI.encode_query/1` implementation _is_ in violation of RFC3986. Example:
>>
>> You can’t compare the result of URI.encode with URI.encode_query because
>> they are meant to escape different parts of an URI and different parts use
>> different rules. They are both in accordance to the RFC though.
>>
>> The assumption that all of a URL needs to be escaped with URI.encode is
>> incorrect. if we encode a query parameter with URI.encode, that will be
>> the wrong result.
>>
>> Plus, the Wikipedia article explicitly mentions that escaping space as +
>> is a difference to RFC3986, which confirms my assumption that RFC3986 does
>> not mention whitespace *in query params* should be encoded as +:
>>
>> > The encoding of SPACE as '+' and the selection of "as-is" characters
>> distinguishes this encoding from RFC 3986
>> <https://tools.ietf.org/html/rfc3986>.
>>
>
