Paul B. Gallagher wrote:
[email protected] wrote:

Paul B. Gallagher wrote:

So my question is, why do you need the display to be garbled instead of easily readable? Do you really want to try to parse <https://ko.wikipedia.org/wiki/%ED%99%A9%EC%A0%95%EC%9D%8C> manually instead of letting the program do it?

Depends what you want to do with it. If you need to insert that text somewhere which expects a proper URL, you need the percent encoding.

Depends on your definition of "proper URL." Most modern programs (including SeaMonkey) can handle non-ASCII URLs just fine, but most older ones require percent encoding. I still make a habit of enclosing URLs in angle brackets because there's a large installed base of older programs that will misinterpret spaces (and a few other things) as "end of URL." Percent encoding prevents that, of course.

By "proper URL" I mean one which conforms to the specification of a URL (or rather a URI to be strict) - RFC 3986. Many modern applications display percent-encoded sequences in the URL as the corresponding characters, and perform the reverse conversion on text entered by the user - but those representations are not really URLs.

<https://ko.wikipedia.org/wiki/%ED%99%A9%EC%A0%95%EC%9D%8C> is a URL.

<https://ko.wikipedia.org/wiki/황정음> is not a URL - it's what you get by replacing the percent-encoded sequences from the above URL with the characters represented by the corresponding UTF-8 sequences. It may look similar to a URL, and SeaMonkey (and many other applications) may display it, allow you to enter it, and handle converting it into <https://ko.wikipedia.org/wiki/%ED%99%A9%EC%A0%95%EC%9D%8C> where an actual URL is needed, but it's not a URL.
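To make the round trip concrete, here is a minimal sketch in Python using the standard library's urllib.parse. The helper names to_uri_path and to_display_path are made up for illustration, and this is not a claim about how SeaMonkey does the conversion internally; it only shows the mapping between the displayed characters and their percent-encoded UTF-8 bytes.

```python
from urllib.parse import quote, unquote

def to_uri_path(display_path: str) -> str:
    # Hypothetical helper: percent-encode the UTF-8 bytes of every
    # character outside the unreserved set, keeping "/" as the separator.
    return quote(display_path, safe="/", encoding="utf-8")

def to_display_path(uri_path: str) -> str:
    # Hypothetical helper: turn percent-encoded sequences back into
    # characters, interpreting the bytes as UTF-8.
    return unquote(uri_path, encoding="utf-8")

encoded = to_uri_path("/wiki/황정음")
print(encoded)  # the form that belongs in an actual URL
print(to_display_path(encoded) == "/wiki/황정음")  # the round trip is lossless
```

Each Hangul syllable here takes three bytes in UTF-8, which is why three characters become nine percent-encoded triplets.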

One day, none of this will matter because all software everywhere will use the full Unicode character set and human-readable IRIs will be the norm. For now, workarounds abound. ;-\

As far as I can tell, IRIs (RFC 3987) still require spaces to be percent-encoded as %20, but they do at least allow things like <scheme://ko.wikipedia.org/wiki/황정음> without percent-encoding the non-ASCII characters.

Also as far as I can tell, IRIs are not used by current versions of HTTP (not even HTTP/2), which are still defined in terms of URIs. That doesn't prohibit web browsers from displaying things like <https://ko.wikipedia.org/wiki/황정음> as an address and handling the conversions - but for presenting that address to other applications of unknown capabilities (whether that's copy/paste or inserting X's primary selection) I'd expect it to give the actual URI (or part of the URI if only part of the address is copied/selected).
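As a rough sketch of what "give the actual URI" could look like for a whole address, again in Python with urllib.parse: the function name address_to_uri and the choice of safe characters are my own assumptions, loosely following the reserved characters in RFC 3986. It leaves the host untouched (non-ASCII host names would need IDNA handling, which is not covered here) and keeps "%" as-is so that sequences which are already percent-encoded are not encoded a second time.

```python
from urllib.parse import quote, urlsplit, urlunsplit

# Assumed safe sets: "%" is kept so existing escapes survive unchanged,
# along with delimiters that RFC 3986 permits in each component.
_PATH_SAFE = "/%:@!$&'()*+,;="
_QUERY_SAFE = _PATH_SAFE + "?"

def address_to_uri(address: str) -> str:
    # Hypothetical helper: percent-encode spaces and non-ASCII characters
    # in the path, query and fragment of a displayed address.
    parts = urlsplit(address)
    return urlunsplit((
        parts.scheme,
        parts.netloc,  # left untouched; IDNA conversion is out of scope
        quote(parts.path, safe=_PATH_SAFE),
        quote(parts.query, safe=_QUERY_SAFE),
        quote(parts.fragment, safe=_QUERY_SAFE),
    ))

print(address_to_uri("https://ko.wikipedia.org/wiki/황정음"))
print(address_to_uri("https://example.org/a b?q=x y"))
```

One caveat with keeping "%" safe: a stray "%" that is not part of a valid escape passes through unchanged, so this is only suitable for addresses that were well-formed to begin with.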

--
Mark.

_______________________________________________
support-seamonkey mailing list
[email protected]
https://lists.mozilla.org/listinfo/support-seamonkey
