Konstantin Kolinko wrote:
..

2011/5/9 André Warnier <a...@ice-sa.com>:
(like a space encoded as a "+", and a "+"
encoded as %xy),

Andre, one small correction:
It sometimes causes confusion, but encoding of space as '+' works only
in the query part of the URL.
The unambiguous way to encode a space, regardless of its position in the URL, is %20.

Encoding a space as '+' comes from the "URL encoding" scheme
(application/x-www-form-urlencoded) defined by the HTML standard, in the
chapter that describes how HTML forms are submitted.
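
To make the difference concrete, here is a minimal sketch using only Python's standard urllib.parse module. The sample string "a b+c" is made up for illustration; it is not taken from the thread.

```python
from urllib.parse import quote, quote_plus, unquote, unquote_plus

text = "a b+c"  # hypothetical sample value

# Percent-encoding, valid anywhere in a URL: space -> %20, "+" -> %2B
anywhere = quote(text, safe="")

# Form-style encoding (application/x-www-form-urlencoded):
# space -> "+", literal "+" -> %2B
form_style = quote_plus(text)

# Decoding has to match the context: in a path, "+" is a literal plus...
path_decoded = unquote("a+b")
# ...while in form-encoded query data it stands for a space.
query_decoded = unquote_plus("a+b")

print(anywhere, form_style, repr(path_decoded), repr(query_decoded))
```

The point of the sketch is only that the same "+" decodes differently depending on which part of the URL it came from, which is why %20 is the safe choice.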

Agreed, my mistake.
Also, in the query string, an unencoded ";" could be taken as a query parameter separator (an alternative to "&"), no?
But I forget what RFC that is, if any.
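
As an illustration of why an unencoded ";" matters, here is a small sketch of a query-string parser that can optionally treat ";" as a separator. The function name parse_query and the sample query strings are hypothetical, and whether a given server actually splits on ";" depends on that server; this only shows the two possible readings.

```python
import re
from urllib.parse import unquote_plus

def parse_query(query, separators="&;"):
    """Split a query string on any of the given separator characters."""
    pairs = []
    for field in re.split("[" + re.escape(separators) + "]", query):
        if not field:
            continue  # skip empty fields such as in "a=1&&b=2"
        name, _, value = field.partition("=")
        # Form-style decoding: "+" -> space, %XX -> character
        pairs.append((unquote_plus(name), unquote_plus(value)))
    return pairs

# ";" treated as a separator; an encoded ";" (%3B) stays part of the value
both = parse_query("a=1;b=two+words&c=%3B")

# Only "&" treated as a separator: the ";" stays inside the value of "a"
amp_only = parse_query("a=1;b=2", separators="&")

print(both)
print(amp_only)
```

A literal ";" that is meant to be data is therefore safest sent as %3B, so that both readings agree.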

Now one additional comment. You said :
..
> about SEOability and user-friendliness - this especially concerns path
> with international characters in URLs, e.g. http://site/pathąčęė

How those URLs are shown is up to the browser. Many browsers have a
setting for how to display such URLs. E.g. try browsing a non-English
Wikipedia for examples of i18n addresses.
..

I think that the above is a bit confusing.
The "site" (or hostname) part of the URL is subject to a different encoding than the path part (/pathąčęė). The path part must be URL-encoded (percent-encoded), but for the hostname part, what is used is "punycode", see http://en.wikipedia.org/wiki/Punycode.
Just another example of the current mess with character sets and encodings...
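
A short sketch of the two encodings side by side, again with Python's standard library only. The hostname "bücher.example" is a made-up example (the original URL uses the plain-ASCII host "site"); the path is the one quoted above. Python's built-in "idna" codec is assumed here as a convenient way to get the punycode form.

```python
from urllib.parse import quote

host = "bücher.example"   # hypothetical internationalized hostname
path = "/pathąčęė"        # path from the example URL above

# Hostname: each non-ASCII label becomes an "xn--" punycode label
ascii_host = host.encode("idna").decode("ascii")

# Path: characters are encoded as UTF-8 bytes, then percent-encoded
# ("/" is left alone by default)
ascii_path = quote(path)

print("http://" + ascii_host + ascii_path)
```

So one and the same URL ends up using two unrelated escaping schemes, depending on which side of the first "/" a character sits.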

I guess one has to have a first or last name containing so-called "diacritic" characters to really appreciate these issues.

