On Fri, Aug 05, 2022 at 06:29:45PM +0100, Gavin Smith wrote:
> > 
> > To me the question is not the locales of the browser, but the encoding
> > of the HTML file.  If the encoding is ISO latin 1 as in:
> > <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
> > 
> > Then it seems to me that the URI::Escape call should be on an ISO latin 1
> > encoded string.  But I am not sure.
> 
> I don't think so.  Such encoded strings are not recommended by anybody.
> I think it's simpler to use the usual URL encoding of either straight
> ASCII or percent encoded UTF-8.
> 
> I don't see why the encoding of the HTML file itself should make a difference.
> I tested it and didn't find the HTML encoding declaration made a difference
> to percent encoded links (on Chromium 97).  I've attached the example files.
> 
> Regardless of the declaration, the encoded bytes were used for the filename.

Same with Firefox.  So, UTF-8 everywhere, which is actually simpler.
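
To make the "percent encoded UTF-8" option concrete, here is a minimal
sketch in Python rather than Perl's URI::Escape.  The file name is the one
from the test files; the variable names are only for illustration.  It shows
the link text produced from the UTF-8 bytes next to what a Latin-1 based
encoding would give, and the bytes recovered from the UTF-8 link:

```python
from urllib.parse import quote, unquote_to_bytes

# "\u00e4" is a-umlaut, the character used in the test file name.
name = "\u00e4.html"

# Percent-encode the UTF-8 bytes of the name (quote() defaults to UTF-8).
utf8_link = quote(name)

# For comparison only: percent-encode the Latin-1 bytes instead.
latin1_link = quote(name, encoding="iso-8859-1")

print(utf8_link)    # %C3%A4.html
print(latin1_link)  # %E4.html

# Decoding the percent escapes yields plain bytes; no charset is involved.
print(unquote_to_bytes(utf8_link))
```

Since the escapes decode to bytes directly, this is consistent with the
observation that the declared charset of the page did not matter.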

> There was one difference, which shows that percent encoding links is a good
> idea.  In test-latin1.html (attached), the uncoded link does not work - in
> the file it is "ä.html", but Chromium looks for a file named "Ã¤.html" on
> the filesystem, presumably due to decoding it from Latin-1 and then reencoding
> to UTF-8.

That is because your HTML file is actually encoded in UTF-8 while it is
declared as latin1.  Once the HTML file is converted to latin1, the
unencoded link opens the UTF-8 encoded file correctly.
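
The mismatch can be reproduced outside the browser.  This is a rough sketch
in Python of what presumably happens, not a description of Chromium's actual
code: the UTF-8 bytes of the link are read as Latin-1, and the resulting
characters are encoded to UTF-8 again for the file lookup.

```python
from urllib.parse import quote

# "\u00e4" is a-umlaut, the character used in the test file name.
name = "\u00e4.html"

# Bytes actually stored in the HTML file, which is really UTF-8.
raw = name.encode("utf-8")

# The page declares iso-8859-1, so each byte is read as one character.
misread = raw.decode("iso-8859-1")

# The misread name is re-encoded as UTF-8 for the file lookup; shown
# percent-encoded here to make the bytes visible.
requested = quote(misread)

print(raw)        # b'\xc3\xa4.html'
print(requested)  # %C3%83%C2%A4.html
```

The two bytes of the original character turn into four, so the name that is
looked up no longer matches the file on disk.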

-- 
Pat
