On Sun, Jul 1, 2012 at 7:00 PM, Anthony J. Bentley
<anthonyjbent...@gmail.com> wrote:
> ropers writes:
>> This diff fixes things:
>>
>> --- bsdcan11-mandoc-openbsd.html      2012-06-30 22:18:52.000000000 +0200
>> +++ bsdcan11-mandoc-openbsd.html.newentities  2012-06-30 22:34:58.000000000
>> +0200
>> @@ -13,7 +13,7 @@
>>
>>  <p><a href="http://www.flickr.com/photos/tomkoadam/4778126822/"><img
>>  src="http://farm5.static.flickr.com/4115/4778126822_555b453a1e.jpg"></a></p>
>> -<p>Csiko - Foal. - Photo: Adam Tomko @flickr (CC)</p>
>> +<p>Csik&oacute; - Foal. - Photo: Adam Tomk&oacute; @flickr (CC)</p>
>>
>>  <HR>
>>  <P>Ingo Schwarze: Mandoc in OpenBSD - page 2: INTRO I -
>> @@ -725,7 +725,7 @@
>>  <HR>
>>  <P>Ingo Schwarze: Mandoc in OpenBSD - page 22: RECURRING II -
>>  BSDCan 2011, May 13, Ottawa</P>
>> -<H1>Bogue deja vue:</H1>
>> +<H1>Bogue d&eacute;j&agrave; vue:</H1>
>>  <H2>Collecting regression tests.</H2>
>>  <UL>
>>  <LI>Slow start in 2009:
>>
>> That's it. That's all.
>
> The advantage of using pure ASCII plus HTML escapes in a page is that it
> displays the correct content regardless of declared character encoding.
> The disadvantage is that it means adding escapes *everywhere*. Can you
> imagine writing http://www.openbsd.org/cs/ in anything but native UTF-8?
> At some point we have to pick an encoding and stick with it.
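
The trade-off described above can be sketched in a few lines of Python. This is only an illustration, not something from the thread: the sample line comes from the diff, and the list of encodings is an assumption picked to match the ones mentioned here. It shows that a pure-ASCII page with HTML escapes yields the same text under each of these ASCII-compatible encodings.

```python
import html

# Line from the diff above, written as pure ASCII plus HTML escapes.
escaped = "<p>Csik&oacute; - Foal. - Photo: Adam Tomk&oacute; @flickr (CC)</p>"
raw = escaped.encode("ascii")

# Encodings chosen for illustration; all of them are ASCII-compatible.
encodings = ["utf-8", "iso-8859-1", "iso-8859-2", "cp1250"]

# Decode the same bytes under each encoding, then resolve the escapes.
results = {enc: html.unescape(raw.decode(enc)) for enc in encodings}

for enc, text in results.items():
    print(enc, text)
```

Every encoding gives the same decoded text, which is the advantage; the cost is that each non-ASCII character in the source has to be written as an escape.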

UTF-8 is used because it is the better-supported "standard" and eases
interoperability. I started with ISO-8859-2, but over the years I found
that browser support for our language in that encoding is very poor and
is not improving. There are a lot of fights at the browser level because
Windows-1250 is the "default" here. UTF-8 turned out to be the best
option. It works in Symbian, IE, FF, Opera, Chrome, and Lynx (Lynx does
not render all special characters, but only e.g. punctuation is missing,
so the text is still perfectly readable in our language [we don't have
those characters in SMS anyway]). I do this work on OpenBSD. There is no
Czech keyboard in the base system, even though a patch was sent two
years ago, but a Czech keyboard is available in X via setxkbmap, which
is fine for me. Since the files contain UTF-8, I edit them in gtk+ vim,
as no other tool from base supports UTF-8.

>
>> So again, the complaint was that there was mojibake gibberish in
>> Ingo's presentation, because the character encoding isn't specified
>> but defaults to UTF-8 in modern browsers, while the page is actually
>> iso-8859-1 encoded.
>
> Actually, "modern" browsers do not default to a particular encoding (in
> fact, this violates the HTML standard). Instead, they attempt to autodetect
> the charset. Sometimes this works, and sometimes it doesn't -- I've seen
> UTF-8 pages incorrectly detected as ISO-8859-1, and in particularly bad
> cases, vice versa.
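
Both misdetection directions mentioned above can be reproduced with a short Python sketch. It is an illustration only, using the heading from the diff as sample text; it does not model any browser's actual detection heuristics.

```python
text = "Bogue déjà vue:"

latin1_bytes = text.encode("iso-8859-1")
utf8_bytes = text.encode("utf-8")

# ISO-8859-1 bytes read as UTF-8: the accented bytes are invalid
# sequences, so each one is shown as U+FFFD (replacement character).
latin1_as_utf8 = latin1_bytes.decode("utf-8", errors="replace")

# UTF-8 bytes read as ISO-8859-1: every byte is a valid character,
# so there is no error, only wrong characters (mojibake).
utf8_as_latin1 = utf8_bytes.decode("iso-8859-1")

print(repr(latin1_as_utf8))
print(repr(utf8_as_latin1))
```

Without `errors="replace"`, the first decode raises `UnicodeDecodeError`. The second direction never fails, so an incorrect ISO-8859-1 guess cannot be detected from the bytes alone.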
>
>> There were many objections to a simple addition of <HEAD><META
>> http-equiv="Content-Type" content="text/html; charset=iso-8859-1"
>> /></HEAD> as a fix.
>
> Yes, this is pretty ugly. But the only alternative is using one encoding
> everywhere and setting the appropriate HTTP header instead of an HTML
> meta tag. Actually, that's not a bad idea, but it means using UTF-8 on all
> pages, since that's the only encoding that can handle the different
> translations on the OpenBSD website. It would also require removing or
> altering meta tags on all pages (but considering the alternative is *adding*
> meta tags to all pages...).
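
For the "UTF-8 everywhere" route, converting an existing page and altering its meta tag could look roughly like the Python sketch below. The function name `to_utf8` and the simple `charset=` regex are hypothetical and chosen for illustration; this is not the process used for the OpenBSD website, and a real conversion would need a proper HTML parser.

```python
import re

def to_utf8(page: bytes, source_encoding: str) -> bytes:
    """Transcode a page to UTF-8 and rewrite its declared charset.

    Hypothetical helper: assumes the charset is declared as a plain
    'charset=NAME' token, as in the META tag quoted in this thread.
    """
    text = page.decode(source_encoding)
    # Replace the declared charset name, keeping the 'charset=' prefix.
    text = re.sub(r"(charset=)[A-Za-z0-9_-]+", r"\g<1>utf-8", text,
                  flags=re.IGNORECASE)
    return text.encode("utf-8")

# Sample page built from the META tag and heading quoted in this thread.
page = (
    '<META http-equiv="Content-Type" '
    'content="text/html; charset=iso-8859-1">'
    "<H1>Bogue déjà vue:</H1>"
).encode("iso-8859-1")

converted = to_utf8(page, "iso-8859-1")
print(converted.decode("utf-8"))
```

The source encoding has to be supplied by the caller, because, as noted above, it cannot be reliably guessed from the bytes.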
>
>> But then I thought, what about browsers that don't support UTF-8 yet;
>> this is going to break things for them.
>
> I challenge you to find a single browser in ports that doesn't. IE6
> supports UTF-8 properly. Even Lynx works fine when the user has a UTF-8
> locale. (And ISO-8859-* are also locale-dependent, so this is not any
> worse.)
>
>
> So, in summary, the options are:
>
> Use HTML escapes everywhere. IMO, highly impractical.
>
> Use any encoding you wish, and set a meta tag when appropriate. This is
> basically what we have now. (The front pages of /, /de/, /fr/ all use
> ISO-8859-1; /cs/ uses UTF-8; /lt/ uses ISO-8859-13.)
>
> Use UTF-8 everywhere, and enforce this either with an HTTP header or
> meta tags.
>
> --
> Anthony J. Bentley
