On Tue, Mar 16, 2004 at 10:17:57PM +0100, Karl Brodowsky wrote:
: With FFFE and FEFF this seems obvious. In case of #! it would not be clear
: to me if this defaults to ISO-8859-1 (latin-1) or to utf-8. See HTML
: vs. XHTML as an example where the default has been changed.
Perl 6 would certainly
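
For reference, the byte-order-mark case that Karl calls obvious can be sketched roughly as below. This is only an illustration in Python, not anything proposed for Perl 6 in this thread: `sniff_encoding` is a hypothetical helper, and the UTF-8 and UTF-32 marks are additions beyond the FFFE/FEFF pair mentioned above. It deliberately returns nothing for a `#!` line, since which default applies there is exactly the open question.

```python
import codecs
from typing import Optional

# Hypothetical helper for illustration only.
# Longer marks are listed first: the UTF-32 LE mark (FF FE 00 00)
# begins with the UTF-16 LE mark (FF FE), so order matters.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def sniff_encoding(head: bytes) -> Optional[str]:
    """Return an encoding name if the data starts with a known BOM, else None."""
    for bom, name in _BOMS:
        if head.startswith(bom):
            return name
    # No BOM (for example a plain "#!" line): the caller must pick a default.
    return None
```

A file that starts with `#!` and no mark falls through to `None`, so the latin-1 versus utf-8 decision still has to be made somewhere else.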
Dear All,
from what has been written by others, there are enough useful encodings other
than utf-8, utf-16/UCS-2 and UCS-4 that support efficient storage even
for Unicode files whose contents are Greek, Cyrillic, etc. Sorry for the confusion
caused by the fact that I was not aware of these.
utf-
Karl Brodowsky wrote:
Mark J. Reed wrote:
The UTF-8 encoding is not so attractive in locales that make
heavy use of characters which require several bytes to encode therein, or
relatively little use of characters in the ASCII range;
utf-8 is fine for languages like German, Polish, Norwegian, Spanis
On 2004-03-16 at 00:28:32, Karl Brodowsky wrote:
> Mark J. Reed wrote:
>
> >Unicode per se doesn't do anything to file sizes; it's all in how you
> >encode it.
>
> Yes. And basically there are common ways to encode this: utf-8 and utf-16
> (or similar variants requiring >= 2 bytes per character)
Another possibility is to use a UTF-8 extended system where you use values over
0x10 to encode temporary code block swaps in the encoding. I.e.,
some magic value means the one byte UTF-8 codes now mean the Greek block
instead of the ASCII block. But you would need broad agreement for that t
At 11:36 PM + 3/15/04, [EMAIL PROTECTED] wrote:
Another possibility is to use a UTF-8 extended system where you use
values over 0x10 to encode temporary code block swaps in the
encoding. I.e.,
some magic value means the one byte UTF-8 codes now mean the Greek block
instead of the ASCII b
At 12:28 AM +0100 3/16/04, Karl Brodowsky wrote:
> Anyway, it will be necessary to specify the encoding of Unicode in
> some way, which could possibly even allow specifying some
> non-Unicode charsets.
While I'll skip diving deeper into the swamp that is character sets
and encoding (I'm already up
Mark J. Reed wrote:
> Unicode per se doesn't do anything to file sizes; it's all in how you
> encode it.
Yes. And basically there are common ways to encode this: utf-8 and utf-16
(or similar variants requiring >= 2 bytes per character)
> The UTF-8 encoding is not so attractive in locales that make
> heav
On 2004-03-13 at 09:02:50, Karl Brodowsky wrote:
> For these guys Unicode is not so attractive, because it kind of doubles the
> size of their files,
Unicode per se doesn't do anything to file sizes; it's all in how you
encode it. The UTF-8 encoding is not so attractive in locales that make
heav
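
The point that file size depends on the encoding rather than on Unicode itself can be checked with a small sketch like the one below. It is written in Python purely for illustration; the sample strings and the `encoded_size` and `size_table` helpers are made up for this example, and the single-byte ISO 8859 charsets stand in for the "other useful encodings" mentioned in the thread.

```python
from typing import Dict, Optional

# Made-up sample strings; any text in these scripts would do.
SAMPLES = {
    "German": "Grüße aus München",
    "Greek": "αλφα βητα",
    "Russian": "Привет, мир",
}

ENCODINGS = ["utf-8", "utf-16-le", "iso-8859-1", "iso-8859-7", "iso-8859-5"]


def encoded_size(text: str, encoding: str) -> Optional[int]:
    """Bytes needed for text in the given encoding, or None if not representable."""
    try:
        return len(text.encode(encoding))
    except UnicodeEncodeError:
        return None


def size_table() -> Dict[str, Dict[str, Optional[int]]]:
    """Map each sample language to its byte count under every encoding."""
    return {
        lang: {enc: encoded_size(text, enc) for enc in ENCODINGS}
        for lang, text in SAMPLES.items()
    }


if __name__ == "__main__":
    for lang, sizes in size_table().items():
        print(lang, sizes)
```

For the Russian sample (11 characters, 9 of them Cyrillic) this gives 20 bytes in UTF-8, 22 in UTF-16 and 11 in ISO-8859-5, so for mostly non-ASCII text UTF-8 is not much smaller than UTF-16, while a single-byte charset halves both.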