Re: Latin-1-characters

2004-03-16 Thread mark . a . biggar
Another possibility is to use a UTF-8 extended system where you use values over 0x10 to encode temporary code block swaps in the encoding. I.e., some magic value means the one byte UTF-8 codes now mean the Greek block instead of the ASCII block. But you would need broad agreement for that t

Re: Latin-1-characters

2004-03-16 Thread Mark J. Reed
On 2004-03-16 at 00:28:32, Karl Brodowsky wrote: > Mark J. Reed wrote: > > >Unicode per se doesn't do anything to file sizes; it's all in how you > >encode it. > > Yes. And basically there are common ways to encode this: utf-8 and utf-16 > (or similar variants requiring >= 2 bytes per character)

Re: Latin-1-characters

2004-03-16 Thread James Mastros
Karl Brodowsky wrote: Mark J. Reed wrote: The UTF-8 encoding is not so attractive in locales that make heavy use of characters which require several bytes to encode therein, or relatively little use of characters in the ASCII range; utf-8 is fine for languages like German, Polish, Norwegian, Spanis
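The size trade-off discussed in this thread can be checked directly. The following is a minimal Python sketch, not something from the original messages; the sample strings and the helper name `encoded_sizes` are assumptions chosen purely for illustration. It compares the byte length of the same text in UTF-8, UTF-16 and UTF-32 (all without a byte-order mark).

```python
# Hypothetical sample strings, chosen only for illustration.
SAMPLES = {
    "English": "Hello",
    "German": "Gr\u00fc\u00dfe",      # "Grüße"
    "Greek": "\u03b1\u03b2\u03b3",    # "αβγ"
}


def encoded_sizes(text):
    """Return the byte length of text in three Unicode encodings (no BOM)."""
    return {
        "utf-8": len(text.encode("utf-8")),
        "utf-16": len(text.encode("utf-16-le")),
        "utf-32": len(text.encode("utf-32-le")),
    }


if __name__ == "__main__":
    for name, text in SAMPLES.items():
        print(name, encoded_sizes(text))
```

For mostly-ASCII text UTF-8 is the smallest of the three; for text in the Greek block, UTF-8 and UTF-16 both need two bytes per character, so UTF-8 loses its advantage there without being worse than UTF-16.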

Re: hash subscriptor

2004-03-16 Thread John Williams
On Mon, 15 Mar 2004, Larry Wall wrote: > On Mon, Mar 15, 2004 at 11:56:26AM -0700, John Williams wrote: > : I'm probably a bit behind on current thinking, but did %hash{bareword} > : lose the ability to assume the bareword is a constant string? > > It's thinking hard about doing that. :-) > > : An

Re: Latin-1-characters

2004-03-16 Thread Karl Brodowsky
Dear All, from what has been written by others, there are enough useful encodings other than utf-8, utf-16/UCS-2 and UCS-4 that support efficient storage even for unicode-files whose contents are Greek, Cyrillic, etc. Sorry for the confusion caused by the fact that I was not aware of these. utf-

Re: Mutating methods

2004-03-16 Thread Larry Wall
On Tue, Mar 16, 2004 at 08:40:50PM +0200, arcadi shehter wrote: : How about <- which is not overloaded by boolean connotations : and is sort of ? turned by 90 degrees . Don't think so. It's too ambiguous with current meanings. : $topic<- (.a + .b + .c) That asks if $topic is numerically l

Re: Latin-1-characters

2004-03-16 Thread Larry Wall
On Tue, Mar 16, 2004 at 10:17:57PM +0100, Karl Brodowsky wrote: : With FFFE and FEFF this seems obvious. In case of #! it would not be clear : to me if this defaults to ISO-8859-1 (latin-1) or to utf-8. See HTML : vs. XHTML as an example where the default has been changed. Perl 6 would certainly
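The FFFE/FEFF case mentioned above refers to detecting an encoding from a leading byte-order mark. As a rough illustration only (a Python sketch, not anything Perl 6 itself specifies), the function below checks for the BOM constants from the standard `codecs` module. The function name `sniff_encoding` and its `default` parameter are hypothetical; what the fallback should be for a file starting with `#!` is exactly the open question in this message, so it is left to the caller.

```python
import codecs

# UTF-32 BOMs are tested first because the UTF-32-LE BOM begins with
# the same two bytes as the UTF-16-LE BOM.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def sniff_encoding(data, default="utf-8"):
    """Guess an encoding name from a leading BOM in the bytes `data`.

    `default` is an assumption supplied by the caller; it is returned
    when no BOM is present (for example, a file starting with "#!").
    The BOM itself is not stripped from `data`.
    """
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    return default
```

A file with no BOM gives the sniffer nothing to go on, which is why the default has to be a matter of convention rather than detection.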

This week's summary

2004-03-16 Thread The Perl 6 Summarizer
The Perl 6 Summary for the week ending 2004-03-14 Another week, another summary. It's been a pretty active week so, with a cunningly mixed metaphor, we'll dive straight into the hive of activity that is perl6-internals. Benchmarking Discussion and development of Sebastien Riedel'