Re: PDD 4: Internal data types

2001-03-22 Thread Simon Cozens
On Tue, Mar 06, 2001 at 01:21:20PM -0800, Hong Zhang wrote: > The normalization has something to do with encoding. If you compare two > strings with the same encoding, of course you don't have to care about it. Of course you do. Think about it. If I'm comparing "(Greek letter lower case alpha wi

Re: PDD 4: Internal data types

2001-03-22 Thread Hong Zhang
> > The normalization has something to do with encoding. If you compare two > > strings with the same encoding, of course you don't have to care about it. > > Of course you do. Think about it. I said "you don't have to". You can use "==" for codepoint comparison, and something like "Normalizer.co

Re: PDD 4: Internal data types

2001-03-22 Thread Buddha Buck
At 11:14 AM 03-22-2001 -0800, Hong Zhang wrote: >Please not fight on wording. For most encodings I know of, the concept of >normalization does not even exist. What is your definition of normalization? To me, the usual definition of "normalization" is conversion of something into a standard form
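
The disagreement above is between raw codepoint comparison and comparison after conversion to a standard form. A minimal sketch of that distinction, assuming Python's standard `unicodedata` module and Normalization Form C; the function names `codepoint_equal` and `canonical_equal` are hypothetical and do not come from the thread:

```python
import unicodedata


def codepoint_equal(a: str, b: str) -> bool:
    """Compare two strings codepoint by codepoint, with no normalization."""
    return a == b


def canonical_equal(a: str, b: str) -> bool:
    """Compare two strings after converting both to Normalization Form C."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# The same visible character written two ways (illustrative data):
composed = "\u00e9"      # LATIN SMALL LETTER E WITH ACUTE, one codepoint
decomposed = "e\u0301"   # "e" followed by COMBINING ACUTE ACCENT

print(codepoint_equal(composed, decomposed))
print(canonical_equal(composed, decomposed))
```

Both strings here are in the same encoding, yet they only compare equal once they are brought into the same normalization form.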

Idea for safe signal handling by a byte code interpreter

2001-03-22 Thread Karl M. Hegbloom
I've not researched this at all... perhaps it's a "known" way of doing things and there is research writing out there already, etc... I've not even looked at this point. I have about 30 minutes to outline this and bounce it off of you all this morning. 8-) I was reading Lincoln D. Stein's

Re: PDD 4: Internal data types

2001-03-22 Thread Simon Cozens
On Thu, Mar 22, 2001 at 11:14:53AM -0800, Hong Zhang wrote: > Please not fight on wording. For most encodings I know of, the concept of > normalization does not even exist. *boggle*. I don't think we're talking about the same Unicode. > What is your definition of normalization? Well, either ca

Re: Idea for safe signal handling by a byte code interpreter

2001-03-22 Thread Hong Zhang
Here is some of my experience with the HotSpot for Linux port. > I've read, in the glibc info manuals, that the similar situation > exists in C programming -- you don't want to do a lot inside the > signal handler; just set a flag and return, then check that flag from > your main loop, and run a "
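
The "set a flag and return, then check it from the main loop" pattern quoted above can be sketched as follows. This is an illustration only, not the design proposed in the thread: it uses Python's standard `signal` module, and the names `pending`, `handler`, `install`, and `run` are hypothetical.

```python
import signal

# Signal numbers recorded by the handler and drained by the main loop.
pending = []


def handler(signum, frame):
    """Do as little as possible: record the signal number and return."""
    pending.append(int(signum))


def install(signum=signal.SIGINT):
    """Register the handler (must be called from the main thread)."""
    signal.signal(signum, handler)


def run(ops):
    """Toy interpreter loop: run one op, then service any recorded signals."""
    results = []
    for op in ops:
        results.append(op())
        # Safe point between ops: the deferred signal work happens here,
        # never inside the handler itself.
        while pending:
            results.append(("signal", pending.pop(0)))
    return results
```

The handler never runs interpreter-level work itself; everything beyond recording the signal is postponed to a known safe point between operations.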

Unicode handling

2001-03-22 Thread Dan Sugalski
At the moment, I'm not particularly inclined to argue unicode. Short of Larry handing down an edict and invoking Rule #1, the following rules will be in effect: 1) All Unicode data perl does regular expressions against will be in Normalization Form C, except for... 2) Regexes tagged to run aga

Re: Unicode handling

2001-03-22 Thread Hong Zhang
> 6) There will be a glyph boundary/non-glyph boundary pair of regex > characters to match the word/non-word boundary ones we already have. (While > I'd personally like \g and \G, that won't work as \G is already taken) > > I also realize that the decomposition flag on regexes would mean that > s/

Re: Idea for safe signal handling by a byte code interpreter

2001-03-22 Thread John Harper
Hong Zhang writes: |> I've looked, a little, (and months ago at that) at the LibREP (ala |> "sawfish") virtual machine. It's a pretty good indirect threaded VM |> that uses techniques pioneered by Forth engines. It utilizes the GCC |> ability to take the address of a label to build a jump ta

Re: Idea for safe signal handling by a byte code interpreter

2001-03-22 Thread Karl M. Hegbloom
> "Hong" == Hong Zhang <[EMAIL PROTECTED]> writes: >> What if, at the C level, you had a signal handler that sets or >> increments a flag or counter, stuffs a struct with information about >> the signal's context, then pushes (by "push", I mean "(cons v ls)", >> not "(append!

Re: Idea for safe signal handling by a byte code interpreter

2001-03-22 Thread Hong Zhang
> >> What if, at the C level, you had a signal handler that sets or > >> increments a flag or counter, stuffs a struct with information about > >> the signal's context, then pushes (by "push", I mean "(cons v ls)", > >> not "(append! ls v)" 'whatever ;-) that struct on a stack... >

Re: Unicode handling

2001-03-22 Thread Nicholas Clark
On Thu, Mar 22, 2001 at 04:10:28PM -0500, Dan Sugalski wrote: > 1) All Unicode data perl does regular expressions against will be in > Normalization Form C, except for... > 2) Regexes tagged to run against a decomposed form will instead be run > against data in Normalization Form D. (What the ta

Re: Idea for safe signal handling by a byte code interpreter

2001-03-22 Thread Keisuke Nishida
At Thu, 22 Mar 2001 13:37:29 -0800, John Harper wrote: > > |> I've looked, a little, (and months ago at that) at the LibREP (ala > |> "sawfish") virtual machine. It's a pretty good indirect threaded VM > |> that uses techniques pioneered by Forth engines. It utilizes the GCC > |> ability to