Re: Idea for safe signal handling by a byte code interpreter

2001-03-23 Thread Neil Jerram
> "Karl" == Karl M Hegbloom <[EMAIL PROTECTED]> writes: Karl> Then, from strategic points within the VM, just as the Karl> emacsen check for QUIT, you'd check for that signal flag or Karl> counter, and run the signal handlers from a bottom half of Karl> some kind. This way,

Re: Schwartzian Transform

2001-03-23 Thread James Mastros
On Thu, Mar 22, 2001 at 11:13:47PM -0500, John Porter wrote: > Brent Dax wrote: > > Someone else showed a very ugly syntax with an anonymous > > hash, and I was out to prove there was a prettier way to do it. > Do we want prettier? Or do we want more useful? > Perl is not exactly known for its pr

Re: Unicode handling

2001-03-23 Thread Jarkko Hietaniemi
On Fri, Mar 23, 2001 at 02:50:05PM -0500, Dan Sugalski wrote: > At 02:27 PM 3/23/2001 -0500, Uri Guttman wrote: > > > "DS" == Dan Sugalski <[EMAIL PROTECTED]> writes: > > > > DS> U doesn't really signal "glyph" to me, but we are sort of limited > > DS> in what we have left. We still need a

Re: Unicode handling

2001-03-23 Thread Dan Sugalski
At 11:26 PM 3/23/2001 +, Dave Mitchell wrote: >Dan Sugalski <[EMAIL PROTECTED]> doodled: > > At 11:09 PM 3/23/2001 +, Simon Cozens wrote: > > >For instance, chr() will produce Unicode codepoints. But you can > pretend that > > >they're ASCII codepoints, it's only the EBCDIC folk that'll g

RE: Unicode handling

2001-03-23 Thread Dan Sugalski
At 11:05 AM 3/23/2001 -0600, Garrett Goebel wrote: >From: Nicholas Clark [mailto:[EMAIL PROTECTED]] > > > > On Thu, Mar 22, 2001 at 04:10:28PM -0500, Dan Sugalski wrote: > > > 1) All Unicode data perl does regular expressions against > > >will be in Normalization Form C, except for... > > > 2

RE: Unicode handling

2001-03-23 Thread Dan Sugalski
At 01:26 PM 3/23/2001 -0500, NeonEdge wrote: >Dan Sugalski wrote: > >If we do, then something as simple as this: > > > > while () { > > $count++ if /bar/; > > print OUT $_; > > } > > > >would potentially result in the output file being rather different from the > >input file. E

Re: Unicode handling

2001-03-23 Thread Dan Sugalski
At 11:52 AM 3/23/2001 -0800, Hong Zhang wrote: > > >I recommend to use 'u' flag, which indicates all operations are performed > > >against unicode grapheme/glyph. By default re is performed on codepoint. > > > > U doesn't really signal "glyph" to me, but we are sort of limited in what > > we have

Re: Unicode handling

2001-03-23 Thread Dan Sugalski
At 10:48 PM 3/23/2001 +, Simon Cozens wrote: >On Thu, Mar 22, 2001 at 04:10:28PM -0500, Dan Sugalski wrote: > > Yes, I realize that point 5 may result in someone getting a meaningless > > Unicode string. Too bad--it is *not* the place of a programming > language to > > enforce validity on dat

Re: Unicode handling

2001-03-23 Thread Bryan C. Warnock
On Friday 23 March 2001 14:48, you wrote > In Unicode, there's theoretically no locale. Theoretically... Well, yes, but Unicode makes no pretenses about encoding the world's languages - just the various symbols used by the world's languages. If you want to orient Perl so that it remains(?) data-

RE: Unicode handling

2001-03-23 Thread NeonEdge
Dan Sugalski wrote: >If we do, then something as simple as this: > > while () { > $count++ if /bar/; > print OUT $_; > } > >would potentially result in the output file being rather different from the >input file. Equivalent, yes, but different. Whether that's bad or not is an >

Re: Unicode handling

2001-03-23 Thread Dan Sugalski
At 02:31 PM 3/23/2001 -0500, Bryan C. Warnock wrote: >On Friday 23 March 2001 14:18, Dan Sugalski wrote: > > At 01:30 PM 3/22/2001 -0800, Hong Zhang wrote: > > > > 6) There will be a glyph boundary/non-glyph boundary pair of regex > > > > characters to match the word/non-word boundary ones we alre

Re: Unicode handling

2001-03-23 Thread Dan Sugalski
At 01:30 PM 3/22/2001 -0800, Hong Zhang wrote: > > 6) There will be a glyph boundary/non-glyph boundary pair of regex > > characters to match the word/non-word boundary ones we already have. >(While > > I'd personally like \g and \G, that won't work as \G is already taken) > > > > I also realize t

RE: Unicode handling

2001-03-23 Thread Garrett Goebel
From: Nicholas Clark [mailto:[EMAIL PROTECTED]] > > On Thu, Mar 22, 2001 at 04:10:28PM -0500, Dan Sugalski wrote: > > 1) All Unicode data perl does regular expressions against > >will be in Normalization Form C, except for... > > 2) Regexes tagged to run against a decomposed form will > >

Re: Unicode handling

2001-03-23 Thread Bryan C. Warnock
On Friday 23 March 2001 14:18, Dan Sugalski wrote: > At 01:30 PM 3/22/2001 -0800, Hong Zhang wrote: > > > 6) There will be a glyph boundary/non-glyph boundary pair of regex > > > characters to match the word/non-word boundary ones we already have. > > > >(While > > > > > I'd personally like \g and

Re: Unicode handling

2001-03-23 Thread Uri Guttman
> "DS" == Dan Sugalski <[EMAIL PROTECTED]> writes: DS> U doesn't really signal "glyph" to me, but we are sort of limited DS> in what we have left. We still need a zero-width assertion for DS> glyph boundary within regexes themselves. how about \C? it doesn't seem to be taken and would

Re: Unicode handling

2001-03-23 Thread Simon Cozens
On Fri, Mar 23, 2001 at 03:15:41PM -0800, Brad Hughes wrote: > Simon Cozens wrote: > [...] > > I'm just not sure it's fair on Old World hackers. Will there be a way to stop > > Perl upgrading stuff to Unicode on the way in? > > and I'm probably not the only Old World hacker that would > prefe

Re: Unicode handling

2001-03-23 Thread Damien Neil
On Fri, Mar 23, 2001 at 06:16:58PM -0500, Dan Sugalski wrote: > At 11:09 PM 3/23/2001 +, Simon Cozens wrote: > >For instance, chr() will produce Unicode codepoints. But you can pretend that > >they're ASCII codepoints, it's only the EBCDIC folk that'll get hurt. I hope > >and suspect there'll

Re: Unicode handling

2001-03-23 Thread Dan Sugalski
At 10:56 AM 3/23/2001 -0800, Damien Neil wrote: >On Fri, Mar 23, 2001 at 12:38:04PM -0500, Dan Sugalski wrote: > >while () { > > $count++ if /bar/; > > print OUT $_; > >} > >I would find it surprising for this to have different output >than input. Other people's mileage m

Re: Unicode handling

2001-03-23 Thread Damien Neil
On Fri, Mar 23, 2001 at 06:31:13PM -0500, Dan Sugalski wrote: > >Err, perhaps I'm being dumb here - but surely $foo and $bar arent > >typed strings, they're just numbers (or strings which match /^\d+$/) ??? > > D'oh! Too much blood in my caffeine stream. Yeah, I was thinking of ord. > > chr will

Re: Unicode handling

2001-03-23 Thread Hong Zhang
> >I recommend to use 'u' flag, which indicates all operations are performed > >against unicode grapheme/glyph. By default re is performed on codepoint. > > U doesn't really signal "glyph" to me, but we are sort of limited in what > we have left. We still need a zero-width assertion for glyph boun

Re: Distributive -> and indirect slices

2001-03-23 Thread Rick Welykochy
Simon Cozens wrote: > > On Mon, Mar 19, 2001 at 08:30:31AM -0800, Peter Scott wrote: > > Seen http://dev.perl.org/rfc/82.pod? > > I hadn't. I'm surprised it didn't give the PDL people screaming fits. > But no, I wouldn't do it like that. It has: > > @b = (1,2,3); > @c = (2,4,6); > @d = @b *

PDD for coding conventions

2001-03-23 Thread Dave Mitchell
About a month ago I started working on a PDD for how code should be commented; some while later Paolo Molaro <[EMAIL PROTECTED]> submitted a draft PDD ('PDD X') on "Perl API conventions". This gave me to think that, rather than accumulating lots of micro PDDs, we should have a single one entitled

Re: Unicode handling

2001-03-23 Thread Damien Neil
On Fri, Mar 23, 2001 at 12:38:04PM -0500, Dan Sugalski wrote: >while () { > $count++ if /bar/; > print OUT $_; >} I would find it surprising for this to have different output than input. Other people's mileage may vary. In general, however, I think I would prefer to be

Re: Unicode handling

2001-03-23 Thread Dan Sugalski
At 11:09 PM 3/23/2001 +, Simon Cozens wrote: >For instance, chr() will produce Unicode codepoints. But you can pretend that >they're ASCII codepoints, it's only the EBCDIC folk that'll get hurt. I hope >and suspect there'll be an equivalent of "use bytes" which makes chr(256) >either blow up o

Re: Safe signals and perl 6

2001-03-23 Thread Uri Guttman
> "DS" == Dan Sugalski <[EMAIL PROTECTED]> writes: DS> Generally speaking, signals will be treated as generic async events in perl DS> 6, since that's what they are. (The ones that aren't, like SIGBUS, really DS> aren't things that perl code can catch...) They're going to be treated

Re: Unicode handling

2001-03-23 Thread Dave Mitchell
Dan Sugalski <[EMAIL PROTECTED]> doodled: > At 11:09 PM 3/23/2001 +, Simon Cozens wrote: > >For instance, chr() will produce Unicode codepoints. But you can pretend that > >they're ASCII codepoints, it's only the EBCDIC folk that'll get hurt. I hope > >and suspect there'll be an equivalent of

Re: Schwartzian Transform

2001-03-23 Thread Mark Koopman
i have to put my 2 cents in... after reading all the discussion so far about the Schwartz, i feel that map{} sort map{} is perfect in its syntax. if you code and understand Perl (i've seen situations where these aren't always both happening at the time) and knowingly use the building block fun

Re: Unicode handling

2001-03-23 Thread Simon Cozens
On Fri, Mar 23, 2001 at 05:56:19PM -0500, Dan Sugalski wrote: > Nah, they only apply to data that perl's tagged as Unicode, either because > its input stream is marked that way or because the program explicitly > converted the data. Oh, colour me dull. I read 4) Data converted to Unicode (

Re: Unicode handling

2001-03-23 Thread Nicholas Clark
On Fri, Mar 23, 2001 at 03:08:35PM -0500, Dan Sugalski wrote: > I'm half tempted, since this is a Unicode-only feature, to use a non-ASCII > character. > > \SMILEY FACE, perhaps? that makes it kind of hard to edit perl scripts that use this feature on any good old fashioned 8 bit xterm. Let alo

Re: Unicode handling

2001-03-23 Thread Dan Sugalski
At 11:41 PM 3/22/2001 +, Nicholas Clark wrote: >On Thu, Mar 22, 2001 at 04:10:28PM -0500, Dan Sugalski wrote: > > 1) All Unicode data perl does regular expressions against will be in > > Normalization Form C, except for... > > 2) Regexes tagged to run against a decomposed form will instead be

Re: Distributive -> and indirect slices

2001-03-23 Thread Simon Cozens
On Mon, Mar 19, 2001 at 08:30:31AM -0800, Peter Scott wrote: > Seen http://dev.perl.org/rfc/82.pod? I hadn't. I'm surprised it didn't give the PDL people screaming fits. But no, I wouldn't do it like that. It has: @b = (1,2,3); @c = (2,4,6); @d = @b * @c; # Returns (2,8,18) Where I would h

Re: Unicode handling

2001-03-23 Thread Larry Wall
Jarkko Hietaniemi writes: : *cough* \C *is* taken. : : > >also \U has a meaning in double quotish strings. : : "\Uindeed." Bear in mind we are redesigning the language. If there's a botch we can think about fixing it. Though maybe not on -internals... :-) Larry

Re: Unicode handling

2001-03-23 Thread Dan Sugalski
At 02:27 PM 3/23/2001 -0500, Uri Guttman wrote: > > "DS" == Dan Sugalski <[EMAIL PROTECTED]> writes: > > DS> U doesn't really signal "glyph" to me, but we are sort of limited > DS> in what we have left. We still need a zero-width assertion for > DS> glyph boundary within regexes themselv

Safe signals and perl 6

2001-03-23 Thread Dan Sugalski
Generally speaking, signals will be treated as generic async events in perl 6, since that's what they are. (The ones that aren't, like SIGBUS, really aren't things that perl code can catch...) They're going to be treated pretty much like any other event, or so the plan is at least. Uri's worki

Re: Unicode handling

2001-03-23 Thread Dan Sugalski
At 02:06 PM 3/23/2001 -0600, Jarkko Hietaniemi wrote: >On Fri, Mar 23, 2001 at 02:50:05PM -0500, Dan Sugalski wrote: > > At 02:27 PM 3/23/2001 -0500, Uri Guttman wrote: > > > > "DS" == Dan Sugalski <[EMAIL PROTECTED]> writes: > > > > > > DS> U doesn't really signal "glyph" to me, but we are

Re: Unicode handling

2001-03-23 Thread Dan Sugalski
At 08:14 PM 3/23/2001 +, Nicholas Clark wrote: >On Fri, Mar 23, 2001 at 03:08:35PM -0500, Dan Sugalski wrote: > > I'm half tempted, since this is a Unicode-only feature, to use a non-ASCII > > character. > > > > \SMILEY FACE, perhaps? > >that makes it kind of hard to edit perl scripts that use

Re: Unicode handling

2001-03-23 Thread Hong Zhang
> > >We need the character equivalence construct, such as [[=a=]], which > > >matches "a", "A ACUTE". > > > > Yeah, we really need a big list of these. PDD anyone? > > > > But surely this is a locale issue, and not an encoding one? Not every > language recognizes the same character equivalences

Re: Distributive -> and indirect slices

2001-03-23 Thread John Porter
Simon Cozens wrote: > Better is to solve the general problem, and have all > operators overloadable even on non-objects, so the user > can define how this sort of thing works. Even better is to let the user have access to the real objects by which "non-objects", i.e. normal variables, are impleme

Re: Unicode handling

2001-03-23 Thread Simon Cozens
On Thu, Mar 22, 2001 at 04:10:28PM -0500, Dan Sugalski wrote: > Yes, I realize that point 5 may result in someone getting a meaningless > Unicode string. Too bad--it is *not* the place of a programming language to > enforce validity on data. That's the programmer's job. But points 4 and 5 do en

Re: Unicode handling

2001-03-23 Thread Dan Sugalski
At 01:07 PM 3/23/2001 -0800, Larry Wall wrote: >Jarkko Hietaniemi writes: >: *cough* \C *is* taken. >: >: > >also \U has a meaning in double quotish strings. >: >: "\Uindeed." > >Bear in mind we are redesigning the language. If there's a botch we >can think about fixing it. > >Though maybe not on