Jarkko Hietaniemi:
About the implementation of character classes: since the Unicode code
point range is big, a single big bitmap won't work any more: firstly,
it would be big. Secondly, for most cases, it would be wastefully
sparse. A balanced binary tree of (begin, end) points of ranges is
sug
Reading this in Apoc 4
sub mywhile ($keyword, &condition, &block) {
my $l = $keyword.label;
while (&condition()) {
&block();
CATCH {
my $t = $!.tag;
when X::Control::next { die if $t && $t ne $l; next }
w
Thanks, Jarkko.
On Thursday 17 January 2002 23:21, Jarkko Hietaniemi wrote:
> The most important message is that give up on 8-bit bytes, already.
> Time to move on, chop chop.
Do you think/feel/wish/demand that the textual (string) APIs should differ
from the binary (byte) APIs? (Both from an
On Fri, Jan 18, 2002 at 04:51:07AM -0500, Bryan C. Warnock wrote:
> Thanks, Jarkko.
>
> On Thursday 17 January 2002 23:21, Jarkko Hietaniemi wrote:
> > The most important message is that give up on 8-bit bytes, already.
> > Time to move on, chop chop.
>
> Do you think/feel/wish/demand that the t
> Since I seem to be the main regex hacker for Parrot, I'll respond to
> this as best I can.
>
> Currently, we are using bitmaps for character classes. Well, sort of.
> A Bitmap in Parrot is defined like this:
>
> typedef struct bitmap_t {
> char* bmp;
>
Michael G Schwern <[EMAIL PROTECTED]> writes:
> Reading this in Apoc 4
>
> sub mywhile ($keyword, &condition, &block) {
> my $l = $keyword.label;
> while (&condition()) {
> &block();
> CATCH {
> my $t = $!.tag;
> when X::
Michael G Schwern wrote:
> Reading this in Apoc 4 ...
I looked on http://dev.perl.org/perl6/apocalypse/: no sign of Apoc4. Where
do I find this latest installment?
Dave.
http://www.perl.com/pub/a/2002/01/15/apo4.html
David Whipp wrote:
>
> Michael G Schwern wrote:
>
> > Reading this in Apoc 4 ...
>
> I looked on http://dev.perl.org/perl6/apocalypse/: no sign of Apoc4. Where
> do I find this latest installment?
>
> Dave.
>Michael G Schwern wrote:
>
>> Reading this in Apoc 4 ...
>
>I looked on http://dev.perl.org/perl6/apocalypse/: no sign of Apoc4. Where
>do I find this latest installment?
www.perl.com. dev.perl.org must just not have a link yet.
--
Dan
---
> (1) There are 5.125 bytes in Unicode, not four.
> (2) I think the above would suffer from the same problem as one common
> suggestion, two-level bitmaps (though I think the above would suffer
> less, being of finer granularity): the problem is that a lot of
> space is wasted, since t
> I don't think UTF-32 will save you much. The Unicode case map is variable
> length; combining characters, canonical equivalence, and many other things
> will require variable-length mapping. For example, if I only want to
This is true.
> parse /[0-9]+/, why you want to convert everything to UTF-
On Fri, Jan 18, 2002 at 11:44:00AM -0800, Hong Zhang wrote:
> > (1) There are 5.125 bytes in Unicode, not four.
> > (2) I think the above would suffer from the same problem as one common
> > suggestion, two-level bitmaps (though I think the above would suffer
> > less, being of finer granu
> > preprocessing. Another example, if I want to search for /resume/e,
> > (equivalent matching), the regex engine can normalize the case, fully
> > decompose input string, strip off any combining character, and do 8-bit
>
> Hmmm. The above sounds complicated not quite what I had in mind
> for
> > My proposal is we should use mix method. The Unicode standard class,
> > such as \p{IsLu}, can be handled by a standard splitbin table. Please
> > see Java java.lang.Character or Python unicodedata_db.h. I did
> > measurement on it, to handle all unicode category, simple casing,
> > and decim
Did u passed "Bermuda Triangle" :")
raptor
At 10:16 AM +0200 1/18/02, raptor wrote:
>Did u passed "Bermuda Triangle" :")
It may be a bit before Ex4 is done. Damian's on a cruise ship at the
moment, so even if he's got the time (and I don't think he does) he's
likely lacking connectivity. I expect he'll give us word at some
point what t
On Fri, Jan 18, 2002 at 12:20:53PM -0800, Hong Zhang wrote:
> > > My proposal is we should use mix method. The Unicode standard class,
> > > such as \p{IsLu}, can be handled by a standard splitbin table. Please
> > > see Java java.lang.Character or Python unicodedata_db.h. I did
> > > measurement
On Fri, Jan 18, 2002 at 03:35:59PM -0500, Dan Sugalski wrote:
> At 10:16 AM +0200 1/18/02, raptor wrote:
> >Did u passed "Bermuda Triangle" :")
>
> It may be a bit before Ex4 is done. Damian's on a cruise ship at the
> moment, so even if he's got the time (and I don't think he does) he's
> like
At 4:17 PM -0500 1/18/02, Michael G Schwern wrote:
>On Fri, Jan 18, 2002 at 03:35:59PM -0500, Dan Sugalski wrote:
>> At 10:16 AM +0200 1/18/02, raptor wrote:
>> >Did u passed "Bermuda Triangle" :")
>>
>> It may be a bit before Ex4 is done. Damian's on a cruise ship at the
>> moment, so even if
Apo4, when introducing POST, mentions that there is a
corresponding "PRE" block "for design-by-contract
programmers".
However, I see the POST block being used as a finalize;
and thus allowing (encouraging?) it to have side effects.
I can't help feeling that contract/assertion checking
should not
On Fri, Jan 18, 2002 at 10:08:40PM +0200, Jarkko Hietaniemi wrote:
> ints, or 176 bytes. Searching for membership in an inversion list is
> O(N log N) (binary search). "Encoding the whole range" is a non-issue
> bordering on a joke: two ints, or 8 bytes.
[Clarification from a noncombatant] You m
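For readers following the inversion-list discussion, here is a minimal sketch of the membership test in C. It is an illustration only, not code from the thread or from Parrot: the name invlist_contains and the use of uint32_t for code points are assumptions. The list is taken to be a sorted array of range boundaries in which even indices start a range and odd indices end it (exclusive), so the digits 0-9 are the two ints { 0x30, 0x3A }. A binary search over N boundaries needs O(log N) comparisons.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not a Parrot API): returns 1 if code point cp is
 * in the set described by the inversion list, 0 otherwise.
 * list[0..n) is sorted ascending; even indices open a range, odd indices
 * close it (exclusive). */
static int invlist_contains(const uint32_t *list, size_t n, uint32_t cp)
{
    size_t lo = 0, hi = n;

    assert(list != NULL || n == 0);

    /* Binary search for the number of boundaries that are <= cp. */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (list[mid] <= cp)
            lo = mid + 1;
        else
            hi = mid;
    }

    /* An odd count means cp lies after a range start and before its end. */
    return (int)(lo & 1);
}
```

With the list { 0x30, 0x3A }, invlist_contains(list, 2, 0x35) yields 1 and invlist_contains(list, 2, 0x3A) yields 0, since the closing boundary is exclusive.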
On Fri, Jan 18, 2002 at 01:40:26PM -0800, Steve Fink wrote:
> On Fri, Jan 18, 2002 at 10:08:40PM +0200, Jarkko Hietaniemi wrote:
> > ints, or 176 bytes. Searching for membership in an inversion list is
> > O(N log N) (binary search). "Encoding the whole range" is a non-issue
> > bordering on a jo
From: David Whipp [mailto:[EMAIL PROTECTED]]
>
> Apo4, when introducing POST, mentions that there is a
> corresponding "PRE" block "for design-by-contract
> programmers".
>
> However, I see the POST block being used as a finalize;
> and thus allowing (encouraging?) it to have side effects.
It m
At 3:37 PM + 1/18/02, Piers Cawley wrote:
>Michael G Schwern <[EMAIL PROTECTED]> writes:
>
>Hmm... making up some syntax on the fly. I sort of like the idea of
>being able to do
>
> class File;
> sub foreach ($file, &block) is Control {
> # 'is Control' declares this as a contr
On Fri, Jan 18, 2002 at 01:40:26PM -0800, Steve Fink wrote:
> On Fri, Jan 18, 2002 at 10:08:40PM +0200, Jarkko Hietaniemi wrote:
> > ints, or 176 bytes. Searching for membership in an inversion list is
> > O(N log N) (binary search). "Encoding the whole range" is a non-issue
> > bordering on a jo
On Sat, Jan 19, 2002 at 12:11:06AM +0200, Jarkko Hietaniemi wrote:
> Complement of an inversion list is neat: insert 0 at the beginning
> (and append max+1), unless there already is one, in which case delete
> the 0 (and shift the list and delete the max+1). Again, O(N).
> (One could of course h
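The complement operation quoted above can be sketched in C roughly as follows. This is an illustration under stated assumptions, not code from the thread: the name invlist_complement is hypothetical, CP_LIMIT stands in for "max+1" and is assumed here to be 0x110000, and every range is assumed to carry an explicit end, so the list length is always even. Writing into a separate output buffer sidesteps the in-place shifting the message mentions; the cost is still O(N).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: "max+1" is one past the last Unicode code point. */
#define CP_LIMIT 0x110000u

/* Hypothetical helper (not a Parrot API): writes the complement of
 * in[0..n) to out and returns the new length. out must have room for
 * n + 2 entries and must not overlap in. */
static size_t invlist_complement(const uint32_t *in, size_t n, uint32_t *out)
{
    size_t first = 0, last = n, m = 0;
    int drop_tail;

    assert(n % 2 == 0);   /* every range has an explicit end */

    /* Leading 0: delete it if present, otherwise insert one. */
    if (n > 0 && in[0] == 0)
        first = 1;
    else
        out[m++] = 0;

    /* Trailing max+1: delete it if present, otherwise append one. */
    drop_tail = (last > first && in[last - 1] == CP_LIMIT);
    if (drop_tail)
        last--;

    /* The interior boundaries carry over unchanged. */
    if (last > first) {
        memcpy(out + m, in + first, (last - first) * sizeof *in);
        m += last - first;
    }
    if (!drop_tail)
        out[m++] = CP_LIMIT;

    return m;
}
```

Under these assumptions the empty set complements to { 0, CP_LIMIT }, the "whole range" in two ints, and complementing twice returns the original list.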
On Fri, Jan 18, 2002 at 02:22:49PM -0800, Steve Fink wrote:
> On Sat, Jan 19, 2002 at 12:11:06AM +0200, Jarkko Hietaniemi wrote:
> > Complement of an inversion list is neat: insert 0 at the beginning
> > (and append max+1), unless there already is one, in which case delete
> > the 0 (and shift the
On Sat, Jan 19, 2002 at 12:28:15AM +0200, Jarkko Hietaniemi wrote:
> On Fri, Jan 18, 2002 at 02:22:49PM -0800, Steve Fink wrote:
> > On Sat, Jan 19, 2002 at 12:11:06AM +0200, Jarkko Hietaniemi wrote:
> > > Complement of an inversion list is neat: insert 0 at the beginning
> > > (and append max+1),
> > > We *do* want to have (with some notation)
> > > [[:digit:]\p{FunkyLooking}aeiou except 7], right?
> >
> > Of course. But that is all resolvable in regex compile time.
> > No expression tree needed.
>
> My point was that if inversion lists are insufficient for describing
> all the characte
> [concerns over conflation of post-processing and post-assertions]
Having read A4 thoroughly, twice, this was my only real concern
(which contrasted with an overall sense of "wow, this is so cool").
--me
At 12:51 PM -0500 1/15/02, Andy Dougherty wrote:
>I think the optimal fix here is simply to remove -ansi -pedantic.
>-ansi may well have some uses, but even the gcc man pages say
>"There is no reason to use this option [-pedantic]; it exists only
>to satisfy pedants."
Applied. thanks. (Though I h
At 9:30 AM -0800 1/15/02, Steve Fink wrote:
>This patch adds docs/running.pod, which lists the various executables
>Parrot currently includes, examples of running them, and mentions of
>where they fail to work. It's more of a cry for help than a useful
>reference. :-) I've been having trouble recen
Dan Sugalski <[EMAIL PROTECTED]> writes:
> At 3:37 PM + 1/18/02, Piers Cawley wrote:
>>Michael G Schwern <[EMAIL PROTECTED]> writes:
>>
>>Hmm... making up some syntax on the fly. I sort of like the idea of
>>being able to do
>>
>> class File;
>> sub foreach ($file, &block) is Control {
A thought occurred to me a few days ago:
If I remember correctly, attempts to benchmark parrot's developing regular
expressions against perl's regular expressions are proving "disappointing".
However, perl5 has the advantage of a regular expression optimiser as I
understand it, or at least code t
On Fri, Jan 18, 2002 at 05:24:00PM +0200, Jarkko Hietaniemi wrote:
> > As for character encodings, we're forcing everything to UTF-32 in
> > regular expressions. No exceptions. If you use a string in a regex,
> > it'll be transcoded. I honestly can't think of a better way to
> > guarantee effi
"Me" <[EMAIL PROTECTED]> writes:
>> [concerns over conflation of post-processing and post-assertions]
>
> Having read A4 thoroughly, twice, this was my only real concern
> (which contrasted with an overall sense of "wow, this is so cool").
I think that people have sort of got used to the fact th
Okay boys and girls, what does this print:
my @aaa = qw/1 2 3/;
my @bbb = @aaa;
try {
print "$_\n";
}
for @aaa; @bbb -> my $a; my $b {
print "$a:$b";
}
I'm guessing one of:
1:1
2:2
3:3
or a syntax error, complaining about something near
C<@bbb -> my $a ; my $b {>
In other words, how
On Fri, Jan 18, 2002 at 11:40:17PM +, Nicholas Clark wrote:
> On Fri, Jan 18, 2002 at 05:24:00PM +0200, Jarkko Hietaniemi wrote:
>
> > > As for character encodings, we're forcing everything to UTF-32 in
> > > regular expressions. No exceptions. If you use a string in a regex,
> > > it'll be
That particular example is flawed, because the try expression is turned
into a try statement because the } stands alone on its line.
But if you eliminate a couple newlines between } and for, then your
question makes sense (but the code is not well structured, but hey, maybe
you take out all the n
Me wrote:
> > [concerns over conflation of post-processing and post-assertions]
>
> Having read A4 thoroughly, twice, this was my only real concern
> (which contrasted with an overall sense of "wow, this is so cool").
>
> --me
Yes, very, very cool.
I especially liked how RFC 88 was "accepted wi
Michael G Schwern writes:
: Reading this in Apoc 4
:
: sub mywhile ($keyword, &condition, &block) {
: my $l = $keyword.label;
: while (&condition()) {
: &block();
: CATCH {
: my $t = $!.tag;
: when X::Control::next { die
Piers Cawley writes:
: Hmm... making up some syntax on the fly. I sort of like the idea of
: being able to do
:
: class File;
: sub foreach ($file, &block) is Control {
: # 'is Control' declares this as a control sub, which, amongst
: # other things 'hides' itself from cal
Anyone have any objection to adding a couple of calls to terminate
and/or return null terminated strings from Parrot strings for places
where an API expects a standard C string?
I'm not sure of the preferred way to handle this. It would be nice to
at least try to terminate the current string buff
Larry Wall <[EMAIL PROTECTED]> writes:
> Michael G Schwern writes:
> : Reading this in Apoc 4
> :
> : sub mywhile ($keyword, &condition, &block) {
> : my $l = $keyword.label;
> : while (&condition()) {
> : &block();
> : CATCH {
> : my $
[reformatting response for readability and giving Glenn a stiff talking
to]
Glenn Linderman <[EMAIL PROTECTED]> writes:
> Piers Cawley wrote:
>
>> Okay boys and girls, what does this print:
>>
>> my @aaa = qw/1 2 3/;
>> my @bbb = @aaa;
>>
>> try {
>> print "$_\n";
>> }
>>
>> for @aaa; @bbb ->
Hong Zhang <[EMAIL PROTECTED]> writes:
>> > preprocessing. Another example, if I want to search for /resume/e,
>> > (equivalent matching), the regex engine can normalize the case, fully
>> > decompose input string, strip off any combining character, and do 8-bit
>>
>> Hmmm. The above sounds co