Perl 6 summary for week ending 2002-09-29

2002-10-01 Thread Piers Cawley

The Perl 6 Summary for the Week Ending 20020929
Okay, this is my last summary before I take a couple of weeks' holiday
away from any form of connectivity. Will I cope? Can my system stand
going cold turkey? Can you live without my summaries?

Luckily, Leon Brocard has been volunteered to step into the breach and
produce summaries for the next couple of weeks.

Oh yes, due to my being a lazy swine and not reading release notes,
combined with a new version of Spamassassin no longer delivering mail by
default (now it silently drops mail on the floor in cases where it had
previously just delivered the mail), I may be missing some messages from
this week. Sorry.

We'll kick off, as usual, with happenings on the internals list:

  Of Variables, Values and Vtables
Dan stopped travelling (for a while at least), and listed the current
short term goals for Parrot. They are:

* Finish up the calling convention changes
* Spec the PMC changes
* Spec the vtable changes
* Get exceptions fully defined and a preliminary implementation

and promised the variable/vtable stuff in the `next day or so', with the
calling convention stuff a little earlier or later. Leo Toetsch offered
some of his thoughts on vtable methods for _keyed opcodes.

http://makeashorterlink.com/?Z31D146F1

http://makeashorterlink.com/?L22D226F1

  IMCC 0.0.9.2
Leopold Toetsch provided a patch which `fixes all currently known
problems [with respect to] IMCC/Perl6'. Andy Dougherty had some problems
with the patch dumping core, possibly because of platform specific
issues, and Steve Fink realised that there was an overlap between this
patch and one he'd been working on. The patch has not yet been applied,
but work continued.

http://makeashorterlink.com/?L23D326F1

Fun with intlists
Leopold Toetsch showed some benchmarks of intlist against PerlArrays;
the difference is stunning. The intlist based test is some ten times
faster than PerlArray, with most of PerlArray's time being spent
allocating memory. Leo suggests using intlist as the PerlArray base
class.

Having got bragging rights for one speed up, Leo sent in a second patch
which gave *another* tenfold performance boost. Sean O'Rourke had a few
questions about performance in typical usage and wondered if we
shouldn't look at borrowing from SGI's STL implementation of a
deque (double-ended queue). Leo was ahead of him there; his second
patch was already using the trick Sean had suggested.

http://makeashorterlink.com/?E34D126F1

http://makeashorterlink.com/?L25D516F1

  Functions in Scheme
Jürgen Bömmels sent a pre-patch which gets Scheme functions
working. It's built on top of an early version of Sean O'Rourke's
scratchpad.pmc, so be careful applying the initial patch. Sean hoped
that it would be easy to reconcile Jürgen's changes to the scratchpad
pmc with the changes he'd made since he sent Jürgen his early code.
Jonathan Sillito asked why the Scheme interpreter maintained its own
environment stack rather than use the "pad_stack". Apparently the
current pad_stack is very closely tied to Sub.pmc, which doesn't quite
offer the semantics needed for Scheme functions. Also, the pad_stack
makes it tricky to implement "set!" and "define" correctly.

Dan chimed in asking everyone to hash out what they needed from
scratchpads and lexical variables; once we have that nailed down it
should be easy to get everything designed and implemented reasonably
quickly, so Jürgen and Sean came up with a list between them.

http://makeashorterlink.com/?O16D116F1 -- The patch

http://makeashorterlink.com/?B27D126F1 -- Its description

  Perl6 on HP-UX 11.00
H Merijn Brand was having trouble getting Perl 6 to work on HP-UX. It
was initially thought that this was a problem with the version of perl
he was using, but was eventually tracked down to a problem with "make
test"; the tests passed when Merijn did "perl6 --test". However the
thread also covered making sure that the Perl6 build process rebuilt the
Grammar if appropriate. There's also a theory that there's a problem
with IMCC generating .pasm files.

Leopold Toetsch put his hand up for causing the problem, and submitted a
patch to fix things. Applied.

http://makeashorterlink.com/?S18D256F1

http://makeashorterlink.com/?O39D216F1

  The status of Leopold Toetsch's patches
Leo wondered what was happening with the pile of patches he'd submitted
this week. At the time he made the post, he had 15 patches outstanding
(or is that `outstanding patches'?) and, as a result, several of them
were applied. Steve Fink voted that Leo should be given commit
access to CVS and Leo was grateful for the vote of confidence.

Leo later sent in yet another patch for intlist, whic

Re: Interfaces

2002-10-01 Thread Michael Lazzaro


On Monday, September 30, 2002, at 05:23  PM, Michael G Schwern wrote:

> OTOH, Java interfaces have a loophole which is considered a design
> mistake.  An interface can declare some parts of the interface optional
> and then implementors can decide if they want to implement it or not.
> The upshot being that if you use a subclass in Java you can't rely on
> the optional parts being there.
>
> This comes down to an OO philosophy issue.  If Perl 6 wants a strict OO
> style, don't put in a loophole.  If they want to leave some room to
> play, put in the ability to turn some of the strictness off.

I guess what bothers me is the loophole issue, sort of... in specific, 
who gets to decide whether a given interface method is optional.  I'm 
hoping that the optional-ness of an interface is itself optional.  And 
that the optional-ness of a non-optional interface is not optional.  :-)

The problems arise anywhere you're referring to interface methods from 
outside the class.  It may be perfectly OK for a subclass of a given 
class to reuse an interface method for a different purpose, etc., so 
long as nothing outside the inheritance chain is using it... but I'd 
very much like a way for an interface to say "don't muck with me", so 
that sensitive interfaces can be guaranteed as forever invariant, if 
you really, really mean it.

My hope is that you could define the "non-overridability" of a given 
interface method in the parent class, to guarantee enforcement among 
subclasses.  Possible pseudocode choices include:

# (1) explicitly not an interface method
method foo (...) is private { ... }

# (2) explicitly an interface method
method foo (...) is interface { ... }

# (3) explicitly a "strict" interface, can't change it in subclasses
method foo (...) is strict interface { ... }

# (4) explicitly "optional", subclasses can muck with it
method foo (...) is optional interface { ... }

# (5) So what's the unattributed case?  An implied interface,
# implied optional interface, or just explicitly not private?
method foo (...) { ... }

I'd say if we had (1), and if (2) enforced "strict", then (5) could 
mean "optional, not private, but not strictly an interface either...", 
and we wouldn't need the icky (3) or (4) at all.  That would go nicely 
with expectations, I think.  (But are there other possibilities besides 
"private" and "interface", e.g. "protected", etc.?)

On Monday, September 30, 2002, at 06:04  PM, David Whipp wrote:
>>> What if a subclass adds extra, optional arguments to a
>>> method, is that ok?
>>
>> ... In theory, yes...
>
> I don't think that the addition of an optional parameter
> violates any substitution principle: users of the base-class
> interface couldn't use the extra params (because they're not in
> the interface); but a user of the derived-class's interface
> can use the extra power (because they are in that interface).
> A derived class is always allowed to add things (thus, you can
> weaken preconditions, strengthen postconditions, add extra
> methods, return a more specific result, ...; but you can't
> strengthen a precondition, nor weaken a postcondition, etc.)

Agreed.  If we had some concept like "strict" vs. "overridable" 
interfaces, should "strict" prevent this, too, or are extra, optional 
parameters always allowed as a special case (under the assumption that 
they can't hurt anything that doesn't know about them?)
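
The substitution argument above can be sketched in runnable form. This is only an illustration in Python (Perl 6 signatures were not settled at the time of this thread); the class names `Shape` and `ScaledShape` and the `scale` parameter are hypothetical.

```python
class Shape:
    def area(self, width, height):
        return width * height


class ScaledShape(Shape):
    # The override adds an extra *optional* parameter. Callers that only
    # know the base-class interface never pass it, so they are unaffected.
    def area(self, width, height, scale=1):
        return super().area(width, height) * scale


def total_area(shapes):
    # Written purely against the base-class interface.
    return sum(shape.area(2, 3) for shape in shapes)


print(total_area([Shape(), ScaledShape()]))   # 12
print(ScaledShape().area(2, 3, scale=2))      # 12
```

Code that knows only `Shape` keeps working with a `ScaledShape`, while code that knows the derived interface can use the extra parameter. That is the sense in which the addition weakens, rather than strengthens, the precondition.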


>> if our interface "a" is returning an object, of a class that flattens 
>> itself
>> differently in different contexts, then do we say the interface can
>> only return object classes derived from that first object class?
>> And do we restrict the possible "flattenings" of the object class 
>> itself,
>> using an interface, so subclasses of the returned obj can't muck with
>> it and unintentionally violate our first interface ("a")?...

My musing is that the behavior of a class in different contexts is 
itself an interface, in the sense of being a contract between a 
class/subclass and it's users, and therefore could be expressible in 
the same syntax as other interfaces, with the same issues of  
(non)strictness.  (And for that matter, the contextual return values of 
any arbitrary function, foo(), can be expressed as an interface to a 
one-off class named "foo", which implies large commonalities of 
syntax/implementation between classes, methods, operators, and ordinary 
functions [1]... hee hee...)

Mike Lazzaro
Cognitivity (http://www.cognitivity.com/)


[1] ...which implies that all Perl6 operators and functions can (must?) 
be implemented as flyweight classes with interfaces that define their 
properties/attributes/arguments/contexts, but that's a Perl6 <--> 
Parrot thing.




exegesis 5 question: matching negative, multi-byte strings

2002-10-01 Thread esp5

I was wondering what the favored syntax in perl6 would be to match negative
multi-byte strings. In perl 5:

$sql = "select * from a where b union select * from c where d";

my $nonunion = "[^u]|u[^n]|un[^i]|uni[^o]|unio[^n]";
my (@subsqls) = ($sql =~ m"((?:$nonunion)*");

guaranteeing that the subsqls have all text up to, but not including the string
"union".
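
For readers who want to try the alternation trick without a Perl install, here is a rough equivalent using Python's `re` module, which treats these character classes the same way Perl 5 does. The helper name `text_before_union` is made up for the sketch.

```python
import re

# Each alternative consumes text that cannot be the start of "union".
NON_UNION = r"[^u]|u[^n]|un[^i]|uni[^o]|unio[^n]"

sql = "select * from a where b union select * from c where d"


def text_before_union(text):
    # The repeated group stops at the first position where "union" begins.
    return re.match(rf"((?:{NON_UNION})*)", text).group(1)


print(repr(text_before_union(sql)))   # 'select * from a where b '
```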

I suppose I could say:

rule nonunion { (.*) :: { fail if ($1 =~ m"union$"); } }

although that seems awfully slow, and I suppose that I could do the same thing
in perl6 as I did in perl5, although that gets ugly if you need to combine
matching strings without "union" in them with, say, parens:

rule parens {   \( [ <-[()]> + : | <parens> ]*  \) }
rule non_union_non_parens   
{
[< -[()u] > | 
u< -[()n] > | 
un   < -[()i] > | 
uni  < -[()o] > | 
unio < -[()n] > 
] 
}

my (@subsqls) = ($sql =~ m" ([ <non_union_non_parens> | <parens> ]*) ");

And finally, I suppose I could write a sql grammar (which, for this application
and most, is definitely overkill). So I guess I'd like something shorter,
something where you could say:

< -["union"] >

or 

< -["union"\(\)] >

or 

< -["union""select"\(\)] >

a generic negative, multi-byte string matching mechanism. Any thoughts? 
Am I missing something already present or otherwise obvious?

Ed



Re: Interfaces

2002-10-01 Thread Michael G Schwern

On Tue, Oct 01, 2002 at 11:51:02AM -0700, Michael Lazzaro wrote:
> >This comes down to an OO philosophy issue.  If Perl 6 wants a strict OO
> >style, don't put in a loophole.  If they want to leave some room to
> >play, put in the ability to turn some of the strictness off.
> 
> I guess what bothers me is the loophole issue, sort of... in specific, 
> who gets to decide whether a given interface method is optional.

If we do it loosely, the subclasser decides if they want to follow the
interface, since most violations of an interface are done because it's being
used in an unforeseen manner.  But they have to explicitly say they're
violating it.

I can't see any good reason why an interface author would want to make their
interface optional.

If we do it strictly, interfaces are not optional.

Perhaps a way to sharpen the focus on this is to expand the discussion of
strictness to include not just method prototypes but Design-By-Contract
features as well (pre and post conditions and invariants).  Should DBC
conditions be overridable?  Since it's not terribly useful to override a
signature only to be stopped by a pre-condition.

Taken as a whole, I'm leaning towards no.  Interfaces and conditions should
be strict.  They can be gotten around using delegation, which should be
built into Perl 6 anyway.


> >I don't think that the addition of an optional parameter
> >violates any substitution principle: users of the base-class
> >interface couldn't use the extra params (because they're not in
> >the interface); but a user of the derived-class's interface
> >can use the extra power (because they are in that interface).
> >A derived class is always allowed to add things (thus, you can
> >weaken preconditions, strengthen postconditions, add extra
> >methods, return a more specific result, ...; but you can't
> >strengthen a precondition, nor weaken a postcondition, etc.)
> 
> Agreed.  If we had some concept like "strict" vs. "overridable" 
> interfaces, should "strict" prevent this, too, or are extra, optional 
> parameters always allowed as a special case (under the assumption that 
> they can't hurt anything that doesn't know about them?)

Unless someone can come up with a practical case of adding parameters which
violates the interface, I'd say there's no problem, strict or no strict.


> My musing is that the behavior of a class in different contexts is 
> itself an interface, in the sense of being a contract between a 
> class/subclass and it's users

Ah HA!  Contract!  Return values can be enforced via a simple DBC post
condition, no need to invent a whole new return value signature.
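
As a rough illustration of enforcing a return value through a postcondition, here is a minimal sketch in Python. Perl 6's DBC syntax was not defined when this was written, so the `ensure` decorator, the `Zoo` class and its `animals` method are all hypothetical names. The sketch also does not model Perl's context-dependent return values.

```python
import functools


def ensure(predicate, message="postcondition failed"):
    """Wrap a function so its return value is checked after every call."""
    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            result = func(*args, **kwargs)
            if not predicate(result):
                raise AssertionError(message)
            return result
        return wrapper
    return decorate


def is_list_of_names(value):
    return isinstance(value, list) and all(isinstance(v, str) for v in value)


class Zoo:
    def __init__(self, animals):
        self._animals = list(animals)

    @ensure(is_list_of_names, "animals() must return a list of names")
    def animals(self):
        return list(self._animals)


print(Zoo(["emu", "yak"]).animals())   # ['emu', 'yak']
```

The check lives next to the method rather than in a separate return-value signature, which is the trade-off being proposed here.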


-- 

Michael G. Schwern   <[EMAIL PROTECTED]>http://www.pobox.com/~schwern/
Perl Quality Assurance  <[EMAIL PROTECTED]> Kwalitee Is Job One
And if you don't know Which To Do
Of all the things in front of you,
Then what you'll have when you are through
Is just a mess without a clue.
Of all the best that can come true
If you know What and Which and Who.



exegesis 5 question: matching negative, multi-byte strings

2002-10-01 Thread Luke Palmer


> [Negative matching]

> a generic negative, multi-byte string matching mechanism. Any thoughts? 
> Am I missing something already present or otherwise obvious?

Maybe I'm misunderstanding the question, but I think you want negative
lookahead:

Perl 5:   /(.*)(?!>union)/
Perl 6:   /(.*) <!before union>/

Luke



Re: Interfaces

2002-10-01 Thread Trey Harris

In a message dated Mon, 30 Sep 2002, Michael G Schwern writes:

> On Mon, Sep 30, 2002 at 06:04:28PM -0700, David Whipp wrote:
> > On a slightly different note, if we have interfaces then I'd really
> > like to follow the Eiffel model: features such as renaming methods
> > in the derived class may seem a bit strange; but they can be useful
> > if you have name-conflicts with multiple inheritance.
>
> I'm not familiar with the Eiffel beyond "it's the DBC language and it's
> French", but wouldn't this simply be covered by aliasing?

No, because this only gives you a second name for the method, it does not
obliterate the meaning of the first.  The example Damian uses in his OO
class (when he discusses Class::Delegation) is a Car which inherits from
Vehicle (which has an action C<drive>, which causes the
car to move) and also inherits from MP3_Player (which has an accessor
C<drive> which sets which media spindle to use).

It's incorrect to use just the C<Vehicle.drive> method, it's incorrect
to use just the C<MP3_Player.drive> method, it's incorrect to call both
(vehicular acceleration would cause the MP3 player to change disks).  You
need both capabilities, but you need them separately.

You want something like

  class Car is Vehicle    renames(drive => accel)
            is MP3_Player renames(drive => mp3_drive);

Either of those renamings is, of course, optional, in which case drive()
refers to the non-renamed one when referring to a Car object.

But later on, if some code does

  Vehicle $mover = getNext(); # returns a Car
  $mover.drive(5);

It should call C<Vehicle.drive> on C<$mover>, that is,
C<$mover.accel()>.

See why aliasing doesn't solve this?
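
To make the dispatch requirement concrete, here is a sketch in Python. Python has no renaming inheritance, so the sketch emulates it with delegation and small adapter "views"; every name in it (`as_vehicle`, `as_mp3_player`, the view classes) is invented for the illustration and is not proposed Perl 6 syntax.

```python
class Vehicle:
    def drive(self, speed):
        return f"moving at {speed}"


class MP3Player:
    def drive(self, spindle):
        return f"using spindle {spindle}"


class Car:
    """Keeps both capabilities, each under its own unambiguous name."""

    def __init__(self):
        self._vehicle = Vehicle()
        self._player = MP3Player()

    def accel(self, speed):            # Vehicle.drive, renamed
        return self._vehicle.drive(speed)

    def mp3_drive(self, spindle):      # MP3Player.drive, renamed
        return self._player.drive(spindle)

    def as_vehicle(self):
        return _VehicleView(self)

    def as_mp3_player(self):
        return _MP3PlayerView(self)


class _VehicleView(Vehicle):
    # Code holding this view sees drive(), which routes to Car.accel().
    def __init__(self, car):
        self._car = car

    def drive(self, speed):
        return self._car.accel(speed)


class _MP3PlayerView(MP3Player):
    # Same name, but routed to Car.mp3_drive() instead.
    def __init__(self, car):
        self._car = car

    def drive(self, spindle):
        return self._car.mp3_drive(spindle)


car = Car()
mover = car.as_vehicle()
print(mover.drive(5))                  # moving at 5
print(car.as_mp3_player().drive(2))    # using spindle 2
```

The point is that which `drive` runs depends on the type the object is being used as, not on a second name added to a single shared method.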

It can get more complicated, too.  Say you want to do this (I don't know
if this will be possible, but it could be):

  class DoublyLinkedList is LinkedList
                         is LinkedList
                            renames($.head => $.tail,
                                    nextNode => prevNode,
                                    Node::$.next => Node::$.prev);

What's going on here?  We're inheriting from C<LinkedList> twice.  The
first time, we just accept it wholesale, including its inner Node class
which contains a $.data and a $.next reference.  The second time, we
rename the $.head reference to $.tail (along with its associated method
head() to tail()), we rename its nextNode() method to prevNode(), and we
rename *its* version of the inner Node class to make $.next into $.prev.
The Node::$.data attribute is still shared.

Redefine insert() and delete() so that it deals with both the next node
and the previous node, and you're done.  If some $variable of type
LinkedList was assigned a DoublyLinkedList, a call to
C<$variable.nextNode> would call the un-redefined nextNode, which is
correct.  If you redefined *both* C<nextNode>s, you'd probably call the
first one in that situation.  But who knows.

Supporting repeated inheritance and multiple inheritance with partial
redefinition opens a huge ball of wax as far as complicated inheritance
rules that I don't know whether Damian or Larry have any interest in fleshing out,
but it could be made to work.  Try *that* with aliasing.

Trey




Re: exegesis 5 question: matching negative, multi-byte strings

2002-10-01 Thread esp5

On Tue, Oct 01, 2002 at 01:24:45PM -0600, Luke Palmer wrote:
> 
> > [Negative matching]
> 
> > a generic negative, multi-byte string matching mechanism. Any thoughts? 
> > Am I missing something already present or otherwise obvious?
> 
> Maybe I'm misunderstanding the question, but I think you want negative
> lookahead:
> 
> Perl 5:   /(.*)(?!>union)/
> Perl 6:   /(.*) <!before union>/
> 
> Luke

no, that doesn't work, because of the way regexes operate. The '.*' captures 
everything, and since the string after everything (ie: the end of the string)
doesn't match 'union', the regex succeeds without backtracking. Try it:

perl -e ' $a = "this has the string union in it"; my ($b) = ($a =~ m"(.*)(?!>union)"); 
print $b;'

prints: 

this has the string union in it

not 'this has the string'.
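
The same behaviour shows up in any backtracking engine with Perl-style lookahead. A small check using Python's `re` module as a stand-in, written with the plain `(?!union)` form:

```python
import re

text = "this has the string union in it"

# Greedy .* runs to the end of the string. The lookahead is then tested
# only at that final position, where nothing follows, so it succeeds
# immediately and no backtracking ever happens.
greedy = re.match(r"(.*)(?!union)", text)

print(greedy.group(1))   # this has the string union in it
```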

Ed




Re: exegesis 5 question: matching negative, multi-byte strings

2002-10-01 Thread Jonathan Scott Duff

On Tue, Oct 01, 2002 at 12:47:24PM -0700, [EMAIL PROTECTED] wrote:
> On Tue, Oct 01, 2002 at 01:24:45PM -0600, Luke Palmer wrote:
> > 
> > > [Negative matching]
> > 
> > > a generic negative, multi-byte string matching mechanism. Any thoughts? 
> > > Am I missing something already present or otherwise obvious?
> > 
> > Maybe I'm misunderstanding the question, but I think you want negative
> > lookahead:
> > 
> > Perl 5:   /(.*)(?!>union)/
> > Perl 6:   /(.*) <!before union>/
> > 
> > Luke
> 
> no, that doesn't work, because of the way regexes operate. The '.*' captures 
> everything, and since the string after everything (ie: the end of the string)
> doesn't match 'union', the regex succeeds without backtracking. Try it:

I think what you want is just a negated assertion:

/+/

Although I don't know what that means exactly.  Does it match 5
characters at a time that aren't "union" or does it match one
character at a time as long as the string "union" isn't matched at
that point?

-Scott
-- 
Jonathan Scott Duff
[EMAIL PROTECTED]



Re: Interfaces

2002-10-01 Thread Michael G Schwern

On Tue, Oct 01, 2002 at 03:43:22PM -0400, Trey Harris wrote:
> You want something like
> 
>   class Car is Vehicle renames(drive => accel)
> is MP3_Player renames(drive => mp3_drive);
> 
> Either of those renamings is, of course, optional, in which case drive()
> refers to the non-renamed one when referring to a Car object.
> 
> But later on, if some code does
> 
>   Vehicle $mover = getNext(); # returns a Car
>   $mover.drive(5);
> 
> It should call C<Vehicle.drive> on C<$mover>, that is,
> C<$mover.accel()>.
> 
> See why aliasing doesn't solve this?

Ahh, because Perl has to know that when $mover is used as a Vehicle it
uses Car.accel but when used as an MP3_Player it calls Car.mp3_drive.
Clever!


-- 

Michael G. Schwern   <[EMAIL PROTECTED]>http://www.pobox.com/~schwern/
Perl Quality Assurance  <[EMAIL PROTECTED]> Kwalitee Is Job One
If I got something to say, I'll say it with lead.
-- Jon Wayne



Re: exegesis 5 question: matching negative, multi-byte strings

2002-10-01 Thread Simon Cozens

[EMAIL PROTECTED] (Jonathan Scott Duff) writes:
> I think what you want is just a negated assertion:
> 
>   /+/
> 
> Although I don't know what that means exactly. 

That matches more than one thing that is not the string "union".
"u" is not the string "union"; "n" is not the string "union"...

I think /(.*)  / may do it.

-- 
There is no distinction between any AI program and some existent game.



Re: Interfaces

2002-10-01 Thread Michael Lazzaro


On Tuesday, October 1, 2002, at 12:33  PM, Michael G Schwern wrote:
> Perhaps a way to sharpen the focus on this is to expand the discussion of
> strictness to include not just method prototypes but Design-By-Contract
> features as well (pre and post conditions and invariants).  Should DBC
> conditions be overridable?  Since it's not terribly useful to override 
> a
> signature only to be stopped by a pre-condition.
>
> Taken as a whole, I'm leaning towards no.  Interfaces and conditions 
> should
> be strict.  They can be gotten around using delegation, which should be
> built into Perl 6 anyway.

I'd think no, too... if someone doesn't want or need interfaces, they 
can just not use them.  Which implies, I assume, that "interface" is 
not the default state of a class method, e.g. we do need something like 
"method foo() is interface { ... }" to declare any given method 
specifically as an interface method, if no one has a problem with that.  
Just to be clear, I'm not thinking we can get away with saying "all 
nonprivate methods are automatically interfaces", for example.

>> My musing is that the behavior of a class in different contexts is
>> itself an interface, in the sense of being a contract between a
>> class/subclass and its users
>
> Ah HA!  Contract!  Return values can be enforced via a simple DBC post
> condition, no need to invent a whole new return value signature.

I think I get it, but can you give some pseudocode? If you want a 
method to return a list of Zoo animals in "list" context, and a Zoo 
object in "Zoo object" context, what would that look like?

(I'm assuming that DBC postconditions on a method would be treated, 
internally, as part of the overall signature/prototype of the method: 
i.e. if you override the method in a subclass, all original 
postconditions would still remain attached to it (though the new method 
might itself add additional postconditions.))

MikeL




Re: exegesis 5 question: matching negative, multi-byte strings

2002-10-01 Thread Mike Lambert

> guaranteeing that the subsqls have all text up to, but not including the string
> "union".
>
> I suppose I could say:
>
>   rule nonunion { (.*) :: { fail if ($1 =~ m"union$"); } }

What's wrong with: ?

rule getstuffbeforeunion { (.*?) union | (.*) }

"a union" => "a "
"b" => "b"

Am I missing something here?
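
The two cases can be checked with a lazy quantifier in a Perl 5 style engine. Here is a sketch using Python's `re` module; the function name `stuff_before_union` is made up for the example.

```python
import re

# First alternative: lazily take text up to the first "union".
# Second alternative: no "union" anywhere, so take everything.
PATTERN = re.compile(r"(.*?)union|(.*)", re.S)


def stuff_before_union(text):
    match = PATTERN.match(text)
    if match.group(1) is not None:
        return match.group(1)
    return match.group(2)


print(repr(stuff_before_union("a union")))   # 'a '
print(repr(stuff_before_union("b")))         # 'b'
```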

Mike Lambert





Re: Interfaces

2002-10-01 Thread Michael Lazzaro


> On Tue, Oct 01, 2002 at 03:43:22PM -0400, Trey Harris wrote:
>> You want something like
>>
>>   class Car is Vehicle renames(drive => accel)
>> is MP3_Player renames(drive => mp3_drive);

I *really* like this, but would the above be better coded as:

class Car is Vehicle renames(drive => accel)
has MP3_Player renames(drive => mp3_drive);

 implying a "container" relationship with automatic delegation?  
Among the other considerations is that if you simply said

class Car is Vehicle has MP3_Player;

the inheritance chain could assume that Car.drive === Vehicle.drive, 
because is-a (inheritance) beats has-a (containment or delegation).  If 
you needed to, you should still be able to call $mycar.MP3_Player.drive 
to DWYM, too.

Along these lines, I'd love the ability to do something like:

class Bird is Animal
has (LeftWing is Wing)  # a "named" Wing
has (RightWing is Wing)
has (LeftLeg is Leg)
has (RightLeg is Leg);

$bird.LeftWing.flap;    # makes sense
$bird.flap;             # but what's this do? left, right, or _both_?
$bird^.Wing.flap        # perhaps too evil?  :-)

MikeL




Re: Interfaces

2002-10-01 Thread Chris Dutton

On Monday, September 30, 2002, at 11:19  PM, Michael G Schwern wrote:

> On Mon, Sep 30, 2002 at 06:04:28PM -0700, David Whipp wrote:
>> On a slightly different note, if we have interfaces then I'd really
>> like to follow the Eiffel model: features such as renaming methods
>> in the derived class may seem a bit strange; but they can be useful
>> if you have name-conflicts with multiple inheritance.
>
> I'm not familiar with the Eiffel beyond "it's the DBC language and it's
> French", but wouldn't this simply be covered by aliasing?

Eiffel can either rename a "feature" (method, attribute), which is pretty 
much the same as aliasing as you might see it in Ruby, or you can 
redefine the method entirely.  Again, you also would see this in Ruby, 
which might be more approachable for those familiar with Perl.

class BAR
inherit FOO
rename output as old_output
end
end

or...

class BAR
inherit FOO
redefine output
end
end




Re: exegesis 5 question: matching negative, multi-byte strings

2002-10-01 Thread Peter Behroozi

On Tue, 2002-10-01 at 15:24, Luke Palmer wrote:
> Maybe I'm misunderstanding the question, but I think you want negative
> lookahead:
> 
> Perl 5:   /(.*)(?!>union)/

You really meant to say

Perl 5:  /((?:(?!union).))*/  
# Match characters that do not begin the word 'union'

Right?
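
A quick way to see what this pattern does, and why the position of the capturing parentheses matters, is to run it in an engine with the same lookahead semantics. The sketch below uses Python's `re` module as a stand-in for Perl 5.

```python
import re

text = "this has the string union in it"

# Capture group around the whole repetition: group 1 is the full prefix.
whole = re.match(r"((?:(?!union).)*)", text)

# Capture group inside the repetition, as written above: the overall match
# is the same, but group 1 keeps only the last character matched.
inner = re.match(r"((?:(?!union).))*", text)

print(repr(whole.group(1)))   # 'this has the string '
print(repr(inner.group(0)))   # 'this has the string '
print(repr(inner.group(1)))   # ' '
```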

Peter Behroozi




Re: Paren madness (was Re: Regex query)

2002-10-01 Thread Thomas A. Boyer

David Whipp wrote:
>   $b = 7, 6, 5
>   @b = 7, 6, 5

I understand that C's *interpretation* of the comma operator will be expunged from 
Perl 6. But unless comma's *precedence* is also changing, neither of those statements 
would build a list with three elements.

It seems to me that
  $b = 7, 6, 5;
is the same as
  ($b = 7), 6, 5;
not
  $b = (7, 6, 5);
because '=' binds tighter than ','. So it will assign 7 to $b, and then effectively 
evaluate the statement
  7, 6, 5;
which might build a list and then discard it. I.e., it is akin to these statements:
  [7, 6, 5];
  3 + 4;
  7;
(and equally feckless).

=thom



Re: Interfaces

2002-10-01 Thread Michael Lazzaro


On Tuesday, October 1, 2002, at 02:49  PM, Michael Lazzaro wrote:
> Which implies, I assume, that "interface" is not the default state of 
> a class method, e.g. we do need something like "method foo() is 
> interface { ... }" to declare any given method

Flippin' hell, never mind.  You're almost certainly talking about a 
style like:

interface Vehicle {
method foo () { ... }
method bar () { ... }
}
- or -
class Vehicle is interface {
...
}

 in which case an "interface" is specified as a type of abstract 
class, not an attribute of a given method... I was thinking of 
something like

class Vehicle {
method foo () is interface { ... }
method bar () is interface { ... }
method zap () is private { ... }
}

 in which a specific base class could define "obligatory" method 
signatures for any eventual subclasses.  Never mind on that one, I've 
been thinking too much about a different problem.

MikeL




Re: exegesis 5 question: matching negative, multi-byte strings

2002-10-01 Thread esp5

On Tue, Oct 01, 2002 at 06:32:07PM -0400, Mike Lambert wrote:
> > guaranteeing that the subsqls have all text up to, but not including the string
> > "union".
> >
> > I suppose I could say:
> >
> > rule nonunion { (.*) :: { fail if ($1 =~ m"union$"); } }
> 
> What's wrong with: ?
> 
> rule getstuffbeforeunion { (.*?) union | (.*) }
> 
> "a union" => "a "
> "b" => "b"
> 
> Am I missing something here?
> 
> Mike Lambert
> 

hmm... well, it works, but it's not very efficient. It basically 
scans the whole string to the end to see if there is a "union" string, and 
then backtracks to take the alternative. And hence, it's not very scalable. 
It also doesn't 'complexify' very well.

Suppose you had a long string of text, and you wanted to 'harden' your regex
against the substring union appearing in double-quoted strings, single-quoted 
strings, etc. etc, without writing a sql parser. I just don't see how to do this
with .*? - I would do something like (taking a page from Mr. Friedl's book) - 

rule regex_matching_sql 
{
[
<-[u()"']>+ : |
<parens> : |
<single_string> : |
<double_string> : |
<non_union>
]*
}

rule parens
{
\(
[
<-["'()]>+  : |
<single_string> : |
<double_string> : |
<parens>
]*
\)
}

rule single_string
{
\' [ <-[\'\\]>+ : | \.\' ]* \'
}

rule double_string
{
\" [ <-[\"\\]>+ : | \.\" ]* \"
}

rule non_union {  [ u < - ['"()n] > | un ... | uni ... | unio ... | u$ ] * }

Of course I could also be missing something, but I just don't see how to do this
with .*?. 

Ed

(ps:
As for:

/(.*)  /

I'm not sure how that works; and whether or not it's very 'complexifiable' 
(as per above) . If it does a match against every single substring (take all 
characters, look for union, if it exists, roll back a character, do 
the same thing, etc. etc. etc.) then this isn't good enough.  The non_union 
rule listed above is about as efficient as it can get; it does no backtracking,
and it keeps the common matches up front so they match first without 
alternation.
)



Re: exegesis 5 question: matching negative, multi-byte strings

2002-10-01 Thread esp5

On Tue, Oct 01, 2002 at 05:24:43PM -0400, Peter Behroozi wrote:
> On Tue, 2002-10-01 at 16:44, [EMAIL PROTECTED] wrote:
> > doesn't work (just tried it out, not sure why it doesn't) but even if it did,
> > it would be awful slow. It would try one character, look at the next for the 
> > string union, come back for the next character, look for the string union,
> > etc. etc. etc.
> > 
> > whereas
> > 
> > ([^u]+|u[^n])
> > 
> > doesn't do any backtracking at all..
> > 
> > Ed
> 
> perl -e ' $a = "this has the string union in it"; 
> my ($b) = ($a =~ m"((?:(?!union).)*)"); print $b;'
> 
> prints the desired result for me at least.  It also should be comparably

whoops. Must have mistyped. Works for me now.

> efficient to the alternation since the match for the string 'union'
> should fail if the first character is not 'u', etc.  The alternation
> also matches a character at a time except in special cases, where I am
> reasonably sure that the extra overhead from alternation compensates for
> multi-character matching.  This method also does no backtracking for the
> provided example; I am not sure what made you think that it did.
> 
> Peter
> 

well, when I said backtracking, I meant it didn't flip between the current 
character and the next. I couldn't check real numbers doing benchmarking 
because the ?! construct core dumps on both perl-5.6.1 and perl-5.8 on large 
strings.

But when benchmarked on small (30 line strings) using:

my $regex1 = qr{(?:(?!union).)*}sx;
my $regex2 = qr{(?:[^u]+|u[^n]|un[^i]|uni[^o]|unio[^n])*}sx;

timethese
(10,
{   
'questionbang' => sub { my ($b) = ($line =~ m"($regex1)"); },
'alternation'   => sub { my ($b) = ($line =~ m"($regex2)"); }
}
);

I get:

Benchmark: timing 10 iterations of alternation, questionbang...
alternation: 11 wallclock secs (10.66 usr +  0.00 sys = 10.66 CPU) @ 9380.86/s 
(n=10)
questionbang: 18 wallclock secs (18.81 usr +  0.00 sys = 18.81 CPU) @ 5316.32/s 
(n=10)

so ?! is a bit slower. It could probably be made faster though.
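
Anyone wanting to repeat the comparison on another engine could use something like the following. It is a sketch in Python with `timeit` standing in for Perl's Benchmark module; the sample text and the iteration count are arbitrary choices, and the timings will differ from the Perl numbers above.

```python
import re
import timeit

LOOKAHEAD = re.compile(r"(?:(?!union).)*", re.S)
ALTERNATION = re.compile(r"(?:[^u]+|u[^n]|un[^i]|uni[^o]|unio[^n])*", re.S)

# Thirty short lines followed by the keyword, roughly like the test above.
line = "select * from a where b\n" * 30 + "union select * from c where d"


def prefix(regex, text):
    return regex.match(text).group(0)


for name, regex in (("questionbang", LOOKAHEAD), ("alternation", ALTERNATION)):
    seconds = timeit.timeit(lambda: prefix(regex, line), number=200)
    print(f"{name}: {seconds:.4f}s for 200 iterations")
```

One caveat when comparing them: the two patterns are not strictly equivalent. On the input "uunion" the alternation consumes "uu" through its `u[^n]` branch and runs straight past the keyword, while the lookahead version stops after the first "u".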

However, I'm still skeptical as to it being a good replacement for the alternation.
Look at my posted message (about making the regex be able to handle nested 
parens, etc) and see if you can come up with an easy way to handle the case I 
mentioned there..

Ed