> Agh, if you go and do that, you must then be sure that rx is capable of
> optimizing /a/i and /[aA]/ in the same way. What I mean is that Perl's
> current regex engine is able to use /abc/i as a "constant" in a string,
> while it cannot do the same for /[Aa][Bb][Cc]/. Why? Because in the
> fi
On Jan 31, Hong Zhang said:
>> But as you say, case folding is expensive. And with this approach you
>> are going to case-fold every string that is matched against an rx
>> that has some part of it that is case-insensitive.
>
>That is correct in general. But regex compiler can be smarter than tha
On Thu, Jan 31, 2002 at 12:50:52PM -0800, Brent Dax wrote:
>
> Let me know if I'm brilliant, on crack, or both with this idea.
I've no idea :-)
Tim.
--- Brent Dax <[EMAIL PROTECTED]> wrote:
> Tim Bunce:
> # On Thu, Jan 31, 2002 at 05:15:49PM +, Graham Barr wrote:
> #
> # Especially as the perl6 rx engine will have to be able to
> # work directly on
> # non-trivial things like streams and generators and suchlike.
>
> I have a suggestion si
Tim Bunce:
# On Thu, Jan 31, 2002 at 05:15:49PM +, Graham Barr wrote:
# >
# > Yes, I was assuming that. However what is to be gained by case
# > folding the input string ?
# >
# > Because parts of an rx can be case-insensitive while other parts
# > are case-sensitive, we will probably need two
> But as you say, case folding is expensive. And with this approach you
> are going to case-fold every string that is matched against an rx
> that has some part of it that is case-insensitive.
That is correct in general. But regex compiler can be smarter than that.
For example, rx should optimize
On Thu, Jan 31, 2002 at 11:18:58AM -0800, Hong Zhang wrote:
> > Because parts of an rx can be case-insensitive while other parts
> > are case-sensitive, we will probably need two sorts of ops anyway
> > (or a way to tell the op to be case-insensitive). And you will
> > only be able to do the case
> Because parts of an rx can be case-insensitive while other parts
> are case-sensitive, we will probably need two sorts of ops anyway
> (or a way to tell the op to be case-insensitive). And you will
> only be able to do the case folding when the whole rx is
> case-insensitive.
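
The point quoted above, that an rx can mix case-sensitive and case-insensitive parts, can be illustrated with a rough Python sketch. This is not Parrot's rx opcode set; `Literal` and `match_at` are hypothetical names invented here, and the comparison ignores case folds that change string length.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Literal:
    """Hypothetical literal-match op carrying its own case flag."""
    text: str
    ignore_case: bool = False


def match_at(ops, subject, pos=0):
    """Return the end position if every op matches in sequence, else None."""
    for op in ops:
        chunk = subject[pos:pos + len(op.text)]
        if len(chunk) != len(op.text):
            return None  # ran off the end of the subject
        if op.ignore_case:
            # Fold only this chunk, not the whole subject string.
            ok = chunk.casefold() == op.text.casefold()
        else:
            ok = chunk == op.text
        if not ok:
            return None
        pos += len(op.text)
    return pos


# Roughly "foo" matched case-insensitively, followed by a case-sensitive "Bar".
ops = [Literal("foo", ignore_case=True), Literal("Bar")]
```

Folding the whole subject up front would make "FOObar" look like a match for these ops, which is why the flag lives on each op in this sketch.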
I don't like you
On Thu, Jan 31, 2002 at 05:15:49PM +, Graham Barr wrote:
>
> Yes, I was assuming that. However what is to be gained by case
> folding the input string ?
>
> Because parts of an rx can be case-insensitive while other parts
> are case-sensitive, we will probably need two sorts of ops anyway
>
On Thu, Jan 31, 2002 at 08:54:21AM -0800, Brent Dax wrote:
> Peter Haworth:
> # On Wed, 30 Jan 2002 17:45:58 +, Graham Barr wrote:
> # > On Wed, Jan 30, 2002 at 09:32:49AM -0800, Brent Dax wrote:
> # > > # rx_setprops P0, "i", 2
> # > > # branch $start0
> # > >
Peter Haworth:
# On Wed, 30 Jan 2002 17:45:58 +, Graham Barr wrote:
# > On Wed, Jan 30, 2002 at 09:32:49AM -0800, Brent Dax wrote:
# > > # rx_setprops P0, "i", 2
# > > # branch $start0
# > > # $advance:
# > > # rx_advance P0, $fail
# > >
On Wed, 30 Jan 2002 17:45:58 +, Graham Barr wrote:
> On Wed, Jan 30, 2002 at 09:32:49AM -0800, Brent Dax wrote:
> > # rx_setprops P0, "i", 2
> > # branch $start0
> > # $advance:
> > # rx_advance P0, $fail
> > # $start0:
> > #
On Wednesday 30 January 2002 21:42, Dan Sugalski wrote:
> I think we may want trees as a fundamental data type at some point...
I wonder about the trees
--
Bryan C. Warnock
[EMAIL PROTECTED]
At 6:28 PM -0800 1/30/02, Steve Fink wrote:
>I'm sure in Apoc 5 Larry's going to go way beyond that and embed full
>parsers, not just regularish language matchers, but the above is
>easier to grasp.
Odds are, yes. And don't be surprised if the RE engine's required to
return data structures as we
On Wed, Jan 30, 2002 at 08:37:30PM -0500, Bryan C. Warnock wrote:
> "But if you know they're going to be twenty times slower, why are you doing
> it?" Because we know / think / hope / pray / have been making sacrifices to
Tangential note: current benchmarking indicates that we're doing a lot
b
On Wednesday 30 January 2002 11:13, Ashley Winters wrote:
> First, we set the rx engine to case-insensitive. Why is that bad? It's
> setting a runtime property for what should be compile-time
{snip}
> Now, the current CVS rx engine is/would do this at runtime.
We're also currently a compiler sh
On Wednesday 30 January 2002 12:32, Brent Dax wrote:
> # Mostly, I'd like to hear how either Unicode character-ranges aren't
> # deterministic at compile-time (I doubt that) or how crippling to
>
> One word: locale.
Not that locales couldn't provide pre-compiled character classes.
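
As a rough sketch of what a pre-compiled character class might look like, assuming a locale can be reduced to a per-character predicate (an assumption made here, not something stated in the thread), the class can be computed once and then tested by plain set membership. `compile_class` and `match_class` are hypothetical names.

```python
def compile_class(predicate, limit=0x250):
    """Precompute the set of characters below `limit` accepted by `predicate`."""
    return frozenset(ch for ch in map(chr, range(limit)) if predicate(ch))


# Built once, ahead of matching; str.isalpha stands in for a locale's rule.
ALPHA = compile_class(str.isalpha)


def match_class(char_class, subject, pos):
    """True if the character at `pos` exists and belongs to the class."""
    return pos < len(subject) and subject[pos] in char_class
```

At match time no locale lookup happens at all; the cost was paid when the set was built.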
--
Bryan C. W
On Wed, Jan 30, 2002 at 09:32:49AM -0800, Brent Dax wrote:
> # rx_setprops P0, "i", 2
> # branch $start0
> # $advance:
> # rx_advance P0, $fail
> # $start0:
> # rx_literal P0, "a", $advance
> #
> # First, we set the rx
begin quote from Ashley Winters:
> I think that's exactly what you should be doing! Neither parrot nor the
> rx engine should try to be a full compiler. The rx engine definitely
> should have opcodes in the virtual machine, but those opcodes should
> simply contain state-machine/backtracking info,
Ashley Winters:
# Who the hell am I?
# I've been only a weblog-lurker till now. It's been a couple
# years since
# I last contributed to Perl5. I just read the latest Apocalypse and it
# inspired me to get a parrot snapshot and look around.
Welcome back to the land of the living. :^)
# What's m
Ashley Winters wrote:
>First, we set the rx engine to case-insensitive. Why is that bad? It's
>setting a runtime property for what should be compile-time
>unicode-character-kung-fu. Assuming your "CPU" knows the gritty
>details of unicode in the first place just feels wrong, but I digress.
>Basically, I see a black-box being built in the interests of speed.
>Voodoo array formats, bitmaps, and other such things to avoid actually
>spelling out what the regular expression is doing *in parrot code*.
[snip]
>What I see is that rx_literal is a speed hack to avoid compiling this
>into par
On Wed, Jan 30, 2002 at 08:13:55AM -0800, Ashley Winters wrote:
> I think that's exactly what you should be doing! Neither parrot nor the
> rx engine should try to be a full compiler. The rx engine definitely
> should have opcodes in the virtual machine, but those opcodes should
> simply contain s