J. Cliff Dyer wrote:
On Wed, 2008-07-09 at 12:29 -0700, samwyse wrote:
On Jul 8, 11:01 am, Kris Kennaway <[EMAIL PROTECTED]> wrote:
samwyse wrote:
You might want to look at Plex.
http://www.cosc.canterbury.ac.nz/greg.ewing/python/Plex/
"Another advantage of Plex is that it compiles all of the
Marc 'BlackJack' Rintsch <[EMAIL PROTECTED]>:
> On Mon, 07 Jul 2008 16:44:22 +0200, Sebastian "lunar" Wiesner wrote:
>
>> Mark Wooding <[EMAIL PROTECTED]>:
>>
>>> Sebastian "lunar" Wiesner <[EMAIL PROTECTED]> wrote:
>>>
# perl -e '("a" x 10) =~ /^(ab?)*$/;'
zsh: segmentation fau
On Wed, 2008-07-09 at 12:29 -0700, samwyse wrote:
> On Jul 8, 11:01 am, Kris Kennaway <[EMAIL PROTECTED]> wrote:
> > samwyse wrote:
>
> > > You might want to look at Plex.
> > >http://www.cosc.canterbury.ac.nz/greg.ewing/python/Plex/
> >
> > > "Another advantage of Plex is that it compiles all of
John Machin wrote:
Uh-huh ... try this, then:
http://hkn.eecs.berkeley.edu/~dyoo/python/ahocorasick/
You could use this to find the "Str" cases and the prefixes of the
"re" cases (which seem to be no more complicated than 'foo.*bar.*zot')
and use something slower like Python's re to search the
On Jul 9, 10:06 pm, Kris Kennaway <[EMAIL PROTECTED]> wrote:
> John Machin wrote:
> >> Hmm, unfortunately it's still orders of magnitude slower than grep in my
> >> own application that involves matching lots of strings and regexps
> >> against large files (I killed it after 400 seconds, compared t
samwyse wrote:
On Jul 8, 11:01 am, Kris Kennaway <[EMAIL PROTECTED]> wrote:
samwyse wrote:
You might want to look at Plex.
http://www.cosc.canterbury.ac.nz/greg.ewing/python/Plex/
"Another advantage of Plex is that it compiles all of the regular
expressions into a single DFA. Once that's done
On Jul 8, 11:01 am, Kris Kennaway <[EMAIL PROTECTED]> wrote:
> samwyse wrote:
> > You might want to look at Plex.
> >http://www.cosc.canterbury.ac.nz/greg.ewing/python/Plex/
>
> > "Another advantage of Plex is that it compiles all of the regular
> > expressions into a single DFA. Once that's done,
Jeroen Ruigrok van der Werven wrote:
-On [20080709 14:08], Kris Kennaway ([EMAIL PROTECTED]) wrote:
It's compiler/build output.
Sounds like the FreeBSD ports build cluster. :)
Yes indeed!
Kris, have you tried a PGO build of Python with your specific usage? I
cannot guarantee it will signif
-On [20080709 14:08], Kris Kennaway ([EMAIL PROTECTED]) wrote:
>It's compiler/build output.
Sounds like the FreeBSD ports build cluster. :)
Kris, have you tried a PGO build of Python with your specific usage? I
cannot guarantee it will significantly speed things up though.
Also, a while ago I di
John Machin wrote:
Hmm, unfortunately it's still orders of magnitude slower than grep in my
own application that involves matching lots of strings and regexps
against large files (I killed it after 400 seconds, compared to 1.5 for
grep), and that's leaving aside the much longer compilation time
On Jul 9, 2:01 am, Kris Kennaway <[EMAIL PROTECTED]> wrote:
> samwyse wrote:
> > On Jul 4, 6:43 am, Henning_Thornblad <[EMAIL PROTECTED]>
> > wrote:
> >> What can be the cause of the large difference between re.search and
> >> grep?
>
> >> While doing a simple grep:
> >> grep '[^ "=]*/' input
On Jul 8, 2:48 am, John Machin <[EMAIL PROTECTED]> wrote:
> On Jul 8, 2:51 am, Henning Thornblad <[EMAIL PROTECTED]>
> wrote:
>
>
>
> > When trying to find an alternative way of solving my problem I found
> > that running this script:
>
> > #!/usr/bin/env python
>
> > import re
>
> > row=""
> > for
samwyse wrote:
On Jul 4, 6:43 am, Henning_Thornblad <[EMAIL PROTECTED]>
wrote:
What can be the cause of the large difference between re.search and
grep?
While doing a simple grep:
grep '[^ "=]*/' input (input contains 156,000 a's in
one row)
doesn't even take a second.
Is this
On Jul 4, 6:43 am, Henning_Thornblad <[EMAIL PROTECTED]>
wrote:
> What can be the cause of the large difference between re.search and
> grep?
> While doing a simple grep:
> grep '[^ "=]*/' input (input contains 156,000 a's in
> one row)
> doesn't even take a second.
>
> Is this a bu
On Jul 8, 2:51 am, Henning Thornblad <[EMAIL PROTECTED]>
wrote:
> When trying to find an alternative way of solving my problem I found
> that running this script:
>
> #!/usr/bin/env python
>
> import re
>
> row=""
> for a in range(156000):
>     row+="a"
> print "How many, dude?"
> print re.search(
Paddy wrote:
On Jul 4, 1:36 pm, Peter Otten <[EMAIL PROTECTED]> wrote:
Henning_Thornblad wrote:
What can be the cause of the large difference between re.search and
grep?
grep uses a smarter algorithm ;)
This script takes about 5 min to run on my computer:
#!/usr/bin/env python
import re
ro
When trying to find an alternative way of solving my problem I found
that running this script:
#!/usr/bin/env python
import re
row=""
for a in range(156000):
    row+="a"
print "How many, dude?"
print re.search('/[^ "=]*',row) (the / has moved)
wouldn't take even a second (The re.search part o
On Mon, 07 Jul 2008 16:44:22 +0200, Sebastian "lunar" Wiesner wrote:
> Mark Wooding <[EMAIL PROTECTED]>:
>
>> Sebastian "lunar" Wiesner <[EMAIL PROTECTED]> wrote:
>>
>>> # perl -e '("a" x 10) =~ /^(ab?)*$/;'
>>> zsh: segmentation fault perl -e '("a" x 10) =~ /^(ab?)*$/;'
>>
>> (Did y
Mark Wooding <[EMAIL PROTECTED]>:
> Sebastian "lunar" Wiesner <[EMAIL PROTECTED]> wrote:
>
>> # perl -e '("a" x 10) =~ /^(ab?)*$/;'
>> zsh: segmentation fault perl -e '("a" x 10) =~ /^(ab?)*$/;'
>
> (Did you really run that as root?)
What makes you think so?
--
Freedom is always
Sebastian "lunar" Wiesner <[EMAIL PROTECTED]> wrote:
> # perl -e '("a" x 10) =~ /^(ab?)*$/;'
> zsh: segmentation fault perl -e '("a" x 10) =~ /^(ab?)*$/;'
(Did you really run that as root?)
> It'd be interesting to know, how CL-PPCRE performs here (I don't know this
> library).
Stack o
Sebastian "lunar" Wiesner wrote:
I completely agree. I just believe that combining a finite state
machine for "classic" expressions with some backtracking code is
terribly hard to implement. But I'm not an expert in this; probably some
libraries out there already do this. In
Mark Wooding <[EMAIL PROTECTED]>:
> Sebastian "lunar" Wiesner <[EMAIL PROTECTED]> wrote:
>
>> I just wanted to illustrate that the speed of the given search is
>> somehow related to the complexity of the engine.
>>
>> Btw, other pcre implementations are as slow as Python or "grep -P". I
>> tried
Sebastian "lunar" Wiesner <[EMAIL PROTECTED]> wrote:
> I just wanted to illustrate that the speed of the given search is somehow
> related to the complexity of the engine.
>
> Btw, other pcre implementations are as slow as Python or "grep -P". I tried
> a sample C++-code using pcre++ (a wrapper
On Jul 5, 11:13 am, Mark Dickinson <[EMAIL PROTECTED]> wrote:
> Apparently, grep and Tcl convert a regex to a finite state machine.
...
> But not all PCREs can be converted to a finite state machine
...
> Part of the problem is a lack of agreement on what
> 'regular expression' means. Strictly sp
Terry Reedy <[EMAIL PROTECTED]>:
> Mark Dickinson wrote:
>> On Jul 5, 1:54 pm, Carl Banks <[EMAIL PROTECTED]> wrote:
>
>> Part of the problem is a lack of agreement on what
>> 'regular expression' means.
>
> Twenty years ago, there was. Calling an extended re-derived grammar
> expression like Pe
Mark Dickinson wrote:
On Jul 5, 1:54 pm, Carl Banks <[EMAIL PROTECTED]> wrote:
Part of the problem is a lack of agreement on what
'regular expression' means.
Twenty years ago, there was. Calling an extended re-derived grammar
expression like Perl's a 'regular-expression' is a bit like cal
On Jul 5, 4:13 pm, Mark Dickinson <[EMAIL PROTECTED]> wrote:
> It seems like an appropriate moment to point out *this* paper:
>
> http://swtch.com/~rsc/regexp/regexp1.html
>
That's the one!
Thanks Mark.
- Paddy.
--
http://mail.python.org/mailman/listinfo/python-list
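The finite-state idea discussed in this thread can be illustrated for the one pattern at hand. What follows is only a hand-written sketch, not how grep, Tcl, or the linked paper implement matching: it assumes Python 3, and the names search_linear and BLOCKED are made up for the example. It relies on the fact that any match of '[^ "=]*/' must lie inside a single run of characters other than space, double quote, and equals sign, so one left-to-right pass is enough and nothing is ever re-scanned.

```python
import re

BLOCKED = ' "='  # characters excluded by the class [^ "=]


def search_linear(text):
    """Return the (start, end) span that re.search('[^ "=]*/', text)
    would report, or None, using a single pass with no backtracking."""
    run_start = 0    # index where the current run of allowed characters began
    last_slash = -1  # index of the latest '/' seen inside the current run
    for i, ch in enumerate(text):
        if ch in BLOCKED:
            # The run ends here. If it contained a '/', the leftmost greedy
            # match is the run's start up to and including its last '/'.
            if last_slash >= 0:
                return run_start, last_slash + 1
            run_start = i + 1
        elif ch == '/':
            last_slash = i
    if last_slash >= 0:
        return run_start, last_slash + 1
    return None


if __name__ == "__main__":
    print(search_linear("a" * 156000))   # no '/' anywhere in the row
    print(search_linear("ab/cd/e f"))    # span of 'ab/cd/'
```

The sketch handles only this one pattern; a general engine would build the state machine from the expression instead of hard-coding it.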
On Jul 5, 1:54 pm, Carl Banks <[EMAIL PROTECTED]> wrote:
> I don't think you've illustrated that at all. What you've illustrated
> is that one implementation of regexp optimizes something that another
> doesn't. It might be due to differences in complexity; it might not.
> (Maybe there's somethin
Paddy:
> You could argue that if the costly RE features are not used then maybe
> simpler, faster algorithms should be automatically swapped in but
Many Python implementations contain a TCL interpreter. TCL REs may be
better than Python ones, so it can be interesting to benchmark the
same RE
On Jul 5, 6:44 am, "Sebastian \"lunar\" Wiesner"
<[EMAIL PROTECTED]> wrote:
> Carl Banks <[EMAIL PROTECTED]>:
>
>
>
> > On Jul 5, 4:12 am, "Sebastian \"lunar\" Wiesner"
> > <[EMAIL PROTECTED]> wrote:
> >> Paddy <[EMAIL PROTECTED]>:
>
> >> > On Jul 4, 1:36 pm, Peter Otten <[EMAIL PROTECTED]> wrote:
Carl Banks <[EMAIL PROTECTED]>:
> On Jul 5, 4:12 am, "Sebastian \"lunar\" Wiesner"
> <[EMAIL PROTECTED]> wrote:
>> Paddy <[EMAIL PROTECTED]>:
>>
>>
>>
>> > On Jul 4, 1:36 pm, Peter Otten <[EMAIL PROTECTED]> wrote:
>> >> Henning_Thornblad wrote:
>> >> > What can be the cause of the large difference
On Jul 5, 4:12 am, "Sebastian \"lunar\" Wiesner"
<[EMAIL PROTECTED]> wrote:
> Paddy <[EMAIL PROTECTED]>:
>
>
>
> > On Jul 4, 1:36 pm, Peter Otten <[EMAIL PROTECTED]> wrote:
> >> Henning_Thornblad wrote:
> >> > What can be the cause of the large difference between re.search and
> >> > grep?
>
> >> g
Paddy <[EMAIL PROTECTED]>:
> On Jul 4, 1:36 pm, Peter Otten <[EMAIL PROTECTED]> wrote:
>> Henning_Thornblad wrote:
>> > What can be the cause of the large difference between re.search and
>> > grep?
>>
>> grep uses a smarter algorithm ;)
>>
>>
>>
>> > This script takes about 5 min to run on my com
On Jul 5, 7:01 am, Peter Otten <[EMAIL PROTECTED]> wrote:
> Paddy wrote:
> > It is not a smarter algorithm that is used in grep. Python RE's have
> > more capabilities than grep RE's which need a slower, more complex
> > algorithm.
>
> So you're saying the Python algo is alternatively gifted...
>
>
Filipe Fernandes wrote:
> but why would you say this particular
> regex isn't common enough in real code?
As Carl says, it's not just the regex, it's the combination with a long
line that exposes the re library's weakness.
Peter
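One possible workaround for the long-line case, offered as a suggestion and not something proposed in the thread: the pattern can only match if the row contains a literal '/', so a plain substring test can rule out the expensive search first. The sketch assumes Python 3, and guarded_search is a made-up name.

```python
import re

PATTERN = re.compile(r'[^ "=]*/')


def guarded_search(row):
    """Run the regex only when the literal it requires is present."""
    if '/' not in row:   # plain substring test, linear in len(row)
        return None
    return PATTERN.search(row)


if __name__ == "__main__":
    print(guarded_search("a" * 156000))
    print(guarded_search("ab/c d"))
```

This only helps when no '/' is present at all. A row that has a '/' somewhere after a long run of other characters followed by a space would still send the regex engine over the long run.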
Paddy wrote:
> It is not a smarter algorithm that is used in grep. Python RE's have
> more capabilities than grep RE's which need a slower, more complex
> algorithm.
So you're saying the Python algo is alternatively gifted...
Peter
John Nagle wrote:
> Henning_Thornblad wrote:
>> What can be the cause of the large difference between re.search and
>> grep?
>>
>> This script takes about 5 min to run on my computer:
>> #!/usr/bin/env python
>> import re
>>
>> row=""
>> for a in range(156000):
>>     row+="a"
>> print re.search
On Fri, 4 Jul 2008 20:34:03 -0700 (PDT), Carl Banks wrote:
> On Jul 4, 4:43 pm, "Filipe Fernandes" <[EMAIL PROTECTED]> wrote:
>> On Fri, Jul 4, 2008 at 8:36 AM, Peter Otten <[EMAIL PROTECTED]> wrote:
>> > Henning_Thornblad wrote:
>>
>> >> This script takes about 5 min to run on my computer:
>> >> #
Henning_Thornblad wrote:
What can be the cause of the large difference between re.search and
grep?
This script takes about 5 min to run on my computer:
#!/usr/bin/env python
import re
row=""
for a in range(156000):
    row+="a"
print re.search('[^ "=]*/',row)
While doing a simple grep:
grep '
On Jul 4, 4:43 pm, "Filipe Fernandes" <[EMAIL PROTECTED]> wrote:
> On Fri, Jul 4, 2008 at 8:36 AM, Peter Otten <[EMAIL PROTECTED]> wrote:
> > Henning_Thornblad wrote:
>
> >> What can be the cause of the large difference between re.search and
> >> grep?
>
> > grep uses a smarter algorithm ;)
>
> >>
On Fri, Jul 4, 2008 at 8:36 AM, Peter Otten <[EMAIL PROTECTED]> wrote:
> Henning_Thornblad wrote:
>
>> What can be the cause of the large difference between re.search and
>> grep?
>
> grep uses a smarter algorithm ;)
>
>> This script takes about 5 min to run on my computer:
>> #!/usr/bin/env python
On Jul 4, 1:36 pm, Peter Otten <[EMAIL PROTECTED]> wrote:
> Henning_Thornblad wrote:
> > What can be the cause of the large difference between re.search and
> > grep?
>
> grep uses a smarter algorithm ;)
>
>
>
> > This script takes about 5 min to run on my computer:
> > #!/usr/bin/env python
> > im
Henning_Thornblad wrote:
> What can be the cause of the large difference between re.search and
> grep?
grep uses a smarter algorithm ;)
> This script takes about 5 min to run on my computer:
> #!/usr/bin/env python
> import re
>
> row=""
> for a in range(156000):
>     row+="a"
> print re.sear
Bruno Desthuilliers wrote:
Henning_Thornblad wrote:
What can be the cause of the large difference between re.search and
grep?
This script takes about 5 min to run on my computer:
#!/usr/bin/env python
import re
row=""
for a in range(156000):
    row+="a"
print re.search('[^ "=]*/',row)
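To see how the cost of this search grows with row length without waiting minutes, here is a small timing sketch. It assumes Python 3 (the quoted script is Python 2), and time_search and the chosen sizes are made up for the example. As far as I understand the backtracking engine, at each start position the class runs to the end of the row and then backs off one character at a time looking for '/', so doubling the row length should roughly quadruple the time. Actual numbers depend on the machine and the Python version.

```python
import re
import time

PATTERN = re.compile(r'[^ "=]*/')


def time_search(n):
    """Time one failing search over a row of n 'a' characters."""
    row = "a" * n
    start = time.perf_counter()
    result = PATTERN.search(row)
    elapsed = time.perf_counter() - start
    return result, elapsed


if __name__ == "__main__":
    # Sizes are kept far below 156000 so the loop finishes quickly.
    for n in (1000, 2000, 4000, 8000):
        result, elapsed = time_search(n)
        print(n, result, "%.4f s" % elapsed)
```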
Henning_Thornblad wrote:
What can be the cause of the large difference between re.search and
grep?
This script takes about 5 min to run on my computer:
#!/usr/bin/env python
import re
row=""
for a in range(156000):
    row+="a"
print re.search('[^ "=]*/',row)
While doing a simple grep:
gre