Stephane Chazelas wrote:
Why would it make it slower? AFAICT, PCRE_MULTILINE *adds*
some overhead.
After looking into it I now remember why PCRE_MULTILINE speeds things up. See:
http://git.savannah.gnu.org/cgit/grep.git/commit/?id=f6603c4e1e04dbb87a7232c4b44acc6afdf65fef
where using PCRE_MULT
Stephane Chazelas wrote:
one can
use (?m) if he wants ^ to match the beginning of each line in
the NUL-delimited record instead of just the beginning of the
record.
I think the intent is that ^ and $ should match only the line-terminator
specified by -z (or by -z's absence). So the sort of usa
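To illustrate the (?m) behaviour discussed above, here is a minimal sketch using Python's re module as a stand-in. Python's engine is not PCRE and this is not what grep does internally; the record contents are invented for the example. It only shows how ^ behaves with and without (?m) on a single record that contains an embedded newline, as a NUL-delimited record under -z might.

```python
import re

# One record, as grep -z would see it between NUL bytes (hypothetical data).
# It contains two newline-separated lines.
record = "first line\nsecond line"

# Without (?m), ^ matches only at the beginning of the record.
start_only = [m.start() for m in re.finditer(r"^\w+", record)]

# With (?m), ^ also matches just after each newline inside the record.
each_line = [m.start() for m in re.finditer(r"(?m)^\w+", record)]

print(start_only)
print(each_line)
```

The first list holds a single offset (the start of the record); the second also includes the offset just after the embedded newline.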
On Sat, Nov 19, 2016 at 12:36:12AM -0800, Paul Eggert wrote:
Stephane Chazelas wrote:
one can
use (?m) if he wants ^ to match the beginning of each line in
the NUL-delimited record instead of just the beginning of the
record.
I think the intent is that ^ and $ should match only the
line-termi
Zev Weiss wrote:
I see 'reflags' being tested in Pexecute(), but I don't see it getting set
anywhere
Oops, that somehow got lost during merging. Thanks for catching that. Omitting
the initialization hurt performance due to a failure to use
PCRE_MULTILINE but did not cause a correctnes
This turned into more work than I expected, as I kept finding performance
glitches and/or correctness bugs in the neighborhood. I installed the attached
set of patches. Patch 03 is the crucial one. Patch 10 trivially fixes an earlier
test of mine and I'm too lazy to write a separate email for it
vampyre...@gmail.com wrote:
The documentation also states that -z affects only how input is interpreted.
I don't see where it does that, as the current documentation for -z talks about
both "input and output data". That being said, the manual could be clearer. I
installed the attached.
From 4
2016-11-18 15:37:16 -0800, Paul Eggert:
[...]
> >That might have been the case a long time ago, as I remember
> >some discussion about it as it explained some wrong information
> >in the documentation, but as far as I and gdb can tell, grep
> >2.26 at least calls pcre_exec for every line of the inpu
No further comment, and I merged and installed the patches and am closing this
bug report.
Stephane Chazelas wrote:
I don't know the details of why it's done that way, but I'm not
sure I can see how calling pcre_exec that way can be quicker
than calling it on each individual line/record.
It can be hundreds of times faster in common cases. See:
http://git.savannah.gnu.org/cgit/grep.
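The two call patterns being compared can be sketched as follows. This is a hedged illustration in Python's re module, not grep's code and not PCRE; the function names and sample data are hypothetical. It assumes the pattern can never match a newline character, and that the buffer has no trailing newline. Under those assumptions both strategies select the same lines. The sketch makes no claim about relative speed.

```python
import re

def match_per_line(pattern, buf):
    """Run the matcher once per line (one call per record)."""
    rx = re.compile(pattern)
    return [line for line in buf.split("\n") if rx.search(line)]

def match_whole_buffer(pattern, buf):
    """Run the matcher over the whole buffer with multiline anchors,
    then widen each hit to the line that contains it."""
    rx = re.compile(pattern, re.MULTILINE)
    lines = []
    pos = 0
    while pos <= len(buf):
        m = rx.search(buf, pos)
        if m is None:
            break
        # Find the boundaries of the line containing the match start.
        start = buf.rfind("\n", 0, m.start()) + 1
        end = buf.find("\n", m.start())
        if end == -1:
            end = len(buf)
        lines.append(buf[start:end])
        # Resume the search at the beginning of the next line.
        pos = end + 1
    return lines

buf = "apple\nbanana\navocado\ncherry"
per_line = match_per_line(r"^a\w+$", buf)
whole = match_whole_buffer(r"^a\w+$", buf)
print(per_line, whole)
```

If the pattern can match across a newline (for example via \s or a negated character class), the whole-buffer approach can report different results from the per-line one, which is the kind of correctness concern raised in this thread.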
Paul Eggert wrote:
> Stephane Chazelas wrote:
>> Removing PCRE_MULTILINE (and get back to calling pcre_exec on
>> every record separately) would help except in the cases where the
>> user does:
>>
>> grep -xzP '(?m)a'
>
> I don't think grep can address this problem, as in general that would
> requ
2016-11-19 03:22:23 -0800, Paul Eggert:
> Stephane Chazelas wrote:
>
> >I don't know the details of why it's done that way, but I'm not
> >sure I can see how calling pcre_exec that way can be quicker
> >than calling it on each individual line/record.
>
> It can be hundreds of times faster in comm
2016-11-19 16:14:28 +, Stephane Chazelas:
[...]
> AFAICT, that optimisation only brings a little optimisation in
> some cases (and reduces performance in others) but also reduces
> functionality and breaks users' expectations. IMO, it's not
> worth it.
[...]
To be fair, that last part about us
2016-11-19 16:45:29 +, Stephane Chazelas:
> 2016-11-19 16:14:28 +, Stephane Chazelas:
> [...]
> > AFAICT, that optimisation only brings a little optimisation in
> > some cases (and reduces performance in others) but also reduces
> > functionality and breaks users' expectations. IMO, it's no
Aaron Crane wrote:
I'm not sure it's ideal to use the Perl documentation as
a comprehensive guide to PCRE:
Unfortunately libpcre does not document the regular expression syntax it
supports. Neither does the grep manual, for 'grep -P'. And as you've mentioned,
the Perl manual isn't a reliable
On Sat, Nov 19, 2016 at 1:47 AM, Paul Eggert wrote:
> This turned into more work than I expected, as I kept finding performance
> glitches and/or correctness bugs in the neighborhood. I installed the
> attached set of patches. Patch 03 is the crucial one. Patch 10 trivially
> fixes an earlier test
2016-11-19 15:12:41 -0800, Paul Eggert:
> Aaron Crane wrote:
> >I'm not sure it's ideal to use the Perl documentation as
> >a comprehensive guide to PCRE:
>
> Unfortunately libpcre does not document the regular expression
> syntax it supports. Neither does the grep manual, for 'grep -P'. And
> as
Stephane Chazelas wrote:
You've missed "man pcrepattern".
Yes I did. Thanks. Google was not my friend there.
Stephane Chazelas wrote:
I don't find a x220 factor, more like a x2.5 factor:
I think I found the factor-of-hundreds slowdown, and fixed it in the 2nd
attached patch.
When I tried your benchmark with pcregrep (pcre 8.39, configured with
--enable-unicode-properties), and with ./grep0 (which