Mark Dilger writes:
> I've beaten on this with random patterns and it seems to hold up just fine.
> I have also reviewed the diffs and, for the patterns where the output
> changes, everything looks correct. I can't find anything wrong with this
> patch.
Thanks for testing! I'll push it in a
> On Aug 9, 2021, at 7:20 PM, Tom Lane wrote:
>
> So I took another look at the code, and it doesn't seem that hard
> to make it act this way. The attached passes regression, but
> I've not beat on it with random strings.
> alternate-fix-zero-quantified-nested-parens.patch
I've beaten on thi
> On Aug 9, 2021, at 7:20 PM, Tom Lane wrote:
>
> So I took another look at the code, and it doesn't seem that hard
> to make it act this way. The attached passes regression, but
> I've not beat on it with random strings.
I expect to get back around to testing this in a day or so.
—
Mark Di
I wrote:
> So AFAICS Perl is acting in the way I'm attributing to POSIX.
> But maybe we should actually read POSIX ...
I went to look at the POSIX spec, and was reminded that it lacks
backrefs altogether. (POSIX specifies the "BRE" and "ERE" regex
flavors as described in our docs, but not "ARE".)
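
As a side check, the zero-quantified pattern under discussion, ((.)){0}\2, can be tried in another backtracking engine. The sketch below uses Python's `re` module, which is an assumption of this note and not one of the engines compared in the thread (PostgreSQL, Perl). It only shows how that one engine treats a backref to a group that can never capture; it says nothing about what POSIX or PostgreSQL ought to do.

```python
import re

# The pattern from the thread: both capturing groups sit under {0},
# so neither can ever capture anything before \2 is reached.
pattern = re.compile(r"((.)){0}\2")

# The groups still exist and keep their numbers even though they never match.
group_count = pattern.groups

# In Python's re, a backref to a group that has not participated fails,
# so no position in the subject string can satisfy \2.
result = pattern.search("foo")
empty_result = pattern.search("")

print(group_count, result, empty_result)
```

Python accepts the pattern at compile time and then simply never matches, rather than raising an error for the backref.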
Mark Dilger writes:
> I ran a lot of tests with the patch, and the asserts have all cleared up, but
> I don't know how to think about the user facing differences. If we had a
> good reason for raising an error for these back-references, maybe that'd be
> fine, but it seems to just be an implem
> On Aug 9, 2021, at 4:31 PM, Tom Lane wrote:
>
> This patch should work OK in HEAD and v14, but it will need
> a bit of fooling-about for older branches I think, given that
> they fill v->subs[] a little differently.
Note that I tested your patch *before* master, so the changes look backward
> On Aug 9, 2021, at 6:17 PM, Mark Dilger wrote:
>
> Well, this doesn't die either:
Meaning it doesn't die in the part of the pattern qualified by {0} either. It
does die in the other part. Sorry again for the confusion.
—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterpri
> On Aug 9, 2021, at 6:11 PM, Tom Lane wrote:
>
> Hm. I'm not sure that this example proves anything about Perl's handling
> of the situation, since you didn't use a backref.
Well, this doesn't die either:
if ('foo' =~ m/((??{ die; })(.)(??{ die $1; })){0}((??{ die "got here"; })|\2)/)
{
Mark Dilger writes:
>> On Aug 9, 2021, at 4:31 PM, Tom Lane wrote:
>> There is a potentially interesting definitional question:
>> what exactly ought this regexp do?
>> ((.)){0}\2
>> Because the capturing paren sets are zero-quantified, they will
>> never be matched to any characters, so the
Mark Dilger writes:
>> On Aug 9, 2021, at 12:14 PM, Tom Lane wrote:
>> Pushed, but while re-reading it before commit I noticed that there's
>> some more fairly low-hanging fruit in regexp_replace().
> I've been reviewing and testing this (let-regexp_replace-use-NOSUB.patch)
> since you sent it
> On Aug 9, 2021, at 5:14 PM, Mark Dilger wrote:
>
>our $match;
>if ('foo' =~ m/((.)(??{ die; })){0}(..)/)
I left in a stray variable. A prior version of this script was assigning to
$match where it now has die. Sorry for any confusion.
—
Mark Dilger
EnterpriseDB: http://www.enter
> On Aug 9, 2021, at 4:31 PM, Tom Lane wrote:
>
> There is a potentially interesting definitional question:
> what exactly ought this regexp do?
>
> ((.)){0}\2
>
> Because the capturing paren sets are zero-quantified, they will
> never be matched to any characters, so the backr
> On Aug 9, 2021, at 12:14 PM, Tom Lane wrote:
>
> Pushed, but while re-reading it before commit I noticed that there's
> some more fairly low-hanging fruit in regexp_replace(). As I had it
> in that patch, it never used REG_NOSUB because of the possibility
> that the replacement string uses
Mark Dilger writes:
> +select regexp_split_to_array('', '(?:((?:q+))){0}(\1){0,0}?*[^]');
> +server closed the connection unexpectedly
Here's a quick draft patch for this. Basically it moves the
responsibility for clearing v->subs[] pointers into the freesubre()
recursion, so that it will happen
I wrote:
> Hmmm ... yeah, I see it too. This points up something I'd wondered
> about before, which is whether the code that "cancels everything"
> after detecting {0} is really OK. It throws away the outer subre
> *and children* without worrying about what might be inside, and
> here we see that
Mark Dilger writes:
> I can still trigger the old bug for which we thought we'd pushed a fix. The
> test case below crashes on master (e12694523e7e4482a052236f12d3d8b58be9a22c),
> and also on the fixed version "Make regexp engine's backref-related
> compilation state more bulletproof."
> (cb7
Tom,
I can still trigger the old bug for which we thought we'd pushed a fix. The
test case below crashes on master (e12694523e7e4482a052236f12d3d8b58be9a22c),
and also on the fixed version "Make regexp engine's backref-related compilation
state more bulletproof." (cb76fbd7ec87e44b3c53165d68dc2
Mark Dilger writes:
> The patch looks ready to commit. I don't expect to test it any further
> unless you have something in particular you'd like me to focus on.
Pushed, but while re-reading it before commit I noticed that there's
some more fairly low-hanging fruit in regexp_replace(). As I ha
> On Aug 8, 2021, at 3:28 PM, Tom Lane wrote:
>
> Cool, thanks. I also tried your millions-of-random-regexps script
> and didn't find any difference between the results from HEAD and
> those from the v3 patch.
The patch looks ready to commit. I don't expect to test it any further unless
yo
Mark Dilger writes:
> I have applied your latest patch and do not see any problems with it. All my
> tests pass with no asserts and with no differences in results vs. master.
> This is a test suite of nearly 1.5 million separate regular expressions.
Cool, thanks. I also tried your millions-o
> On Aug 8, 2021, at 1:25 PM, Tom Lane wrote:
>
> Ugh. The regex engine is finding the match correctly, but it's failing to
> tell the caller where it is :-(. I was a little too cute in optimizing
> the regmatch_t result-vector copying in pg_regexec, and forgot to ensure
> that the overall m
Mark Dilger writes:
> Hmm. This changes the behavior when applied against master
> (c1132aae336c41cf9d316222e525d8d593c2b5d2):
> select regexp_split_to_array('uuuzkodphfbfbfb', '((.))(\1\2)', 'ntw');
> regexp_split_to_array
> ---
> - {"",zkodphfbfbfb}
> + {uuuzkodphfbfbf
> On Aug 8, 2021, at 10:04 AM, Tom Lane wrote:
>
> I've also rebased over the bug fixes from the other thread,
> and added a couple more test cases.
>
> regards, tom lane
Hmm. This changes the behavior when applied against master
(c1132aae336c41cf9d316222e525d8d593c2b
Mark Dilger writes:
> The patch triggers an assertion that master does not:
> +select 'azrlfkjbjgidgryryiglcabkgqluflu' !~ '(.(.)((.)))((?:(\3)))';
On looking into this, it's pretty simple: regexec.c has an assertion
that a pure-capture subre node ought to be doing some capturing.
case
> On Aug 5, 2021, at 7:36 AM, Tom Lane wrote:
>
> Probably should add more cases...
The patch triggers an assertion that master does not:
+select 'azrlfkjbjgidgryryiglcabkgqluflu' !~ '(.(.)((.)))((?:(\3)))';
+server closed the connection unexpectedly
+ This probably means the server termin
On 8/5/21 10:39 AM, Robert Haas wrote:
> On Thu, Aug 5, 2021 at 9:43 AM Andrew Dunstan wrote:
>> On 8/4/21 6:15 PM, Tom Lane wrote:
>>> Here's a little finger exercise that improves a case that's bothered me
>>> for awhile. In a POSIX regexp, parentheses cause capturing by default;
>>> you have
Robert Haas writes:
> Well, I consider myself a pretty fair perl programmer, and I know
> there's a way to do that, but I never do it, and I would have had to
> look up the exact syntax. So +1 from me for anything automatic that
> avoids paying the overhead in some cases.
That's my feeling about
On Thu, Aug 5, 2021 at 9:43 AM Andrew Dunstan wrote:
> On 8/4/21 6:15 PM, Tom Lane wrote:
> > Here's a little finger exercise that improves a case that's bothered me
> > for awhile. In a POSIX regexp, parentheses cause capturing by default;
> > you have to write the very non-obvious "(?:...)" if
Andrew Dunstan writes:
> I'm a bit worried about how you'll keep track of back-ref numbering
> since back-refs only count capturing groups, and you're silently turning
> a capturing group into a non-capturing group.
They're already numbered at this point, and we aren't changing the numbers
of the
On 8/4/21 6:15 PM, Tom Lane wrote:
> Here's a little finger exercise that improves a case that's bothered me
> for awhile. In a POSIX regexp, parentheses cause capturing by default;
> you have to write the very non-obvious "(?:...)" if you don't want the
> matching substring to be reported by th
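
The capturing versus non-capturing distinction that opens the thread, and the backref-numbering concern raised in reply, can be illustrated with a small sketch. It uses Python's `re` as a stand-in (an assumption; PostgreSQL's regex engine is separate code), and the sample patterns and strings are made up for illustration.

```python
import re

# Plain parentheses capture: the engine must record what group 1 matched.
capturing = re.compile(r"(ab)+c")
# "(?:...)" groups without capturing, so there is nothing to report.
non_capturing = re.compile(r"(?:ab)+c")

cap_match = capturing.fullmatch("ababc")
noncap_match = non_capturing.fullmatch("ababc")

# Backref numbers count only capturing groups: here \1 refers to "(y)",
# because the leading "(?:x)" does not consume a group number.
numbered = re.compile(r"(?:x)(y)\1")
backref_match = numbered.fullmatch("xyy")

print(capturing.groups, cap_match.group(1))
print(non_capturing.groups, noncap_match.group(0))
print(numbered.groups, backref_match.group(1))
```

Both forms accept the same strings; the difference is only whether the engine has to keep track of the substring matched inside the parentheses.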