Mark Dilger <mark.dil...@enterprisedb.com> writes:
>> On Aug 9, 2021, at 4:31 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
>> There is a potentially interesting definitional question:
>> what exactly ought this regexp do?
>>         ((.)){0}\2
>> Because the capturing paren sets are zero-quantified, they will
>> never be matched to any characters, so the backref can never
>> have any defined referent.
> Perl regular expressions are not POSIX, but if there is a principled reason
> POSIX should differ from perl on this, we should be clear what that is:

> if ('foo' =~ m/((.)(??{ die; })){0}(..)/)
> {
>     print "captured 1 $1\n" if defined $1;
>     print "captured 2 $2\n" if defined $2;
>     print "captured 3 $3\n" if defined $3;
>     print "captured 4 $4\n" if defined $4;
>     print "match = $match\n" if defined $match;
> }

Hm.  I'm not sure that this example proves anything about Perl's handling
of the situation, since you didn't use a backref.  I tried both

    if ('foo' =~ m/((.)){0}\1/)

    if ('foo' =~ m/((.)){0}\2/)

and while neither throws an error, they don't succeed either.  So AFAICS
Perl is acting in the way I'm attributing to POSIX.  But maybe we should
actually read POSIX ...

>> ... I guess Spencer did think about this to some extent -- he
>> just forgot about the possibility of nested parens.

> Ugg.  That means our code throws an error where perl does not, pretty
> well negating my point above.  If we're already throwing an error for
> this type of thing, I agree we should be consistent about it.  My
> personal preference would have been to do the same thing as perl, but it
> seems that ship has already sailed.

Removing an error case is usually an easier sell than adding one.
However, the fact that the simplest case (viz, '(.){0}\1') has always
thrown an error and nobody has complained in twenty-ish years suggests
that nobody much cares.

            regards, tom lane
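
For anyone wanting to reproduce the Perl check described above, here is a
minimal self-contained sketch.  The helper name try_match is hypothetical
and only for illustration; it is not from the original message.  It tries
each pattern against 'foo' and reports whether Perl matched, failed to
match, or raised an error.  Going by the report above, one would expect
neither pattern to error out or to match, but results may depend on the
Perl version.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original message): try a pattern
# against a subject and report whether it matched, failed, or raised an error.
sub try_match {
    my ($pattern, $subject) = @_;
    # eval traps any error raised while compiling or running the regex.
    my $matched = eval { $subject =~ /$pattern/ ? 1 : 0 };
    return "error" unless defined $matched;
    return $matched ? "matched" : "no match";
}

# Single quotes keep \1 and \2 as literal backslash sequences.
for my $pattern ('((.)){0}\1', '((.)){0}\2') {
    printf "%-12s => %s\n", $pattern, try_match($pattern, 'foo');
}
```

Wrapping the match in eval is what separates "throws an error" from
"doesn't succeed", which is the distinction at issue in this thread.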