Hackers,

Please find attached an improvement to the error messages given for invalid 
backreference usage:

 select 'xyz' ~ '(.)(.)\3';
 ERROR:  invalid regular expression: invalid backreference number
 select 'xyz' ~ '(.)(.)(?=\2)';
-ERROR:  invalid regular expression: invalid backreference number
+ERROR:  invalid regular expression: backreference in lookaround assertion

The first regexp is invalid because only two capture groups exist, so \3
doesn't refer to anything.  The second regexp is rejected for a different
reason: the regular expression system does not support backreferences within
lookaround assertions.  (See the docs, section 9.7.3.6. Limits And
Compatibility.)  It is flat wrong to say the backreference number is invalid
in that case, since \2 refers to a perfectly valid capture group.
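
To make the distinction between the two cases concrete, here is a minimal
standalone sketch in C.  It is not the code from the patch and does not use
the regex engine's internals; the names br_status, BR_BAD_NUMBER,
BR_IN_LOOKAROUND and classify_backrefs are hypothetical, invented for this
illustration.  It assumes single-digit backreferences, ignores bracket
expressions, and counts a group as soon as its opening parenthesis is seen.
Checking the lookaround case before the number is a choice made for the
sketch.

```c
/* Hypothetical result codes for this sketch; not values from regex/regex.h. */
typedef enum { BR_OK, BR_BAD_NUMBER, BR_IN_LOOKAROUND } br_status;

#define MAX_DEPTH 64

/*
 * Toy scanner: classify the first problematic backreference in `re`.
 * Simplifications (assumptions of this sketch): single-digit backreferences
 * only, no bracket expressions, and a capture group is counted when its
 * opening parenthesis is seen.
 */
br_status classify_backrefs(const char *re)
{
    int is_lookaround[MAX_DEPTH];   /* one flag per currently open paren */
    int depth = 0;                  /* number of currently open parens */
    int lookaround_depth = 0;       /* how many of those are lookarounds */
    int ncaptures = 0;              /* capture groups opened so far */

    for (const char *p = re; *p != '\0'; p++) {
        if (*p == '\\') {
            if (p[1] == '\0')
                break;              /* trailing backslash: nothing to check */
            p++;                    /* look at the escaped character */
            if (*p >= '1' && *p <= '9') {
                if (lookaround_depth > 0)
                    return BR_IN_LOOKAROUND;
                if (*p - '0' > ncaptures)
                    return BR_BAD_NUMBER;
            }
        } else if (*p == '(') {
            int la = 0;

            if (p[1] == '?') {
                /* (?= (?! (?<= (?<! are lookarounds; any other "(?" form
                 * is treated as non-capturing in this sketch */
                if (p[2] == '=' || p[2] == '!' ||
                    (p[2] == '<' && (p[3] == '=' || p[3] == '!')))
                    la = 1;
            } else {
                ncaptures++;
            }
            if (depth == MAX_DEPTH)
                return BR_OK;       /* sketch gives up on very deep nesting */
            is_lookaround[depth++] = la;
            lookaround_depth += la;
        } else if (*p == ')' && depth > 0) {
            depth--;
            lookaround_depth -= is_lookaround[depth];
        }
    }
    return BR_OK;
}
```

With this sketch, '(.)(.)\3' classifies as BR_BAD_NUMBER and '(.)(.)(?=\2)'
as BR_IN_LOOKAROUND, which mirrors the two distinct messages shown above.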

The patch defines a new error code REG_ENOBREF in regex/regex.h right next to 
REG_ESUBREG from which it is split out, rather than at the end of the list.  Is 
there a project preference to add it at the end?  Certainly, that would give a 
shorter git diff.

Are there dependencies on the current error messages that would prevent such
changes?

Attachment: v1-0001-Distinguishing-regular-expression-backref-errors.patch
Description: Binary data

 
—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


