I've been working through this patch, and trying to verify compatibility against Oracle and DB2, and I see some points that need discussion or at least recording for the archives.

* In Oracle, while the documentation for regexp_instr says that return_option should only be 0 or 1, experimentation with sqlfiddle shows that any nonzero value is silently treated as 1. The patch raises an error for other values, which I think is a good idea. (IBM's docs say that DB2 raises an error too, though I can't test that.) We don't need to be bug-compatible to that extent.

* What should happen when the subexpression/capture group number of regexp_instr or regexp_substr exceeds the number of parenthesized subexpressions of the regexp? Oracle silently returns a no-match result (0 or NULL), as does this patch. However, IBM's docs say that DB2 raises an error. I'm inclined to think that this is likewise taking bug-compatibility too far, and that we should raise an error like DB2. There are clearly cases where throwing an error would help debug a faulty call, while I'm less clear on a use-case where not throwing an error would be useful.

* IBM's docs say that both regexp_count and regexp_like have arguments "string, pattern [, start] [, flags]" --- that is, each of start and flags can be independently specified or omitted. The patch follows Oracle, which has no start option for regexp_like, and where you can't write flags for regexp_count without writing start. This is fine by me, because doing these like DB2 would introduce the same which-argument-is-this issues as we're being forced to cope with for regexp_replace. I don't think we need to accept ambiguity in these cases too. But it's worth memorializing this decision in the thread.

* The patch has most of these functions silently ignoring the 'g' flag, but I think they should raise errors instead. Oracle doesn't accept a 'g' flag for these, so why should we? The only case where that logic doesn't hold is regexp_replace, because depending on which syntax you use the 'g' flag might or might not be meaningful. So for regexp_replace, I'd vote for silently ignoring 'g' if the occurrence-number parameter is given, while honoring it if not.

I've already made changes in my local copy per the last item, but I've not done anything about throwing errors for out-of-range subexpression numbers. Anybody have an opinion about that one?

regards, tom lane
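
P.S. To make the argument checks under discussion concrete, here is a rough Python model of the behavior I'm arguing for. It is only a sketch, not the patch: the function name regexp_instr_sketch and its parameter names are made up for illustration, Python's re module stands in for our regex engine (so pattern syntax differs), and only the 'i' flag is modelled. It rejects a return_option other than 0 or 1, rejects 'g', and raises an error when the subexpression number exceeds the number of capture groups, rather than quietly returning 0.

```python
import re


def regexp_instr_sketch(string, pattern, start=1, occurrence=1,
                        return_option=0, flags="", subexpr=0):
    """Hypothetical model of the proposed checks; not the actual patch."""
    # Error out instead of treating any nonzero value as 1.
    if return_option not in (0, 1):
        raise ValueError("return_option must be 0 or 1")
    # 'g' is meaningless here, so reject it rather than ignore it.
    if "g" in flags:
        raise ValueError("the 'g' flag is not accepted by this function")
    if start < 1 or occurrence < 1 or subexpr < 0:
        raise ValueError("start and occurrence must be >= 1, subexpr >= 0")

    # Assumption: only the case-insensitivity flag is modelled.
    compiled = re.compile(pattern, re.IGNORECASE if "i" in flags else 0)

    # DB2-like behavior: an out-of-range group number is an error.
    if subexpr > compiled.groups:
        raise ValueError("subexpression number exceeds number of "
                         "parenthesized subexpressions")

    # Positions are 1-based on the SQL side, 0-based in Python.
    for n, m in enumerate(compiled.finditer(string, start - 1), start=1):
        if n == occurrence:
            if m.start(subexpr) < 0:
                return 0  # group exists but did not participate
            pos = m.end(subexpr) if return_option else m.start(subexpr)
            return pos + 1
    return 0  # no such occurrence
```

With this model, regexp_instr_sketch('abcdef', 'c(d)e', subexpr=1) reports a position, while subexpr=2 raises an error that points straight at the faulty call, which is the debugging benefit I have in mind.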