Hi, When looking into how to implement a global replace of multiple substrings (each with their own replacement) in sql or plpgsql, I'm wondering if/how an RE with an alternation can be used.
The basic idea is to iterate on the rows produced by regexp_matches(string, '(.*?)(foo|bar|foobar)', 'g') to break down the string into pairs of (non-matching segment, matching segment) so that a final result can be assembled from that (setting aside the last non-matching segment, that can be retrieved in a final step). The difficulty is that the longest strings in the alternation should be prioritized, but the starting (.*?) makes the RE non-greedy so "foo" is choosen over "foobar". The doc at [1] leaves me unoptimistic when it mentions that: "...when an RE contains both greedy and non-greedy subexpressions, the total match length is either as long as possible or as short as possible, according to the attribute assigned to the whole RE. The attributes assigned to the subexpressions only affect how much of that match they are allowed to “eat” relative to each other." Also it gives this example of forcing the RE as a whole to be greedy despite it having a non-greedy sub-RE: regexp_match('abc01234xyz', '(?:(.*?)(\d+)(.*)){1,1}') but it doesn't seem to be able to produce the desired result in the case of the RE in the middle being an alternation with strings of different lengths. The ideal RE with a (foo|foobar|bar) alternation, when applied globally to a string like 'the string has foo and foobar and bar and more' would produce something like: {"the string has ","foo"} {" and ","foobar"} {" and ","bar"} Is that possible with regexp_matches? [1] https://www.postgresql.org/docs/current/functions-matching.html