You are right. I also found the same behaviour when using, e.g., the UNIX sed command.
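
The same non-overlapping behaviour can be reproduced outside the database. The following is a minimal sketch in Python, not PostgreSQL: it assumes Python's re module treats this pattern the same way for this input, and the function names and the max_passes guard are made up for illustration. It runs one global pass, then repeats the pass until the string stops changing, with a cap so that a replacement which never settles cannot loop forever.

```python
import re

# Pattern and replacement taken from the regexp_replace call quoted below.
PATTERN = r'([^,]+)(,\1)?($|,)'
REPLACEMENT = r'\1\3'


def dedupe_once(text):
    """One global pass: matches are non-overlapping, and text that has
    already been replaced is not scanned again."""
    return re.sub(PATTERN, REPLACEMENT, text)


def dedupe_until_stable(text, max_passes=100):
    """Repeat the global pass until nothing changes.

    max_passes is a hypothetical safety cap: substitute-and-rescan can
    run away for replacements that keep producing new matches.
    """
    for _ in range(max_passes):
        new_text = dedupe_once(text)
        if new_text == text:
            return text
        text = new_text
    raise RuntimeError("no fixed point reached within max_passes")


if __name__ == "__main__":
    sample = 'one,one,one,two,two,three,three'
    print(dedupe_once(sample))          # one,one,two,three
    print(dedupe_until_stable(sample))  # one,two,three
```

A single pass leaves 'one,one' behind because the third 'one' starts a fresh match after the first two were consumed. Repeating the pass is one way to get the result I expected, at the cost of rescanning the string.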

Ingolf

On Mon, Aug 23, 2021 at 4:24 PM Francisco Olarte <fola...@peoplecall.com> wrote:

> Ingolf:
>
> On Mon, Aug 23, 2021 at 2:39 PM Markhof, Ingolf
> <ingolf.mark...@de.verizon.com> wrote:
> > Yes, When I use (\1)? instead of (\1)+, the expression is evaluated
> > quickly, but it doesn't return what I want. Once a word is written, it is
> > not subject to matching again. i.e.
> > select regexp_replace( --> remove double entries
> > 'one,one,one,two,two,three,three',
> > '([^,]+)(,\1)?($|,)',
> > '\1\3',
> > 'g'
> > ) as res;
> >
> ...
> > Honestly, this behaviour seems to be incorrect for me. Once the system
> > replaces the first two 'one,one,' by a single 'one,', I'd expect to match
> > this replaced one 'one,' with the next 'one,' following, replacing these
> > two by another, single 'one,', again...
>
> I think your expectation is misguided. All the regexp engines I've
> used do it this way, when asked to match "g"lobally they do
> non-overlapping matches, they do not substitute and recurse with the
> modified string.
>
> Also, your way opens the door to run-away or infinite loops (
> rr('a','a','aa','g') or rr('a','a','a','g'), not to speak of
> r('x','','','g') ). Even a misguided r(str, '_+','_','g'), used
> sometimes to normalize space runs and similar things, can go into a
> loop.
>
> Francisco Olarte.
>

======================================================================
Verizon Deutschland GmbH - Sebrathweg 20, 44149 Dortmund, Germany - Amtsgericht Dortmund, HRB 14952 - Geschäftsführer: Detlef Eppig - Vorsitzender des Aufsichtsrats: Francesco de Maio