> On Aug 20, 2021, at 12:51 PM, Miles Elam <miles.e...@productops.com> wrote:
>
> Unbounded ranges seem like a problem.
Seems so. The problem appears to be in regcomp.c's repeat() function, which
handles {1,SOME} differently from {1,INF}.
> Seems worth trying a range from 1 to N where you play around with N to find
> your optimum performance/functionality tradeoff. {1,20} is like '+' but
> clamps at 20.
For any such value (5, 20, whatever), there can always be a string with more
repeated words than the number you've chosen, in which case the call to
regexp_replace won't do what you want. There is also an upper bound on N,
because values above 255 will draw a regex compilation error. So it seems worth
a bit of work to determine why the regex engine performs badly in these cases.
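
To make the clamping problem concrete, here is a rough sketch in Python rather
than SQL. Python's re module is a different engine from PostgreSQL's, and the
pattern and the name dedupe_bounded are made up for illustration, since the
OP's actual regex isn't reproduced here. It only shows that a single pass with
{1,N} leaves duplicates behind once a word repeats more than N extra times.

```python
import re


def dedupe_bounded(text: str, max_repeat: int) -> str:
    """Collapse runs of a repeated word in one pass, using a bounded {1,N}.

    Hypothetical pattern: a word, then 1..max_repeat further copies of it
    separated by whitespace. This is an assumption, not the OP's regex.
    """
    pattern = r"\b(\w+)(?:\s+\1){1,%d}\b" % max_repeat
    return re.sub(pattern, r"\1", text)


# Six copies of "go": a bound of 3 consumes only four of them in the first
# match, so a duplicate survives the single pass.
print(dedupe_bounded("go go go go go go stop", 3))
# A bound of 5 happens to cover this input, but a longer run would defeat it.
print(dedupe_bounded("go go go go go go stop", 5))
```

Whatever N is chosen, an input with a longer run shows the same leftover.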
It sounds like the OP will be working around this problem by refactoring to
call regexp_replace multiple times until all repeats are eradicated, but I
don't think such workarounds should be necessary.
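
For reference, that kind of workaround might look roughly like the following.
Again this is Python standing in for repeated regexp_replace calls, with a
hypothetical pattern and function name, so treat it as a sketch of the idea
rather than what the OP actually wrote.

```python
import re


def collapse_repeats(text: str, max_repeat: int = 20) -> str:
    """Apply a bounded dedupe pass repeatedly until the text stops changing.

    The pattern is an assumed stand-in: a word followed by 1..max_repeat
    whitespace-separated copies of itself.
    """
    pattern = re.compile(r"\b(\w+)(?:\s+\1){1,%d}\b" % max_repeat)
    while True:
        reduced = pattern.sub(r"\1", text)
        if reduced == text:
            # Fixed point reached: no run of repeats is left to collapse.
            return text
        text = reduced


print(collapse_repeats("go go go go go go stop", max_repeat=3))
```

Each pass either shortens the string or leaves it unchanged, so the loop
terminates, but it costs one extra scan per pass compared with a single
unbounded match.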
—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company