Re: [E] Regexp_replace bug / does not terminate on long strings

Mark Dilger Fri, 20 Aug 2021 13:27:01 -0700

> On Aug 20, 2021, at 12:51 PM, Miles Elam <[email protected]> wrote:
> 
> Unbounded ranges seem like a problem.

Seems so.  The problem appears to be in regcomp.c's repeat() function, which 
handles {1,SOME} differently from {1,INF}.

> Seems worth trying a range from 1 to N where you play around with N to find 
> your optimum performance/functionality tradeoff. {1,20} is like '+' but 
> clamps at 20.

For any such value (5, 20, whatever) there can always be a string with more 
repeated words than the number you've chosen, and the call to regexp_replace 
won't do what you want.  There is also an upper bound at work, because values 
above 255 will draw a regex compilation error.  So it seems worth a bit of work 
to determine why the regex engine has bad performance in these cases.
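
To illustrate the clamping issue, here is a minimal sketch in Python.  Note 
the assumptions: Python's re module is a different engine from PostgreSQL's, 
the pattern is a hypothetical stand-in for the OP's, and the function names 
are made up for this example.  It only shows that a {1,N} bound leaves 
duplicates behind once a word repeats more than N times.

```python
import re


def dedupe_bounded(text: str, n: int) -> str:
    """Collapse a word followed by 1..n repeats of itself (hypothetical pattern)."""
    pattern = r"\b(\w+)(?:\s+\1){1,%d}\b" % n
    return re.sub(pattern, r"\1", text)


def dedupe_unbounded(text: str) -> str:
    """Same idea, but with '+' so any number of repeats is collapsed."""
    return re.sub(r"\b(\w+)(?:\s+\1)+\b", r"\1", text)


# Six copies of "a" with a clamp of 3: the first match consumes four copies,
# the second match consumes the remaining two, so two words survive.
print(dedupe_bounded("a a a a a a", 3))   # a a
print(dedupe_unbounded("a a a a a a"))    # a
```
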

It sounds like the OP will be working around this problem by refactoring to 
call regexp_replace multiple times until all repeats are eradicated, but I 
don't think such workarounds should be necessary.
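
For reference, the repeated-call workaround could look roughly like the 
following Python sketch.  This is an assumption about the shape of the 
workaround, not the OP's actual code: the pattern and the function name are 
hypothetical, and Python's re engine stands in for PostgreSQL's.  The idea is 
simply to reapply a bounded replacement until the string stops changing.

```python
import re


def dedupe_until_stable(text: str, n: int = 20) -> str:
    """Apply a bounded {1,n} replacement repeatedly until nothing changes."""
    # Hypothetical pattern: a word followed by 1..n whitespace-separated copies.
    pattern = re.compile(r"\b(\w+)(?:\s+\1){1,%d}\b" % n)
    while True:
        collapsed = pattern.sub(r"\1", text)
        if collapsed == text:
            # Fixed point reached: no repeated words remain.
            return collapsed
        text = collapsed


print(dedupe_until_stable("a a a a a a", n=3))  # a
```

Each pass shortens the string whenever it matches, so the loop terminates, 
but it costs one extra scan per pass compared with a single unbounded match.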

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
