Re: [HACKERS] Regex with > 32k different chars causes a backend crash

2013-04-04 Thread Heikki Linnakangas
On 04.04.2013 03:32, Noah Misch wrote:
On Wed, Apr 03, 2013 at 08:09:15PM +0300, Heikki Linnakangas wrote:
--- a/src/include/regex/regguts.h
+++ b/src/include/regex/regguts.h
@@ -148,6 +148,7 @@
typedef short color; /* colors of characters */
typedef int pcolor;

Re: [HACKERS] Regex with > 32k different chars causes a backend crash

2013-04-03 Thread Noah Misch
On Wed, Apr 03, 2013 at 08:09:15PM +0300, Heikki Linnakangas wrote:
> --- a/src/include/regex/regguts.h
> +++ b/src/include/regex/regguts.h
> @@ -148,6 +148,7 @@
> typedef short color; /* colors of characters */
> typedef int pcolor; /* what color promotes

Re: [HACKERS] Regex with > 32k different chars causes a backend crash

2013-04-03 Thread Tom Lane
Heikki Linnakangas writes:
> Attached is a patch to add the overflow check. I used the error message
> "too many distinct characters in regex". That's not totally accurate,
> because there isn't a limit on distinct characters per se, but on the
> number of colors. Conceivably, you could have a

Re: [HACKERS] Regex with > 32k different chars causes a backend crash

2013-04-03 Thread Heikki Linnakangas
On 03.04.2013 18:41, Tom Lane wrote:
Heikki Linnakangas writes:
On 03.04.2013 18:21, Tom Lane wrote:
Obviously Henry didn't think that far ahead. I agree that throwing an error is the best solution, and that widening "color" is probably not what we want to do. You want to fix that, or shall

Re: [HACKERS] Regex with > 32k different chars causes a backend crash

2013-04-03 Thread Tom Lane
Heikki Linnakangas writes:
> On 03.04.2013 18:21, Tom Lane wrote:
>> Obviously Henry didn't think that far ahead. I agree that throwing
>> an error is the best solution, and that widening "color" is probably
>> not what we want to do. You want to fix that, or shall I?
> I can do it. I assume th

Re: [HACKERS] Regex with > 32k different chars causes a backend crash

2013-04-03 Thread Heikki Linnakangas
On 03.04.2013 18:21, Tom Lane wrote:
Heikki Linnakangas writes:
A regex with that many different colors is an extreme case, so I think it's enough to turn the assertion in newcolor() into a run-time check, and throw a "too many colors in regexp" error. Alternatively, we could expand 'color' fro

Re: [HACKERS] Regex with > 32k different chars causes a backend crash

2013-04-03 Thread Tom Lane
Heikki Linnakangas writes:
> A regex with that many different colors is an extreme case, so I think
> it's enough to turn the assertion in newcolor() into a run-time check,
> and throw a "too many colors in regexp" error. Alternatively, we could
> expand 'color' from short to int, but that woul
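
The proposal quoted above (replace the assertion in newcolor() with a run-time check) can be illustrated with a small standalone sketch. This is not the patch from the thread and not the regex library's real code: the struct, the `_sketch` names, the failure marker, and the exact ceiling are all assumptions made for illustration. Only `typedef short color` is taken from the quoted diff.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

typedef short color;                  /* same typedef as in the quoted regguts.h diff */

/* Hypothetical names, not taken from the PostgreSQL sources. */
#define MAX_COLOR_SKETCH SHRT_MAX     /* largest value a color may take here */
#define NO_COLOR_SKETCH ((color) -1)  /* "allocation failed" marker */

struct colormap_sketch
{
    long ncolors;   /* colors handed out so far; long, so it cannot wrap at SHRT_MAX */
    int  too_many;  /* set once the limit has been hit */
};

/*
 * Hand out the next color number. Instead of asserting that the counter
 * still fits in a short, check at run time and report failure, so the
 * caller can raise an error rather than crash.
 */
color
newcolor_sketch(struct colormap_sketch *cm)
{
    assert(cm != NULL);

    if (cm->ncolors > MAX_COLOR_SKETCH)
    {
        cm->too_many = 1;
        return NO_COLOR_SKETCH;
    }
    return (color) cm->ncolors++;
}
```

In this sketch a caller would test for `NO_COLOR_SKETCH` (or the `too_many` flag) and turn it into a user-visible error such as the "too many colors in regexp" message suggested in the thread. The check stays the same size regardless of input, which is the appeal over widening `color`.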

[HACKERS] Regex with > 32k different chars causes a backend crash

2013-04-03 Thread Heikki Linnakangas
While playing with Alexander's pg_trgm regexp patch, I noticed that the regexp library trips an assertion (if enabled) or crashes, when passed an input string that contains more than 32k different characters:

select 'foo' ~ (select string_agg(chr(x),'') from generate_series(100, 35000) x) as n
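
To see why the query above crosses the limit, the arithmetic can be checked with a short sketch. It assumes that each distinct character in the pattern ends up with its own color and that the ceiling on colors is simply SHRT_MAX. Both are simplifications (the real library may reserve some color values), and the function names are hypothetical.

```c
#include <limits.h>

typedef short color;    /* same typedef as in the quoted regguts.h diff */

/* Number of values produced by generate_series(first, last), step 1. */
long
series_length(long first, long last)
{
    return (last < first) ? 0 : last - first + 1;
}

/* Assumption: at most SHRT_MAX colors can be numbered with a short. */
int
fits_in_color(long ncolors)
{
    return ncolors <= (long) SHRT_MAX;
}
```

With the values from the report, `series_length(100, 35000)` is 34901, which is above the 32767 that a 16-bit `short` can hold, so `fits_in_color` returns 0 for it.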