Re: [PATCH] regexp_positions ( string text, pattern text, flags text ) → setof int4range[]

Joel Jacobson Tue, 02 Mar 2021 06:22:07 -0800

On Tue, Mar 2, 2021, at 15:05, Isaac Morland wrote:
> Suppose the match results are:
> 
> [4,8)
> [10,10)
> [13,16)
> [20,20)
> [24,24)
> 
> Then this gets turned into:
> 
> [4,8)
> empty
> [13,16)
> empty
> empty
> 
> So you know that there are non-empty matches from 4-8 and 13-16, plus an 
> empty match between them and two empty matches at the end. Given that all 
> empty strings are identical, I think it's only in pretty rare circumstances 
> where you need to know exactly where the empty matches are; it would have to 
> be a matter of identifying empty matches immediately before or after a 
> specific pattern; in which case I suspect it would usually be just as easy to 
> match the pattern itself directly.
> 
> Does this help?


Thanks, I see what you mean now.

I agree it's probably a corner case,
but I think I would still prefer a complete solution that just returns setof 
two integer[] values,
instead of the cuter-but-only-partial solution of using the existing 
int4range[].

Even better would be if we could fix the range type so that empty ranges 
keep their bounds, so it could actually be used in this and other similar 
situations.

If so, then we could do:

SELECT r FROM regexp_positions('caaabaaabeee','(?<=b)a+','g') AS r;
     r
-----------
{"[6,9)"}
(1 row)

SELECT r FROM regexp_positions('caaabaaabeee','(?<=b)','g') AS r;
    r
---------
{empty}
{empty}
(2 rows)

SELECT lower(r[1]), upper(r[1]) FROM 
regexp_positions('caaabaaabeee','(?<=b)','g') AS r;
lower | upper
-------+-------
     5 |     5
     9 |     9
(2 rows)
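
For comparison, here is a minimal sketch of how the existing range type
behaves today, using only the stock int4range constructor and the built-in
isempty(), lower() and upper() functions (no patch applied). The VALUES list
and the column aliases are made up for illustration; the bounds are taken
from Isaac's example above.

```sql
-- Stock PostgreSQL: a range whose bounds are equal is normalized to 'empty',
-- and the original bounds can no longer be recovered from it.
SELECT r,
       isempty(r) AS is_empty,
       lower(r)   AS lo,
       upper(r)   AS hi
FROM (VALUES (int4range(4, 8)),
             (int4range(10, 10)),
             (int4range(20, 20))) AS t(r);
```

The [4,8) row keeps its bounds, while both [10,10) and [20,20) come back as
empty with NULL for lower() and upper(), and the two empty values compare
equal. That is why the last query above cannot return 5 and 9 with the range
type as it stands.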

/Joel
