On Tue, Mar 2, 2021, at 15:05, Isaac Morland wrote:
> Suppose the match results are:
>
> [4,8)
> [10,10)
> [13,16)
> [20,20)
> [24,24)
>
> Then this gets turned into:
>
> [4,8)
> empty
> [13,16)
> empty
> empty
>
> So you know that there are non-empty matches from 4-8 and 13-16, plus an
> empty match between them and two empty matches at the end. Given that all
> empty strings are identical, I think it's only in pretty rare circumstances
> where you need to know exactly where the empty matches are; it would have to
> be a matter of identifying empty matches immediately before or after a
> specific pattern; in which case I suspect it would usually be just as easy to
> match the pattern itself directly.
>
> Does this help?

Thanks, I see what you mean now.

I agree it's probably a corner case, but I think I would still prefer a
complete solution by just returning setof two integer[] values, instead of
the cuter-but-only-partial solution of using the existing int4range[].
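
To make the "only partial" part concrete, here is a minimal sketch using
nothing but the existing built-ins int4range(), isempty(), lower() and
upper(). The bounds 10 and 20 are just the empty matches from the example
quoted above; nothing here depends on the proposed regexp_positions().

```sql
-- An empty range is normalized to the single value 'empty',
-- so the position it was built from cannot be recovered.
SELECT int4range(10, 10)          AS r,         -- empty
       isempty(int4range(10, 10)) AS is_empty,  -- t
       lower(int4range(10, 10))   AS lower,     -- NULL
       upper(int4range(10, 10))   AS upper;     -- NULL

-- Two empty ranges built from different positions compare as equal.
SELECT int4range(10, 10) = int4range(20, 20) AS same;  -- t
```

So with today's range types, an empty match at 10 and one at 20 cannot be
told apart once they are stored as int4range.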

Even better would be if we could fix the range type so that it could actually
be used in this and other similar situations.

If so, then we could do:

SELECT r FROM regexp_positions('caaabaaabeee','(?<=b)a+','g') AS r;
     r
-----------
 {"[6,9)"}
(1 row)

SELECT r FROM regexp_positions('caaabaaabeee','(?<=b)','g') AS r;
    r
---------
 {empty}
 {empty}
(2 rows)

SELECT lower(r[1]), upper(r[1]) FROM
regexp_positions('caaabaaabeee','(?<=b)','g') AS r;
 lower | upper
-------+-------
     5 |     5
     9 |     9
(2 rows)

/Joel