"Joel Jacobson" <j...@compiler.org> writes:
> Tom - can you please give details on your unpleasant experiences with 
> parallel arrays?

The problems I can recall running into were basically down to not having
an easy way to iterate through parallel arrays.  There are ways to do
that in SQL, certainly, but they all constrain how you write the query,
and usually force ugly stuff like splitting it into sub-selects.

As an example, presuming that regexp_positions is defined along the
lines of

regexp_positions(str text, pat text, out starts int[], out lengths int[])
returns setof record

then to actually get the identified substrings you'd have to do something
like

select
  substring([input string] from starts[i] for lengths[i])
from
  regexp_positions([input string], [pattern]) r,
  lateral
    generate_series(1, array_length(starts, 1)) i;

I think the last time I confronted this, we didn't have multi-array
UNNEST.  Now that we do, we can get rid of the generate_series(),
but it's still not beautiful:

select
  substring([input string] from s for l)
from
  regexp_positions([input string], [pattern]) r,
  lateral
    unnest(starts, lengths) u(s,l);

Having said that, the other alternative with a 2-D array:

regexp_positions(str text, pat text) returns setof int[]

seems to still need UNNEST, though now it's not the magic multi-array
UNNEST but this slicing version:

select
  substring([input string] from u[1] for u[2])
from
  regexp_positions([input string], [pattern]) r,
  lateral
    unnest_slice(r, 1) u;

Anyway, I'd counsel trying to write out SQL implementations
of regexp_matches() and other useful things based on any
particular regexp_positions() API you might be thinking about.
Can we do anything useful without a LATERAL UNNEST thingie?
Are some of them more legible than others?

                        regards, tom lane


Reply via email to