Hi,

> The attached updated patch reduces both of those do-loop tests to about
> 60 msec on my machine.  It contains two improvements over the 1.1 patch:
Looking at this. First reading the patch to understand the details.

* The VARTAG_IS_EXPANDED(tag) trick in VARTAG_SIZE is unlikely to be
  beneficial; before, the compiler could implement the whole thing as a
  computed goto or lookup table, afterwards not.

* It'd be nice if the get_flat_size comment in expandeddatum.h could
  specify whether the header size is included. That varies enough around
  toast that it seems worthwhile.

* You were rather bothered by the potential of multiple evaluations for
  the ilist stuff. And now the AARR macros are full of them...

* I find the ARRAY_ITER_VARS/ARRAY_ITER_NEXT macros rather ugly. I
  don't buy the argument that turning them into functions will be
  slower; I'd bet the contrary on common platforms.

* Not a fan of the EH_ prefix in array_expanded.c and EOH_ elsewhere.
  Just looks ugly to me. Whatever.

* The list of hardwired safe ops in exec_check_rw_parameter is somewhat
  sad. I don't have a better idea, though.

* "Also, a C function that is modifying a read-write expanded value
  in-place should take care to leave the value in a sane state if it
  fails partway through." - that's a pretty hefty requirement, imo. I
  wonder whether it'd be possible to convert RW to RO if a value
  originates from outside an exception block. IIRC that'd be useful for
  a bunch of other error cases we currently basically shrug away
  (something around toast and aborted xacts comes to mind).

* The forced RW->RO conversion in subquery scans is a bit sad, but it
  seems like something left for later.

These are more judgement calls than anything else...

Somewhere in the thread you comment on the fact that it's a bit sad
that plpgsql is the sole beneficiary of this (unless some function
forces expansion internally). I'd be ok to leave it at that for now.
It'd be quite cool to get some feedback from the postgis folks about
the suitability of this for their cases.
I've not really looked into performance improvements around this,
choosing instead to look into somewhat reasonable cases where it'll
regress. ISTM that the worst case for the new situation is large
arrays that exist as plpgsql variables but are only ever passed on --
say, a function that accepts an array among other parameters and
passes it on to another function. As a rather extreme case of this:

CREATE OR REPLACE FUNCTION plpgsql_array_length(p_a anyarray)
RETURNS int LANGUAGE plpgsql AS
$$
BEGIN
    RETURN array_length(p_a, 1);
END;
$$;

SELECT plpgsql_array_length(b.arr)
FROM (SELECT array_agg(d) FROM generate_series(1, 10000) d) b(arr),
     generate_series(1, 100000) repeat;

with \o /dev/null redirecting the output. In an assert build it goes
from 325.511 ms to 655.733 ms; optimized, from 94.648 ms to 287.574 ms.

Now this is a fairly extreme example, and I don't think it'll get much
worse than that. But I do think there's a bunch of cases where values
exist in plpgsql that won't actually be accessed -- say, return values
from queries that are then conditionally returned and such. I'm not
sure it's possible to do anything about that; expanding only in cases
where it'd be beneficial is going to be hard.

Greetings,

Andres Freund

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers