Hi,

> The attached updated patch reduces both of those do-loop tests to about
> 60 msec on my machine.  It contains two improvements over the 1.1 patch:
Looking at this. First reading the patch to understand the details.

* The VARTAG_IS_EXPANDED(tag) trick in VARTAG_SIZE is unlikely to be
  beneficial; before, the compiler could implement the whole thing as a
  computed goto or lookup table, afterwards not.

* It'd be nice if the get_flat_size comment in expandeddatum.h could
  specify whether the header size is included. That varies enough around
  toast that it seems worthwhile.

* You were rather bothered by the potential of multiple evaluations for
  the ilist stuff. And now the AARR macros are full of them...

* I find the ARRAY_ITER_VARS/ARRAY_ITER_NEXT macros rather ugly. I
  don't buy the argument that turning them into functions will be
  slower; I'd bet the contrary on common platforms.

* Not a fan of the EH_ prefix in array_expanded.c and EOH_ elsewhere.
  Just looks ugly to me. Whatever.

* The list of hardwired safe ops in exec_check_rw_parameter is somewhat
  sad. I don't have a better idea, though.

* "Also, a C function that is modifying a read-write expanded value
  in-place should take care to leave the value in a sane state if it
  fails partway through." - that's a pretty hefty requirement, imo. I
  wonder whether it'd be possible to convert RW to RO if a value
  originates from outside an exception block. IIRC that'd be useful for
  a bunch of other error cases we currently basically shrug away
  (something around toast and aborted xacts comes to mind).

* The forced RW->RO conversion in subquery scans is a bit sad, but it
  seems like something left for later.

These are more judgement calls than anything else...

Somewhere in the thread you comment on the fact that it's a bit sad
that plpgsql is the sole beneficiary of this (unless some function
forces expansion internally). I'd be ok to leave it at that for now.
It'd be quite cool to get some feedback from the postgis folks about
the suitability of this for their cases.
I've not really looked into performance improvements around this,
choosing instead to look into somewhat reasonable cases where it'll
regress. ISTM that the worst case for the new situation is large
arrays that exist as plpgsql variables but are only ever passed on --
say, a function that accepts an array among other parameters and
passes it on to another function. As a rather extreme case of this:

CREATE OR REPLACE FUNCTION plpgsql_array_length(p_a anyarray)
RETURNS int LANGUAGE plpgsql AS
$$
BEGIN
    RETURN array_length(p_a, 1);
END;
$$;

SELECT plpgsql_array_length(b.arr)
FROM (SELECT array_agg(d) FROM generate_series(1, 10000) d) b(arr),
     generate_series(1, 100000) repeat;

with \o /dev/null redirecting the output. In an assert build it goes
from 325.511 ms to 655.733 ms; optimized, from 94.648 ms to 287.574 ms.

Now this is a fairly extreme example, and I don't think it'll get much
worse than that. But I do think there's a bunch of cases where values
exist in plpgsql that won't actually be accessed -- say, return values
from queries that are then conditionally returned and such. I'm not
sure it's possible to do anything about that; expanding only in cases
where it'd be beneficial is going to be hard.

Greetings,

Andres Freund

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers