Andrew Dunstan <andrew.duns...@2ndquadrant.com> writes: > On 09/28/2017 01:11 PM, Tom Lane wrote: >>> I wonder if we need to do any benchmarking to assure ourselves that the >>> changes to ArrayCoerceExpr don't have a significant performance impact?
>> That would likely be a good idea, though I'm not very sure what or >> how to benchmark. > Some case where we only expect the current code to produce a single > ArrayCoerceExpr, I guess. say doing text[] -> int[] ? I spent some time looking into this. I settled on int4[] -> int8[] as the appropriate case to test, because int48() is about as cheap a cast function as we have. Q1 is the base case without a cast: select count(x) from (select array[i,i,i,i,i,i,i,i,i,i] as x from generate_series(1,10000000) i) ss; Q2 is same with a cast added: select count(x::int8[]) from (select array[i,i,i,i,i,i,i,i,i,i] as x from generate_series(1,10000000) i) ss; Q3 and Q4 are the same thing, but testing 100-element instead of 10-element arrays: select count(x) from (select array[ i,i,i,i,i,i,i,i,i,i, i,i,i,i,i,i,i,i,i,i, i,i,i,i,i,i,i,i,i,i, i,i,i,i,i,i,i,i,i,i, i,i,i,i,i,i,i,i,i,i, i,i,i,i,i,i,i,i,i,i, i,i,i,i,i,i,i,i,i,i, i,i,i,i,i,i,i,i,i,i, i,i,i,i,i,i,i,i,i,i, i,i,i,i,i,i,i,i,i,i ] as x from generate_series(1,10000000) i) ss; select count(x::int8[]) from (select array[ i,i,i,i,i,i,i,i,i,i, i,i,i,i,i,i,i,i,i,i, i,i,i,i,i,i,i,i,i,i, i,i,i,i,i,i,i,i,i,i, i,i,i,i,i,i,i,i,i,i, i,i,i,i,i,i,i,i,i,i, i,i,i,i,i,i,i,i,i,i, i,i,i,i,i,i,i,i,i,i, i,i,i,i,i,i,i,i,i,i, i,i,i,i,i,i,i,i,i,i ] as x from generate_series(1,10000000) i) ss; I get these query timings in a non-cassert build: HEAD Patch Q1 5453.235 ms 5440.876 ms Q2 9340.670 ms 10191.194 ms Q3 19078.457 ms 18967.279 ms Q4 48196.338 ms 58547.531 ms (Timings are reproducible to a few percent.) So that's a bit disappointing; the per-element overhead is clearly noticeably more than before. However, poking into it with "perf" gives some grounds for optimism; Q4's hotspots with the patch are Children Self Samples Command Shared Object Symbol + 33.44% 33.35% 81314 postmaster postgres [.] ExecInterpExpr + 21.88% 21.83% 53223 postmaster postgres [.] array_map + 15.19% 15.15% 36944 postmaster postgres [.] CopyArrayEls + 14.63% 14.60% 35585 postmaster postgres [.] ArrayCastAndSet + 6.07% 6.06% 14765 postmaster postgres [.] construct_md_array + 1.80% 1.79% 4370 postmaster postgres [.] palloc0 + 0.77% 0.77% 1883 postmaster postgres [.] AllocSetAlloc + 0.75% 0.74% 1815 postmaster postgres [.] int48 + 0.52% 0.52% 1276 postmaster postgres [.] advance_aggregates Surely we could get the amount of time spent in ExecInterpExpr down. One idea is to make a dedicated evalfunc for the case where the expression is just EEOP_CASE_TESTVAL + EEOP_FUNCEXPR[_STRICT], similar to the existing fast-path routines (ExecJust*). I've not actually tried to do that, but a reasonable guess is that it'd about halve that overhead, putting this case back on a par with the HEAD code. Also, I'd imagine that Andres' planned work on JIT-compiled expressions would put this back on par with HEAD, if not better, for installations using that. Also I believe that Andres has plans to revamp the CaseTestExpr mechanism, which might shave a few cycles as well. Right now there's an extra copy of each array datum + isnull value, because array_map sticks those into one pair of variables and then the EEOP_CASE_TESTVAL opcode just moves them somewhere else. It's reasonable to hope that we could redesign that so that array_map sticks the values straight into where they're needed, and then we need only the FUNCEXPR opcode, which'd be a great candidate for an ExecJust* fast-path. Assuming that that's going to happen for v11, I'm inclined to leave the optimization problem until the dust settles around CaseTestExpr. Does anyone feel that this can't be committed before that's addressed? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers