On Mon, Apr 20, 2020 at 09:27:34PM -0400, James Coleman wrote:
Over in "execExprInterp() questions / How to improve scalar array op
expr eval?" [1] I'd mused about how we might be able to optimized
scalar array ops with OR'd semantics.

This patch implements a binary search for such expressions when the
array argument is a constant so that we can avoid needing to teach
expression execution to cache stable values or know when a param has
changed.

The speed-up for the target case can pretty impressive: in my
admittedly contrived and relatively unscientific test with a query in
the form:

select count(*) from generate_series(1,100000) n(i) where i in (<1000
random integers in the series>)

shows ~30ms for the patch versus ~640ms on master.


Nice improvement, although 1000 items is probably a bit unusual. The
threshold used in the patch (9 elements) seems a bit too low - what
results have you seen with smaller arrays?

Another idea - would a bloom filter be useful here, as a second
optimization? That is, for large arrays build s small bloom filter,
allowing us to skip even the binary search.


regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Reply via email to