Re: [HACKERS] path toward faster partition pruning

David Rowley Fri, 23 Mar 2018 00:39:07 -0700

On 21 March 2018 at 00:07, Amit Langote <langote_amit...@lab.ntt.co.jp> wrote:
> Attached is further revised version.


Hi Amit,

Thanks for sending the v38 patch.

I started reviewing this, but I just ended up starting to hack at the
patch instead. It's still got quite a bit of work to be done as I
think, unfortunately,  the cross type stuff is still pretty broken.
There's not really any sort of checking done to make sure you've found
a valid cross type hash or compare function in the code which results
in errors like:

create table hashp (a int, b numeric) partition by hash(a,b);
create table hashp1 partition of hashp for values with (modulus 4, remainder 0);
create table hashp2 partition of hashp for values with (modulus 4, remainder 1);
create table hashp3 partition of hashp for values with (modulus 4, remainder 2);
create table hashp4 partition of hashp for values with (modulus 4, remainder 3);
explain select * from hashp where a = 3::smallint and b = 1.0;
ERROR:  cache lookup failed for function 0

I'm not really sure if this should be a matter of doing an if
(!OidIsValid(new_supfuncid)) return false; I think the
context->partsupfunc must be pretty broken in cases like:

create table listp (a bigint) partition by list(a);
create table listp1 partition of listp for values in(1);
select * from listp where a <> 1::smallint and a <> 1::bigint;

The current patch simply just remembers the last comparison function
for comparing int8 to int4 and uses that one for the int8 to int2
comparison too.

Probably we need to cache the comparison function's Oid in with the
Expr in the step and use the correct one each time. I'm unsure of how
the fmgr info should be cached, but looks like it certainly cannot be
cached in the context in an array per partition key. I've so far only
thought some sort of hash table, but I'm sure there must be a much
better way to do this.

I started hacking it partition.c and ended up changing quite a few
things. I changed get_partitions_for_keys into 3 separate functions,
one for hash, list and range and tidied a few things up in that area.
There were a few bugs, for example passing the wrong value for the
size of the array into get_partitions_excluded_by_ne_datums.

I also changed how the Bitmapsets are handled in the step functions
and got rid of the Noop step type completely. I also got rid of the
passing of the srcparts into these functions. I think Roberts idea is
to process the steps in isolation and just combine the partitions
matching each step.

It would be great if we could coordinate our efforts here. I'm posting
this patch now just in case you're working or about to work on this.

In the meantime, I'll continue to drip feed cleanup patches. I'll try
to start writing some comments too, once I figure a few things out...

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

v38_drowley_delta1.patch
Description: Binary data

Re: [HACKERS] path toward faster partition pruning

Reply via email to