Re: Partitioning: issues/ideas (Was: Re: [HACKERS] On partitioning)

Amit Langote Wed, 21 Jan 2015 02:35:24 -0800

On 21-01-2015 AM 01:42, Robert Haas wrote:
> On Mon, Jan 19, 2015 at 8:48 PM, Amit Langote
> <langote_amit...@lab.ntt.co.jp> wrote:
>>>> Specifically, do we regard a partition as a pg_inherits child of its
>>>> partitioning parent?
>>>
>>> I don't think this is totally an all-or-nothing decision.  I think
>>> everyone is agreed that we need to not break things that work today --
>>> e.g. Merge Append.  What that implies for pg_inherits isn't altogether
>>> clear.
>>
>> One point is that an implementation may end up establishing the
>> parent-partition hierarchy somewhere other than (or in addition to)
>> pg_inherits. One intention would be to avoid tying the partitioning scheme
>> to certain inheritance features that use pg_inherits. For example,
>> consider call sites of find_all_inheritors(). One notable example is
>> Append/MergeAppend which would be of interest to partitioning. We would
>> want to reuse that part of the infrastructure but we might as well
>> write an equivalent, say find_all_partitions() which scans something
>> other than pg_inherits to get all partitions.
> 
> IMHO, there's little reason to avoid putting pg_inherits entries in
> for the partitions, and then this just works.  We can find other ways
> to make it work if that turns out to be better, but if we don't have
> one, there's no reason to complicate things.
>


OK, I will go forward and stick to the pg_inherits approach for now. Perhaps
the concerns I am expressing have other solutions that don't require
abandoning the pg_inherits approach altogether.

>> Agree that some concrete idea of internal representation should help
>> guide the catalog structure. If we are going to cache the partitioning
>> info in relcache (which we most definitely will), then we should try to
>> make sure to consider the scenario of having a lot of partitioned tables
>> with a lot of individual partitions. It looks like it would be similar
>> to a scenario where there are a lot of inheritance hierarchies. But the
>> availability of a partitioning feature would definitely cause these
>> numbers to grow larger. Perhaps this is an important point driving this
>> discussion.
> 
> Yeah, it would be good if the costs of supporting, say, 1000
> partitions were negligible.
> 
>> A primary question for me about partition-pruning is when do we do it?
>> Should we model it after relation_excluded_by_constraints() and hence
>> totally plan-time? But, the tone of the discussion is that we postpone
>> partition-pruning to execution-time and hence my perhaps misdirected
>> attempts to inject it into some executor machinery.
> 
> It's useful to prune partitions at plan time, because then you only
> have to do the work once.  But sometimes you don't know enough to do
> it at plan time, so it's useful to do it at execution time, too.
> Then, you can do it differently for every tuple based on the actual
> value you have.  There's no point in doing 999 unnecessary relation
> scans if we can tell which partition the actual run-time value must be
> in.  But I think execution-time pruning can be a follow-on patch.  If
> you don't restrict the scope of the first patch as much as possible,
> you're not going to have much luck getting this committed.
> 

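To make the point about avoiding 999 unnecessary scans concrete, here is a
minimal sketch of locating the single range partition that can contain a
given key value. Everything in it is an assumption for illustration: the
RangePartition struct, integer keys, and find_partition() are hypothetical
names, not existing PostgreSQL code, and the partitions are assumed to be
sorted by lower bound and non-overlapping.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical range partition: holds keys in [lower, upper). */
typedef struct RangePartition
{
    int lower;   /* inclusive lower bound */
    int upper;   /* exclusive upper bound */
} RangePartition;

/*
 * Return the index of the partition whose range contains key, or -1 if
 * no partition accepts it.  Assumes parts[] is sorted by lower bound and
 * that ranges do not overlap (gaps between ranges are allowed).
 */
int
find_partition(const RangePartition *parts, size_t nparts, int key)
{
    size_t lo = 0;
    size_t hi = nparts;

    assert(parts != NULL || nparts == 0);

    /* Binary search over the half-open index interval [lo, hi). */
    while (lo < hi)
    {
        size_t mid = lo + (hi - lo) / 2;

        if (key < parts[mid].lower)
            hi = mid;           /* key lies before this partition */
        else if (key >= parts[mid].upper)
            lo = mid + 1;       /* key lies after this partition */
        else
            return (int) mid;   /* key falls inside this partition */
    }
    return -1;
}
```

The same lookup could serve at plan time when the value is a constant and at
execution time when it is only known per tuple; only the call site would
differ.
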
OK, I will limit myself to the following things for the moment:

* Provide syntax in CREATE TABLE to declare a partition key
* Provide syntax in CREATE TABLE to declare a table as a partition of a
partitioned table, along with the values it contains
* Arrange to have the partition key and values stored in appropriate
catalogs (existing or new)
* Arrange to cache the partitioning info of partitioned tables in relcache
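
As a rough illustration of the last item, the cached info might be built once
from catalog rows and kept sorted so that later lookups stay cheap even with
many partitions. This is only a sketch under stated assumptions: the
PartitionEntry and PartitionCache structs, the integer upper bounds, the Oid
stand-in typedef, and build_partition_cache() are all hypothetical and do not
correspond to any existing catalog or relcache structure.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef unsigned int Oid;       /* stand-in for the real Oid type */

/* Hypothetical catalog row: one partition and its exclusive upper bound. */
typedef struct PartitionEntry
{
    Oid relid;
    int upper_bound;
} PartitionEntry;

/* Hypothetical cached form: partitions kept sorted by upper bound. */
typedef struct PartitionCache
{
    int             key_attnum; /* attribute number of the partition key */
    size_t          nparts;
    PartitionEntry *parts;
} PartitionCache;

static int
compare_upper_bound(const void *a, const void *b)
{
    const PartitionEntry *x = (const PartitionEntry *) a;
    const PartitionEntry *y = (const PartitionEntry *) b;

    return (x->upper_bound > y->upper_bound) - (x->upper_bound < y->upper_bound);
}

/* Copy the rows and sort them once; returns NULL on allocation failure. */
PartitionCache *
build_partition_cache(int key_attnum, const PartitionEntry *rows, size_t nrows)
{
    PartitionCache *cache;

    assert(rows != NULL && nrows > 0);

    cache = malloc(sizeof(*cache));
    if (cache == NULL)
        return NULL;

    cache->parts = malloc(nrows * sizeof(PartitionEntry));
    if (cache->parts == NULL)
    {
        free(cache);
        return NULL;
    }

    memcpy(cache->parts, rows, nrows * sizeof(PartitionEntry));
    qsort(cache->parts, nrows, sizeof(PartitionEntry), compare_upper_bound);
    cache->key_attnum = key_attnum;
    cache->nparts = nrows;
    return cache;
}

void
free_partition_cache(PartitionCache *cache)
{
    if (cache != NULL)
    {
        free(cache->parts);
        free(cache);
    }
}
```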

Thanks,
Amit



