On 12/5/14, 1:22 PM, Jim Nasby wrote:
> On 12/5/14, 3:42 AM, Amit Langote wrote:
>>> I think you are right. I think in this case we need something similar
>>> to column pg_index.indexprs, which is of type pg_node_tree (which
>>> seems to have already been suggested by Robert). So maybe we can
>>> proceed with this type and see if anyone else has a better idea.
>>
>> One point raised about/against pg_node_tree was that the values
>> represented therein would turn out to be too generalized to be used to
>> advantage during planning. But it seems we could deserialize it in
>> advance back to the internal form (like an array of a struct) as part
>> of the cached relation data. This overhead would only be incurred in
>> the case of partitioned tables. Perhaps this is what Robert suggested
>> elsewhere.
>
> In order to store a composite type in a catalog, we would need one field
> that holds the typid of the composite, and the field that stores the
> actual composite data would need to be a "dumb" varlena that stores the
> composite HeapTupleHeader.
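
Just to make that concrete, reading such a value back out could look
something like the sketch below. Completely untested; the boundtypid and
boundvalue names are made up for illustration.

#include "postgres.h"
#include "access/htup_details.h"
#include "fmgr.h"
#include "utils/typcache.h"

/*
 * Untested sketch: deform a composite stored as (boundtypid oid,
 * boundvalue varlena containing a HeapTupleHeader) back into Datums.
 */
static void
deform_stored_boundary(Oid boundtypid, Datum boundvalue,
                       Datum *values, bool *isnull)
{
    HeapTupleHeader th = DatumGetHeapTupleHeader(boundvalue);
    TupleDesc       tupdesc;
    HeapTupleData   tup;

    /* row descriptor for the stored composite's type */
    tupdesc = lookup_rowtype_tupdesc(boundtypid,
                                     HeapTupleHeaderGetTypMod(th));

    tup.t_len = HeapTupleHeaderGetDatumLength(th);
    tup.t_data = th;

    heap_deform_tuple(&tup, tupdesc, values, isnull);

    ReleaseTupleDesc(tupdesc);
}
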
On further thought: if we disallow NULL as a partition boundary, we don't
need a separate rowtype; we could just use the one associated with the
relation itself. Presumably that would make comparing tuples to the
relation list a lot easier.
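
For the comparison itself I'm picturing something along these lines
(untested sketch; assumes the key columns are already deformed into Datum
arrays, NULLs are assumed away per the above, and all names are
illustrative):

#include "postgres.h"
#include "fmgr.h"
#include "utils/typcache.h"

/*
 * Untested sketch: compare a row's partition-key column values against
 * one boundary tuple's values, using each type's default btree
 * comparison function.  Returns <0, 0 or >0, like a btree comparator.
 * A real version would verify that a comparison function exists.
 */
static int
compare_row_to_boundary(int nkeys, Oid *keytypes, Oid *keycollations,
                        Datum *rowvals, Datum *boundvals)
{
    int i;

    for (i = 0; i < nkeys; i++)
    {
        TypeCacheEntry *tce;
        int32           cmp;

        /* default btree comparison function for this column's type */
        tce = lookup_type_cache(keytypes[i], TYPECACHE_CMP_PROC_FINFO);
        cmp = DatumGetInt32(FunctionCall2Coll(&tce->cmp_proc_finfo,
                                              keycollations[i],
                                              rowvals[i], boundvals[i]));
        if (cmp != 0)
            return cmp;
    }

    return 0;                   /* equal on all key columns */
}
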
I was hung up on how that would work in the case of ALTER TABLE, but we'd have
the same problem with using pg_node_tree: if you alter a table in such a way
that *might* affect your partitioning, you have to do some kind of revalidation
anyway.
The other option would be to use some custom rowtype to store boundary values
and have a method that can form a boundary tuple from a real one. Either way, I
suspect this is better than frequently evaluating pg_node_trees.
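
By "form a boundary tuple from a real one" I mean roughly the following
(again just an untested sketch; the key-column bookkeeping and the
boundary rowtype's tuple descriptor are hand-waved):

#include "postgres.h"
#include "access/htup_details.h"

/*
 * Untested sketch: build a boundary tuple containing only the
 * partition-key columns of a full table row.  boundarydesc describes
 * the custom boundary rowtype; keycols are attribute numbers in the
 * table's own rowtype.
 */
static HeapTuple
form_boundary_tuple(HeapTuple row, TupleDesc rowdesc,
                    TupleDesc boundarydesc,
                    AttrNumber *keycols, int nkeys)
{
    Datum  *values = (Datum *) palloc(nkeys * sizeof(Datum));
    bool   *isnull = (bool *) palloc(nkeys * sizeof(bool));
    int     i;

    for (i = 0; i < nkeys; i++)
        values[i] = heap_getattr(row, keycols[i], rowdesc, &isnull[i]);

    return heap_form_tuple(boundarydesc, values, isnull);
}
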
There may be one other option. If range partitions are defined in terms of
an expression whose value is different for every partition (e.g.
(substr(product_key, 1, 4), date_trunc('month', sales_date))), then we
could use a hash of that expression's value to identify a partition. In
other words, range partitioning becomes a special case of hash
partitioning. I do think we need a programmatic means to identify the
range of an individual partition, and hash won't solve that, but the
performance of that case isn't critical, so we could use pretty much
whatever we wanted there.
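
To illustrate the hashing step, something along these lines (untested
sketch; it leans on each type's default hash support function, and the
mixing is the same rotate-and-xor the executor's hashing code uses):

#include "postgres.h"
#include "fmgr.h"
#include "utils/typcache.h"

/*
 * Untested sketch: compute a partition-identifying hash from the already
 * evaluated partition-expression values of one row.
 */
static uint32
partition_hash_for_row(int nexprs, Oid *exprtypes,
                       Datum *exprvals, bool *exprnulls)
{
    uint32 hash = 0;
    int    i;

    for (i = 0; i < nexprs; i++)
    {
        uint32 h;

        if (exprnulls[i])
            h = 0;              /* NULLs would need a policy of their own */
        else
        {
            TypeCacheEntry *tce;

            tce = lookup_type_cache(exprtypes[i],
                                    TYPECACHE_HASH_PROC_FINFO);
            h = DatumGetUInt32(FunctionCall1(&tce->hash_proc_finfo,
                                             exprvals[i]));
        }

        hash = (hash << 1) | (hash >> 31);      /* rotate left one bit */
        hash ^= h;
    }

    return hash;
}
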
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com