Re: Creating partitions automatically at least on HASH?

Robert Eckhardt Mon, 15 Jul 2019 12:50:40 -0700

On Mon, Jul 15, 2019 at 10:54 AM Robert Haas <robertmh...@gmail.com> wrote:
>
> On Mon, Jul 15, 2019 at 1:29 AM Fabien COELHO <coe...@cri.ensmp.fr> wrote:
> > Hello pgdevs,
> >
> > sorry if this has been already discussed, but G did not yield anything
> > convincing about that.
> >
> > While looking at HASH partitioning and creating a few ones, it occured to
> > me that while RANGE and LIST partitions cannot be guessed easily, it would
> > be easy to derive HASH partitioned table for a fixed MODULUS, e.g. with
> >
> >    CREATE TABLE foo(...) PARTITION BY HASH AUTOMATIC (MODULUS 10);
> >    -- or some other syntax
> >
> > Postgres could derive statically the 10 subtables, eg named foo_$0$ to
> > foo_$1$.
> >
> > That would not be a replacement for the feature where one may do something
> > funny and doubtful like (MODULUS 2 REMAINDER 0, MODULUS 4 REMAINDER 1,
> > MODULUS 4 REMAINDER 3).
> >
> > The same declarative approach could eventually be considered for RANGE
> > with a fixed partition duration and starting and ending points.
> >
> > This would be a relief on the longer path of dynamically creating
> > partitions, but with lower costs than a dynamic approach.
>
> Yeah, I think something like this would be reasonable, but I think
> that the best syntax is not really clear.  We might want to look at
> how other systems handle this.


Greenplum has a syntax that covers some cases but not the hash case.

For range based partitions we have:

CREATE TABLE sales (id int, date date, amt decimal(10,2))
DISTRIBUTED BY (id)
PARTITION BY RANGE (date)
( START (date '2016-01-01') INCLUSIVE
   END (date '2017-01-01') EXCLUSIVE
   EVERY (INTERVAL '1 day') );

This is equivelant to the below so you can also declare and name each
partition individually. For example:

CREATE TABLE sales (id int, date date, amt decimal(10,2))
DISTRIBUTED BY (id)
PARTITION BY RANGE (date)
( PARTITION Jan16 START (date '2016-01-01') INCLUSIVE ,
  PARTITION Feb16 START (date '2016-02-01') INCLUSIVE ,
  PARTITION Mar16 START (date '2016-03-01') INCLUSIVE ,
  PARTITION Apr16 START (date '2016-04-01') INCLUSIVE ,
  PARTITION May16 START (date '2016-05-01') INCLUSIVE ,
  PARTITION Jun16 START (date '2016-06-01') INCLUSIVE ,
  PARTITION Jul16 START (date '2016-07-01') INCLUSIVE ,
  PARTITION Aug16 START (date '2016-08-01') INCLUSIVE ,
  PARTITION Sep16 START (date '2016-09-01') INCLUSIVE ,
  PARTITION Oct16 START (date '2016-10-01') INCLUSIVE ,
  PARTITION Nov16 START (date '2016-11-01') INCLUSIVE ,
  PARTITION Dec16 START (date '2016-12-01') INCLUSIVE
                  END (date '2017-01-01') EXCLUSIVE );

You can do similar things with numeric

CREATE TABLE rank (id int, rank int, year int, gender
char(1), count int)
DISTRIBUTED BY (id)
PARTITION BY RANGE (year)
( START (2006) END (2016) EVERY (1),
  DEFAULT PARTITION extra );

ENUM

CREATE TABLE rank (id int, rank int, year int, gender
char(1), count int )
DISTRIBUTED BY (id)
PARTITION BY LIST (gender)
( PARTITION girls VALUES ('F'),
  PARTITION boys VALUES ('M'),
  DEFAULT PARTITION other );

Also it supports multilevel partitioning using a PARTITION TEMPLATE
and SUBPARTITION TEMPLATE. The partitioning template ensures that the
structure at every level is the same.

CREATE TABLE p3_sales (id int, year int, month int, day int,
region text)
DISTRIBUTED BY (id)
PARTITION BY RANGE (year)
    SUBPARTITION BY RANGE (month)
       SUBPARTITION TEMPLATE (
        START (1) END (13) EVERY (1),
        DEFAULT SUBPARTITION other_months )
           SUBPARTITION BY LIST (region)
             SUBPARTITION TEMPLATE (
               SUBPARTITION usa VALUES ('usa'),
               SUBPARTITION europe VALUES ('europe'),
               SUBPARTITION asia VALUES ('asia'),
               DEFAULT SUBPARTITION other_regions )
( START (2002) END (2012) EVERY (1),
  DEFAULT PARTITION outlying_years );

-- Rob

Re: Creating partitions automatically at least on HASH?

Reply via email to