On Mon, 26 Aug 2019 at 19:46, Fabien COELHO <coe...@cri.ensmp.fr> wrote:
> > Hello Rafia, > > >> CREATE TABLE Stuff (...) > >> PARTITION BY [HASH | RANGE | LIST] (…) > >> DO NONE -- this is the default > >> DO [IMMEDIATE|DEFERRED] USING (…) > >> > >> Where the USING part would be generic keword value pairs, eg: > >> > >> For HASH: (MODULUS 8) and/or (NPARTS 10) > >> > >> For RANGE: (START '1970-01-01', STOP '2020-01-01', INCREMENT '1 year') > >> and/or (START 1970, STOP 2020, NPARTS 50) > >> > >> And possibly for LIST: (IN (…), IN (…), …), or possibly some other > >> keyword. > >> > >> The "DEFERRED" could be used as an open syntax for dynamic partitioning, > >> if later someone would feel like doing it. > >> > > ISTM that "USING" is better than "WITH" because WITH is already used > >> specifically for HASH and other optional stuff in CREATE TABLE. > >> > >> The text constant would be interpreted depending on the partitioning > >> expression/column type. > >> > >> Any opinion about the overall approach? > > > I happen to start a similar discussion [1] being unaware of this one > > and there Ashutosh Sharma talked about interval partitioning in Oracle. > > Looking > > closely it looks like we can have this automatic partitioning more > > convenient by having something similar. Basically, it is creating > > partitions on demand or lazy partitioning. > > Yep, the "what" of dynamic partitioning is more or less straightforward, > along the line you are describing. > > For me there are really two questions: > > - having a extendable syntax, hence the mail I sent, which would cover > both automatic static & dynamic partitioning and their parameters, > given that we already have manual static, automatic static should > be pretty easy. > > - implementing the stuff, with limited performance impact if possible > for the dynamic case, which is non trivial. > > > To explain a bit more, let's take range partition for example, first > > parent table is created and it's interval and start and end values are > > specified and it creates only the parent table just like it works today. > > > Now, if there comes a insertion that does not belong to the existing (or > > any, in the case of first insertion) partition(s), then the > > corresponding partition is created, > > Yep. Now, you also have to deal with race conditions issues, i.e. two > parallel session inserting tuples that must create the same partition, and > probably you would like to avoid a deadlock. > > Hmmm, that shouldn't be very hard. Postgres handles many such things and I think mostly by a mutex guarded shared memory structure. E.g. we can have a shared memory structure associated with the parent table holding the information of all the available partitions, and keep this structure guarded by mutex. Anytime a new partition has to be created the relevant information is first entered in this structure before actually creating it. > I think it is extensible to other partitioning schemes as well. Also it > > is likely to have a positive impact on the queries, because there will > > be required partitions only and would not require to educate > > planner/executor about many empty partitions. > > Yep, but it creates other problems to solve… > > Isn't it always the case. :) -- Regards, Rafia Sabih