Re: [PERFORM] cluster index on a table

Scara Maccai Fri, 17 Jul 2009 00:23:24 -0700

> As Kevin said, there's no guarantee that tuples will be
> read back
> in the order you inserted them.


Ok, didn't know that

> A better option you might consider is to use a separate
> table for the
> re-ordered tuples. 
> You could even do this using partitions

Problem is I'm already using partions: I'm partitioning on a monthly basis. I 
want to avoid partitioning on a daily basis: I have 200 tables partitioned by 
month, 2 years of data. Partition them by day would mean 700*200 tables: what 
kind of performance impacts would it mean?


Does this other option make sense:

partition only "last month" by day; older months by month.
Day by day the tables of the current month gets clustered (say at 1.00AM next 
day).
Then, every 1st of the month, create a new table as 

- create table mytable as select * from <parent_table> where time <in last 
month> (this gets all the data of last month ordered in the "almost" correct 
order, because all the single tables were clustered)
- alter mytable  add constraint "time in last month" 
- alter mytable  inherit  <parent_table> 

and then drop last month's tables.

Is this even doable? I mean: between

- alter mytable  inherit  <parent_table> 
- drop last month's tables.

more than one table with the same constraint would inherit from the same table: 
that's fine unless someone can see the "change" before the "drop tables" part, 
but I guess this shouldn't be a problem if I use the serializable transaction 
level.

This way I could cluster the tables (not perfectly, since I would cluster data 
day by day, but it's enough) and still have few tables, say (31 for current 
month + 23 for the past 23 months) * 200.












-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] cluster index on a table

Reply via email to