From: pgsql-general-ow...@postgresql.org 
[mailto:pgsql-general-ow...@postgresql.org] On Behalf Of Nicolas Grilly
Sent: Wednesday, August 31, 2016 6:32 PM
To: Kenneth Marshall <k...@rice.edu>
Cc: Vick Khera <vi...@khera.org>; pgsql-general <pgsql-general@postgresql.org>
Subject: Re: [GENERAL] Clustered index to preserve data locality in a 
multitenant application?

On Tue, Aug 30, 2016 at 8:17 PM, Kenneth Marshall <k...@rice.edu> wrote:
We have been using the extension pg_repack to keep a table groomed into
cluster order. With an appropriate FILLFACTOR to keep updates on the same
page, it works well. The issue is that it needs space to rebuild the new
index/table. If you have that, it works well.

In DB2, it seems possible to define a "clustering index" that determines how 
rows are physically ordered in the "table space" (the heap).

The documentation says: "When a table has a clustering index, an INSERT 
statement causes DB2 to insert the records as nearly as possible in the order 
of their index values."

It looks like a kind of "continuous CLUSTER/pg_repack". Is there something 
similar available or planned for PostgreSQL?


I don’t know of any plans to implement clustered indexes in PostgreSQL.

I’m not sure if this was mentioned, but MS SQL Server has clustered indexes, 
where the table rows themselves are stored at the leaf level of the index.
Oracle has a similar feature: the IOT (Index-Organized Table).

It seems to me (maybe I’m wrong) that a clustered index, with the heap row 
stored in the index leaf, would be much harder to implement in PostgreSQL 
because of how MVCC is implemented: multiple row versions are stored in the 
table itself. (Oracle, by contrast, keeps the table “clean” and stores old 
row versions in the UNDO tablespace/segment.)
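
To illustrate the difference, here is a rough toy model in Python. It is only 
a sketch under simplifying assumptions, not how either engine is actually 
implemented: the class names HeapMVCCTable and UndoMVCCTable are made up for 
this example, pages and cleanup (VACUUM) are ignored, and a new row version is 
always appended at the end of the table, whereas in reality it may land on the 
same page if there is free space.

```python
from dataclasses import dataclass, field


@dataclass
class HeapMVCCTable:
    """Toy model (hypothetical): every UPDATE adds a new row version to the table itself."""
    versions: list = field(default_factory=list)  # each entry: [key, value, is_live]

    def insert(self, key, value):
        self.versions.append([key, value, True])

    def update(self, key, value):
        for v in self.versions:
            if v[0] == key and v[2]:
                v[2] = False  # the old version stays in the table until cleanup
        self.versions.append([key, value, True])

    def physical_keys(self):
        # Keys in the order the row versions physically sit in the table.
        return [v[0] for v in self.versions]


@dataclass
class UndoMVCCTable:
    """Toy model (hypothetical): UPDATE overwrites in place, old version goes to an undo log."""
    rows: dict = field(default_factory=dict)  # stands in for a key-ordered (index-organized) table
    undo: list = field(default_factory=list)  # old versions kept outside the table

    def insert(self, key, value):
        self.rows[key] = value

    def update(self, key, value):
        self.undo.append((key, self.rows[key]))  # save the old version elsewhere
        self.rows[key] = value                   # the table keeps one version per key

    def physical_keys(self):
        # Assumption: the table is stored in key order, one entry per key.
        return sorted(self.rows)


heap, undo = HeapMVCCTable(), UndoMVCCTable()
for k in (1, 2, 3):
    heap.insert(k, "a")
    undo.insert(k, "a")

heap.update(1, "b")
undo.update(1, "b")

heap_order = heap.physical_keys()  # [1, 2, 3, 1]: two versions of key 1, out of key order
undo_order = undo.physical_keys()  # [1, 2, 3]: still one row per key, in key order
```

In the first model the table ends up holding two versions of key 1 that are 
not adjacent, which is the part that would have to be kept ordered inside an 
index leaf. In the second, the table stays in key order and the old version 
lives only in the undo log.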

Regards,
Igor Neyman

