Re: [PERFORM] cluster index on a table

2009-07-17 Thread Kevin Grittner
Scara Maccai wrote: >> Be sure to include an ORDER BY clause. > Isn't it going to be much slower? It might be; you'd have to test to know for sure. > The important thing is that most of the records are stored "close" > enough one to the other in the right order. Then, yeah, it might not

Re: [PERFORM] cluster index on a table

2009-07-17 Thread Scara Maccai
> Be sure to include an ORDER BY clause.  For > example, if someone else starts a query which the planner > determines > is best handled with a table scan, and that is still > running when you > issue your INSERT/SELECT, your query will join the current > scan at > it's point of progress, and "wra

Re: [PERFORM] cluster index on a table

2009-07-17 Thread Kevin Grittner
Scara Maccai wrote: > - create table mytable as select * from where time > (this gets all the data of last month ordered in the > "almost" correct order, because all the single tables were > clustered) Be sure to include an ORDER BY clause. Without that, there is no guarantee that even two

Re: [PERFORM] cluster index on a table

2009-07-17 Thread ph...@apra.asso.fr
Hi all, >On Wed, Jul 15, 2009 at 10:36 PM, Scott Marlowe >wrote: I'd love to see it. > > +1 for index organized tables > >--Scott +1 also for me... I am currently working for a large customer who is migrating his main application towards PostgreSQL, this application currently using DB2 an

Re: [PERFORM] cluster index on a table

2009-07-17 Thread Scara Maccai
> As Kevin said, there's no guarantee that tuples will be > read back > in the order you inserted them. Ok, didn't know that > A better option you might consider is to use a separate > table for the > re-ordered tuples. > You could even do this using partitions Problem is I'm already using par

Re: [PERFORM] cluster index on a table

2009-07-16 Thread Scara Maccai
> You might be better off > with a CLUSTER on > some index.  I can't: table is too big, can't lock it for minutes; that's why I wanted to cluster it "one day at a time". -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http

Re: [PERFORM] cluster index on a table

2009-07-16 Thread Justin Pitts
ISTR that is the approach that MSSQL follows. Storing the full tuple in an index and not even having a data only page would also be an interesting approach to this (and perhaps simpler than a separate index file and data file if trying to keep the data in the order of the index). -- S

Re: [PERFORM] cluster index on a table

2009-07-16 Thread Justin Pitts
According to the books online http://msdn.microsoft.com/en-us/library/ms177443.aspx : "In a clustered index, the leaf nodes contain the data pages of the underlying table." Which agrees with your assertion. From a performance perspective, it DOES work very well. Which is why I keep h

Re: [PERFORM] cluster index on a table

2009-07-16 Thread Justin Pitts
Is there any interest in adding that (continual/automatic cluster order maintenance) to a future release? On Wed, Jul 15, 2009 at 8:33 PM, Scott Carey wrote: > If you have a lot of insert/update/delete activity on a table fillfactor can > help. > > I don’t believe that postgres will try and mainta

Re: [PERFORM] cluster index on a table

2009-07-16 Thread Scott Carey
On 7/16/09 5:27 PM, "Greg Stark" wrote: > On Fri, Jul 17, 2009 at 1:02 AM, Scott Carey wrote: >> Indexes would point to a heap page for normal tables and clustered index >> pages for clustered tables.  When new versions of data come in, it may point >> to new clustered index pages, just like the

Re: [PERFORM] cluster index on a table

2009-07-16 Thread Greg Stark
On Fri, Jul 17, 2009 at 1:02 AM, Scott Carey wrote: > Indexes would point to a heap page for normal tables and clustered index > pages for clustered tables.  When new versions of data come in, it may point > to new clustered index pages, just like they currently get modified to point > to new heap

Re: [PERFORM] cluster index on a table

2009-07-16 Thread Scott Carey
On 7/16/09 1:49 PM, "Greg Stark" wrote: > On Thu, Jul 16, 2009 at 9:06 PM, Scott Carey wrote: >> Keep the old page around or a copy of it that old transactions reference? >> Just more Copy on Write. >> How is that different from a nested loop on an index scan/seek currently? >> Doesn't an old tr

Re: [PERFORM] cluster index on a table

2009-07-16 Thread Ibrahim Harrani
Hi Scott, Which fillfactor is better 70, 80 or another value? Thanks. Thanks On Thu, Jul 16, 2009 at 3:33 AM, Scott Carey wrote: > If you have a lot of insert/update/delete activity on a table fillfactor can > help. > > I don’t believe that postgres will try and maintain the table in the cluste

Re: [PERFORM] cluster index on a table

2009-07-16 Thread Greg Stark
> Scara Maccai wrote: > >> What am I doing wrong? I didn't exactly follow the full sequence but it sounded like what's happening is that Postgres is noticing all these empty pages from earlier deletes and reusing that space. That's what it's designed to do. As Kevin said, there's no guarantee tha

Re: [PERFORM] cluster index on a table

2009-07-16 Thread Greg Stark
On Thu, Jul 16, 2009 at 9:06 PM, Scott Carey wrote: > Keep the old page around or a copy of it that old transactions reference? > Just more Copy on Write. > How is that different from a nested loop on an index scan/seek currently? > Doesn't an old transaction have to reference an old heap page thro

Re: [PERFORM] cluster index on a table

2009-07-16 Thread Kevin Grittner
Scara Maccai wrote: > What am I doing wrong? > [function which uses INSERT/UPDATE/DELETE statements to try to force > order of rows in heap] You seem to be assuming that the rows will be in the table in the sequence of your inserts. You might be better off with a CLUSTER on some index. (The

Re: [PERFORM] cluster index on a table

2009-07-16 Thread Scott Carey
On 7/16/09 12:46 PM, "Greg Stark" wrote: > On Thu, Jul 16, 2009 at 8:18 PM, Scott Carey wrote: >> " Each index row in the nonclustered index contains the nonclustered key >> value and a row locator. This locator points to the data row in the >> clustered index or heap having the key value." >>

Re: [PERFORM] cluster index on a table

2009-07-16 Thread Greg Stark
On Thu, Jul 16, 2009 at 8:18 PM, Scott Carey wrote: > " Each index row in the nonclustered index contains the nonclustered key > value and a row locator. This locator points to the data row in the > clustered index or heap having the key value." > > That sort of model should work with MVCC and even

Re: [PERFORM] cluster index on a table

2009-07-16 Thread Scott Carey
Yes, it seems as though the whole tuple is entirely in the index if it is clustered. From : http://msdn.microsoft.com/en-us/library/ms177484.aspx " Each index row in the nonclustered index contains the nonclustered key value and a row locator. This locator points to the data row in the clustered

Re: [PERFORM] cluster index on a table

2009-07-16 Thread Kevin Grittner
Scott Carey wrote: > I could be wrong, but I think MSSQL only keeps the data specified in > the index in the index, and the remaining columns in the data. Unless it has changed recently, an MS SQL Server clustered index is the same as the Sybase implementation: all data for the tuple is stored

Re: [PERFORM] cluster index on a table

2009-07-16 Thread Scott Carey
I could be wrong, but I think MSSQL only keeps the data specified in the index in the index, and the remaining columns in the data. That is, if there is a clustered index on a table on three columns out of five, those three columns in the index are stored in the index, while the other two are in a

Re: [PERFORM] cluster index on a table

2009-07-16 Thread Scott Carey
Either true Index Organized Tables or Clustered Indexes would be very useful for a variety of table/query types. The latter seems more difficult with Postgres' MVCC model since it requires data to be stored in the index that is authoritative. Storing the full tuple in an index and not even having

Re: [PERFORM] cluster index on a table

2009-07-16 Thread Scara Maccai
>   +1 for index organized tables  +1 I have a table: CREATE TABLE mytab ( "time" timestamp without time zone NOT NULL, ne_id integer NOT NULL, values integer, CONSTRAINT mytab_pk PRIMARY KEY (ne_id, "time"), CONSTRAINT mytab_ne_id_key UNIQUE ("time", ne_id) } The table is written eve

Re: [PERFORM] cluster index on a table

2009-07-16 Thread Scott Mead
On Wed, Jul 15, 2009 at 10:36 PM, Scott Marlowe wrote: > I'd love to see it. +1 for index organized tables --Scott

Re: [PERFORM] cluster index on a table

2009-07-15 Thread Scott Marlowe
I'd love to see it. On Wed, Jul 15, 2009 at 8:17 PM, Justin Pitts wrote: > Is there any interest in adding that (continual/automatic cluster > order maintenance) to a future release? > > On Wed, Jul 15, 2009 at 8:33 PM, Scott Carey wrote: >> If you have a lot of insert/update/delete activity on a

Re: [PERFORM] cluster index on a table

2009-07-15 Thread Scott Carey
If you have a lot of insert/update/delete activity on a table fillfactor can help. I don't believe that postgres will try and maintain the table in the cluster order however. On 7/15/09 8:04 AM, "Ibrahim Harrani" wrote: Hi, thanks for your suggestion. Is there any benefit of setting fillfac

Re: [PERFORM] cluster index on a table

2009-07-15 Thread Ibrahim Harrani
Hi, thanks for your suggestion. Is there any benefit of setting fillfactor to 70 or 80 on this table? On Wed, Jun 24, 2009 at 8:42 PM, Scott Marlowe wrote: > As another poster pointed out, you cluster on ONE index and one index > only.  However, you can cluster on a multi-column index. > -- S

Re: [PERFORM] cluster index on a table

2009-06-24 Thread Scott Marlowe
As another poster pointed out, you cluster on ONE index and one index only. However, you can cluster on a multi-column index. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] cluster index on a table

2009-06-24 Thread Kenneth Marshall
Clustering reorganizes the layout of a table according to the ordering of a SINGLE index. This will place items that are adjacent in the index adjacent in the heap. So you need to cluster on the index that will help the locality of reference for the queries which will benefit you the most. Executio

[PERFORM] cluster index on a table

2009-06-24 Thread Ibrahim Harrani
Hello, I have a table like following. To increase the performance of this table, I would like to create CLUSTER. First, Which index should I use on this table for CLUSTER? Secondly, Can I create multiple CLUSTER on the same table? I will appreciate, if you can suggest other options to increase