Re: [PERFORM] Equivalent praxis to CLUSTERED INDEX?

2004-08-27 Thread Gaetano Mendola
Greg Stark wrote:
The discussions before talked about a mechanism to try to place new 
> tuples as close as possible to the proper index position.
Means this that an index shall have a "fill factor" property, similar to
Informix one ?
From the manual:
The FILLFACTOR option takes effect only when you build an index on a table
that contains more than 5,000 rows and uses more than 100 table pages, when
you create an index on a fragmented table, or when you create a fragmented
index on a nonfragmented table.
Use the FILLFACTOR option to provide for expansion of an index at a later
date or to create compacted indexes.
When the index is created, the database server initially fills only that
percentage of the nodes specified with the FILLFACTOR value.
# Providing a Low Percentage Value
If you provide a low percentage value, such as 50, you allow room for growth
in your index. The nodes of the index initially fill to a certain percentage and
contain space for inserts. The amount of available space depends on the
number of keys in each page as well as the percentage value.
For example, with a 50-percent FILLFACTOR value, the page would be half
full and could accommodate doubling in size. A low percentage value can
result in faster inserts and can be used for indexes that you expect to grow.
# Providing a High Percentage Value
If you provide a high percentage value, such as 99, your indexes are
compacted, and any new index inserts result in splitting nodes. The
maximum density is achieved with 100 percent. With a 100-percent
FILLFACTOR value, the index has no room available for growth; any
additions to the index result in splitting the nodes.
A 99-percent FILLFACTOR value allows room for at least one insertion per
node. A high percentage value can result in faster selects and can be used for
indexes that you do not expect to grow or for mostly read-only indexes.

Regards
Gaetano Mendola


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] Query performance issue with 8.0.0beta1

2004-08-27 Thread Russell Smith
7.4.2
>  Aggregate  (cost=46817.89..46817.89 rows=1 width=0) (actual time=401.216..401.217 
> rows=1 loops=1)
>->  Index Scan using snsdata_codpar on "SNS_DATA"  (cost=0.00..46817.22 rows=268 
> width=0) (actual time=165.948..400.258 rows=744 loops=1)
>  Index Cond: (("Cod_Par")::text = '17476'::text)
>  Filter: (("Data_Arrivo_Campione" >= '2004-01-01 00:00:00'::timestamp 
> without time zone) AND ("Data_Arrivo_Campione" <= '2004-01-31 23:59:59'::timestamp 
> without time zone))
>  Total runtime: 401.302 ms
> 
Row counts are out by a factor of 3, on the low side. so the planner will guess index 
is better, which it is.

> ***while on 8.0.0***
>  Aggregate  (cost=93932.91..93932.91 rows=1 width=0) (actual 
> time=14916.371..14916.371 rows=1 loops=1)
>->  Seq Scan on "SNS_DATA"  (cost=0.00..93930.14 rows=1108 width=0) (actual 
> time=6297.152..14915.330 rows=744 loops=1)
>  Filter: (("Data_Arrivo_Campione" >= '2004-01-01 00:00:00'::timestamp 
> without time zone) AND ("Data_Arrivo_Campione" <= '2004-01-31 23:59:59'::timestamp 
> without time zone) AND (("Cod_Par")::text = '17476'::text))
>  Total runtime: 14916.935 ms
Planner guesses that 1108 row should be returned, which is out by less, but on the 
high side.
Big question is given there are 2M rows, why does returning 1108 rows, less than 1% 
result in a sequence scan.
Usually the selectivity on the index is bad, try increasing the stats target on the 
column.

I know 8.0 has new stats anaylsis code, which could be effecting how it choses the 
plan. But it would still
require a good amount of stats to get it to guess correctly.

Increase stats and see if the times improve.

> 
> And I if disable the seqscan
> SET enable_seqscan = false;
> 
> I get the following:
> 
> Aggregate  (cost=158603.19..158603.19 rows=1 width=0) (actual 
> time=4605.862..4605.863 rows=1 loops=1)
>->  Index Scan using snsdata_codpar on "SNS_DATA"  (cost=0.00..158600.41 
> rows=1108 width=0) (actual time=2534.422..4604.865 rows=744 loops=1)
>  Index Cond: (("Cod_Par")::text = '17476'::text)
>  Filter: (("Data_Arrivo_Campione" >= '2004-01-01 00:00:00'::timestamp 
> without time zone) AND ("Data_Arrivo_Campione" <= '2004-01-31 23:59:59'::timestamp 
> without time zone))
>  Total runtime: 4605.965 ms
> 
> The total runtime is bigger (x10 !!) than the old one.
Did you run this multiple times, or is this the first time.  If it had to get the data 
off disk it will be slower.
Are you sure that it's coming from disk in this and the 7.4 case? or both from memory.
If 7.4 is from buffer_cache, or kernel_cache, and 8.0 is from disk you are likely to 
get A LOT slower.

> 
> The memory runtime parameters are 
> shared_buffer = 2048
> work_mem = sort_mem = 2048
> 
[ snip ]

> The table has 2M of records
> Can it be a datatype conversion issue?
That should not be an issue in 8.0, at least for the simple type conversions.  like 
int8 to int4.
I'm not 100% sure which ones were added, and which were not, but the query appears to 
cast everything correctly anyway.

> Can it be depend on the the type of restore (with COPY commands)?
Shouldn't and VACUUM FULL ANALYZE will make the table as small as possible.  The row 
order may be different
on disk, but the planner won't know that, and it's a bad plan causing the problem.

> I have no idea.
> 
> Thanks in advance!
> Reds
> 
Regards

Russell Smith.

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] Equivalent praxis to CLUSTERED INDEX?

2004-08-27 Thread Bruce Momjian

I had FILLFACTOR in the TODO list until just a few months ago, but
because no one had discussed it in 3-4 years, I removed the item.  I
have added mention now in the auto-cluster section because that actually
seems like the only good reason for a non-100% fillfactor.  I don't
think our ordinary btrees have enough of a penalty for splits to make a
non-full fillfactor worthwhile, but having a non-full fillfactor for
autocluster controls how often items have to be shifted around.

---

Gaetano Mendola wrote:
> Greg Stark wrote:
> 
> > The discussions before talked about a mechanism to try to place new 
>  > tuples as close as possible to the proper index position.
> 
> Means this that an index shall have a "fill factor" property, similar to
> Informix one ?
> 
>  From the manual:
> 
> 
> The FILLFACTOR option takes effect only when you build an index on a table
> that contains more than 5,000 rows and uses more than 100 table pages, when
> you create an index on a fragmented table, or when you create a fragmented
> index on a nonfragmented table.
> Use the FILLFACTOR option to provide for expansion of an index at a later
> date or to create compacted indexes.
> When the index is created, the database server initially fills only that
> percentage of the nodes specified with the FILLFACTOR value.
> 
> # Providing a Low Percentage Value
> If you provide a low percentage value, such as 50, you allow room for growth
> in your index. The nodes of the index initially fill to a certain percentage and
> contain space for inserts. The amount of available space depends on the
> number of keys in each page as well as the percentage value.
> For example, with a 50-percent FILLFACTOR value, the page would be half
> full and could accommodate doubling in size. A low percentage value can
> result in faster inserts and can be used for indexes that you expect to grow.
> 
> 
> # Providing a High Percentage Value
> If you provide a high percentage value, such as 99, your indexes are
> compacted, and any new index inserts result in splitting nodes. The
> maximum density is achieved with 100 percent. With a 100-percent
> FILLFACTOR value, the index has no room available for growth; any
> additions to the index result in splitting the nodes.
> A 99-percent FILLFACTOR value allows room for at least one insertion per
> node. A high percentage value can result in faster selects and can be used for
> indexes that you do not expect to grow or for mostly read-only indexes.
> 
> 
> 
> 
> Regards
> Gaetano Mendola
> 
> 
> 
> 
> 
> ---(end of broadcast)---
> TIP 7: don't forget to increase your free space map settings
> 

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PERFORM] Equivalent praxis to CLUSTERED INDEX?

2004-08-27 Thread Josh Berkus
Bruce,

What happened to the B-Tree Table patch discussed on Hackers ad nauseum last 
winter?

-- 
Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] Equivalent praxis to CLUSTERED INDEX?

2004-08-27 Thread Bruce Momjian
Josh Berkus wrote:
> Bruce,
> 
> What happened to the B-Tree Table patch discussed on Hackers ad nauseum last 
> winter?

I don't remember that. The only issue I remember is sorting btree index
by heap tid on creation.  We eventually got that into CVS for 8.0.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] Equivalent praxis to CLUSTERED INDEX?

2004-08-27 Thread Adi Alurkar
Greetings,
I am not sure if this applies only to clustering but for storage in 
general,

IIRC  Oracle has 2 parameters that can be set at table creation :
from Oracle docs
PCTFREE integer :
Specify the percentage of space in each data block of the table, object 
table OID index, or partition reserved for future updates to the 
table's rows. The value of PCTFREE must be a value from 0 to 99. A 
value of 0 allows the entire block to be filled by inserts of new rows. 
The default value is 10. This value reserves 10% of each block for 
updates to existing rows and allows inserts of new rows to fill a 
maximum of 90% of each block.
PCTFREE has the same function in the PARTITION description and in the 
statements that create and alter clusters, indexes, materialized views, 
and materialized view logs. The combination of PCTFREE and PCTUSED 
determines whether new rows will be inserted into existing data blocks 
or into new blocks.

PCTUSED integer
Specify the minimum percentage of used space that Oracle maintains for 
each data block of the table, object table OID index, or 
index-organized table overflow data segment. A block becomes a 
candidate for row insertion when its used space falls below PCTUSED. 
PCTUSED is specified as a positive integer from 0 to 99 and defaults to 
40.
PCTUSED has the same function in the PARTITION description and in the 
statements that create and alter clusters, materialized views, and 
materialized view logs.
PCTUSED is not a valid table storage characteristic for an 
index-organized table (ORGANIZATION INDEX).
The sum of PCTFREE and PCTUSED must be equal to or less than 100. You 
can use PCTFREE and PCTUSED together to utilize space within a table 
more efficiently.

PostgreSQL could take some hints from the above.
On Aug 27, 2004, at 1:26 AM, Gaetano Mendola wrote:
Greg Stark wrote:
The discussions before talked about a mechanism to try to place new
> tuples as close as possible to the proper index position.
Means this that an index shall have a "fill factor" property, similar 
to
Informix one ?

From the manual:
The FILLFACTOR option takes effect only when you build an index on a 
table
that contains more than 5,000 rows and uses more than 100 table pages, 
when
you create an index on a fragmented table, or when you create a 
fragmented
index on a nonfragmented table.
Use the FILLFACTOR option to provide for expansion of an index at a 
later
date or to create compacted indexes.
When the index is created, the database server initially fills only 
that
percentage of the nodes specified with the FILLFACTOR value.

# Providing a Low Percentage Value
If you provide a low percentage value, such as 50, you allow room for 
growth
in your index. The nodes of the index initially fill to a certain 
percentage and
contain space for inserts. The amount of available space depends on the
number of keys in each page as well as the percentage value.
For example, with a 50-percent FILLFACTOR value, the page would be half
full and could accommodate doubling in size. A low percentage value can
result in faster inserts and can be used for indexes that you expect 
to grow.

# Providing a High Percentage Value
If you provide a high percentage value, such as 99, your indexes are
compacted, and any new index inserts result in splitting nodes. The
maximum density is achieved with 100 percent. With a 100-percent
FILLFACTOR value, the index has no room available for growth; any
additions to the index result in splitting the nodes.
A 99-percent FILLFACTOR value allows room for at least one insertion 
per
node. A high percentage value can result in faster selects and can be 
used for
indexes that you do not expect to grow or for mostly read-only indexes.


Regards
Gaetano Mendola


---(end of 
broadcast)---
TIP 7: don't forget to increase your free space map settings


--
Adi Alurkar (DBA sf.NET) <[EMAIL PROTECTED]>
1024D/79730470 A491 5724 74DE 956D 06CB  D844 6DF1 B972 7973 0470

---(end of broadcast)---
TIP 6: Have you searched our list archives?
  http://archives.postgresql.org


Re: [PERFORM] Query performance issue with 8.0.0beta1

2004-08-27 Thread andrew
Is it possible (to mix two threads) that you had CLUSTERed the table on the old 
database in a way that retrieved the records in this query faster?

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PERFORM] Query performance issue with 8.0.0beta1

2004-08-27 Thread Josh Berkus
Stefano,

> Hi, I have just installed 8.0.0beta1 and I noticed that some query are
> slower than 7.4.2 queries.

Seems unlikely.   How have you configured postgresql.conf?DID you 
configure it for the 8.0 database?

-- 
Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [PERFORM] Equivalent praxis to CLUSTERED INDEX?

2004-08-27 Thread Bruce Momjian

But what is the advantage of non-full pages in Oracle?

---

Adi Alurkar wrote:
> Greetings,
> 
> I am not sure if this applies only to clustering but for storage in 
> general,
> 
> IIRC  Oracle has 2 parameters that can be set at table creation :
> from Oracle docs
> 
> PCTFREE integer :
> Specify the percentage of space in each data block of the table, object 
> table OID index, or partition reserved for future updates to the 
> table's rows. The value of PCTFREE must be a value from 0 to 99. A 
> value of 0 allows the entire block to be filled by inserts of new rows. 
> The default value is 10. This value reserves 10% of each block for 
> updates to existing rows and allows inserts of new rows to fill a 
> maximum of 90% of each block.
> PCTFREE has the same function in the PARTITION description and in the 
> statements that create and alter clusters, indexes, materialized views, 
> and materialized view logs. The combination of PCTFREE and PCTUSED 
> determines whether new rows will be inserted into existing data blocks 
> or into new blocks.
> 
> PCTUSED integer
> Specify the minimum percentage of used space that Oracle maintains for 
> each data block of the table, object table OID index, or 
> index-organized table overflow data segment. A block becomes a 
> candidate for row insertion when its used space falls below PCTUSED. 
> PCTUSED is specified as a positive integer from 0 to 99 and defaults to 
> 40.
> PCTUSED has the same function in the PARTITION description and in the 
> statements that create and alter clusters, materialized views, and 
> materialized view logs.
> PCTUSED is not a valid table storage characteristic for an 
> index-organized table (ORGANIZATION INDEX).
> The sum of PCTFREE and PCTUSED must be equal to or less than 100. You 
> can use PCTFREE and PCTUSED together to utilize space within a table 
> more efficiently.
> 
> PostgreSQL could take some hints from the above.
> 
> On Aug 27, 2004, at 1:26 AM, Gaetano Mendola wrote:
> 
> > Greg Stark wrote:
> >
> >> The discussions before talked about a mechanism to try to place new
> > > tuples as close as possible to the proper index position.
> >
> > Means this that an index shall have a "fill factor" property, similar 
> > to
> > Informix one ?
> >
> > From the manual:
> >
> >
> > The FILLFACTOR option takes effect only when you build an index on a 
> > table
> > that contains more than 5,000 rows and uses more than 100 table pages, 
> > when
> > you create an index on a fragmented table, or when you create a 
> > fragmented
> > index on a nonfragmented table.
> > Use the FILLFACTOR option to provide for expansion of an index at a 
> > later
> > date or to create compacted indexes.
> > When the index is created, the database server initially fills only 
> > that
> > percentage of the nodes specified with the FILLFACTOR value.
> >
> > # Providing a Low Percentage Value
> > If you provide a low percentage value, such as 50, you allow room for 
> > growth
> > in your index. The nodes of the index initially fill to a certain 
> > percentage and
> > contain space for inserts. The amount of available space depends on the
> > number of keys in each page as well as the percentage value.
> > For example, with a 50-percent FILLFACTOR value, the page would be half
> > full and could accommodate doubling in size. A low percentage value can
> > result in faster inserts and can be used for indexes that you expect 
> > to grow.
> >
> >
> > # Providing a High Percentage Value
> > If you provide a high percentage value, such as 99, your indexes are
> > compacted, and any new index inserts result in splitting nodes. The
> > maximum density is achieved with 100 percent. With a 100-percent
> > FILLFACTOR value, the index has no room available for growth; any
> > additions to the index result in splitting the nodes.
> > A 99-percent FILLFACTOR value allows room for at least one insertion 
> > per
> > node. A high percentage value can result in faster selects and can be 
> > used for
> > indexes that you do not expect to grow or for mostly read-only indexes.
> >
> >
> >
> >
> > Regards
> > Gaetano Mendola
> >
> >
> >
> >
> >
> > ---(end of 
> > broadcast)---
> > TIP 7: don't forget to increase your free space map settings
> >
> >
> --
> Adi Alurkar (DBA sf.NET) <[EMAIL PROTECTED]>
> 1024D/79730470 A491 5724 74DE 956D 06CB  D844 6DF1 B972 7973 0470
> 
> 
> 
> ---(end of broadcast)---
> TIP 6: Have you searched our list archives?
> 
>http://archives.postgresql.org
> 

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)-

Re: [PERFORM] Equivalent praxis to CLUSTERED INDEX?

2004-08-27 Thread Jeremy Dunn


> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of 
> Bruce Momjian
> Sent: Friday, August 27, 2004 1:27 PM
> To: Adi Alurkar
> Cc: [EMAIL PROTECTED]
> Subject: Re: [PERFORM] Equivalent praxis to CLUSTERED INDEX?
> 
> 
> 
> But what is the advantage of non-full pages in Oracle?
> 

One advantage has to do with updates of variable-length columns, e.g.
varchars.

If the block is fully packed with data, an update to a varchar column
that makes the column wider, causes "row-chaining".  This means that a
portion of the row is stored in a different data block, which may be
somewhere completely different in the storage array.  Retrieving that
row (or even just that column from that row) as a unit may now require
additional disk seek(s).

Leaving some space for updates in each data block doesn't prevent this
problem completely, but mitigates it to a certain extent.  If for
instance a row is typically inserted with a null value for a varchar
column, but the application developer knows it will almost always get
updated with some value later on, then leaving a certain percentage of
empty space in each block allocated to that table makes sense.

Conversely, if you know that your data is never going to get updated
(e.g. a data warehousing application), you might specify to pack the
blocks as full as possible.  This makes for the most efficient data
retrieval performance.

- Jeremy


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] Equivalent praxis to CLUSTERED INDEX?

2004-08-27 Thread Adi Alurkar
IIRC it it to reduce the "overflow" of data or what oracle calls  
chained rows. i.e if a table has variable length columns and 10 rows  
get inserted into a datapage, if this datapage is full and one of the  
variable length field gets updated the row will now "overflow" into  
another datapage, but if the datapage is created with an appropriate  
amount of free space the updated row will be stored in one single  
datapage.

On Aug 27, 2004, at 10:27 AM, Bruce Momjian wrote:
But what is the advantage of non-full pages in Oracle?
--- 


Adi Alurkar wrote:
Greetings,
I am not sure if this applies only to clustering but for storage in
general,
IIRC  Oracle has 2 parameters that can be set at table creation :
from Oracle docs
PCTFREE integer :
Specify the percentage of space in each data block of the table,  
object
table OID index, or partition reserved for future updates to the
table's rows. The value of PCTFREE must be a value from 0 to 99. A
value of 0 allows the entire block to be filled by inserts of new  
rows.
The default value is 10. This value reserves 10% of each block for
updates to existing rows and allows inserts of new rows to fill a
maximum of 90% of each block.
PCTFREE has the same function in the PARTITION description and in the
statements that create and alter clusters, indexes, materialized  
views,
and materialized view logs. The combination of PCTFREE and PCTUSED
determines whether new rows will be inserted into existing data blocks
or into new blocks.

PCTUSED integer
Specify the minimum percentage of used space that Oracle maintains for
each data block of the table, object table OID index, or
index-organized table overflow data segment. A block becomes a
candidate for row insertion when its used space falls below PCTUSED.
PCTUSED is specified as a positive integer from 0 to 99 and defaults  
to
40.
PCTUSED has the same function in the PARTITION description and in the
statements that create and alter clusters, materialized views, and
materialized view logs.
PCTUSED is not a valid table storage characteristic for an
index-organized table (ORGANIZATION INDEX).
The sum of PCTFREE and PCTUSED must be equal to or less than 100. You
can use PCTFREE and PCTUSED together to utilize space within a table
more efficiently.

PostgreSQL could take some hints from the above.
On Aug 27, 2004, at 1:26 AM, Gaetano Mendola wrote:
Greg Stark wrote:
The discussions before talked about a mechanism to try to place new
tuples as close as possible to the proper index position.
Means this that an index shall have a "fill factor" property, similar
to
Informix one ?
From the manual:
The FILLFACTOR option takes effect only when you build an index on a
table
that contains more than 5,000 rows and uses more than 100 table  
pages,
when
you create an index on a fragmented table, or when you create a
fragmented
index on a nonfragmented table.
Use the FILLFACTOR option to provide for expansion of an index at a
later
date or to create compacted indexes.
When the index is created, the database server initially fills only
that
percentage of the nodes specified with the FILLFACTOR value.

# Providing a Low Percentage Value
If you provide a low percentage value, such as 50, you allow room for
growth
in your index. The nodes of the index initially fill to a certain
percentage and
contain space for inserts. The amount of available space depends on  
the
number of keys in each page as well as the percentage value.
For example, with a 50-percent FILLFACTOR value, the page would be  
half
full and could accommodate doubling in size. A low percentage value  
can
result in faster inserts and can be used for indexes that you expect
to grow.

# Providing a High Percentage Value
If you provide a high percentage value, such as 99, your indexes are
compacted, and any new index inserts result in splitting nodes. The
maximum density is achieved with 100 percent. With a 100-percent
FILLFACTOR value, the index has no room available for growth; any
additions to the index result in splitting the nodes.
A 99-percent FILLFACTOR value allows room for at least one insertion
per
node. A high percentage value can result in faster selects and can be
used for
indexes that you do not expect to grow or for mostly read-only  
indexes.


Regards
Gaetano Mendola


---(end of
broadcast)---
TIP 7: don't forget to increase your free space map settings

--
Adi Alurkar (DBA sf.NET) <[EMAIL PROTECTED]>
1024D/79730470 A491 5724 74DE 956D 06CB  D844 6DF1 B972 7973 0470

---(end of  
broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org
--
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup. 

Re: [PERFORM] Equivalent praxis to CLUSTERED INDEX?

2004-08-27 Thread Bruce Momjian
Adi Alurkar wrote:
> IIRC it it to reduce the "overflow" of data or what oracle calls  
> chained rows. i.e if a table has variable length columns and 10 rows  
> get inserted into a datapage, if this datapage is full and one of the  
> variable length field gets updated the row will now "overflow" into  
> another datapage, but if the datapage is created with an appropriate  
> amount of free space the updated row will be stored in one single  
> datapage.

Agreed.  What I am wondering is with our system where every update gets
a new row, how would this help us?  I know we try to keep an update on
the same row as the original, but is there any significant performance
benefit to doing that which would offset the compaction advantage?

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] Equivalent praxis to CLUSTERED INDEX?

2004-08-27 Thread Tom Lane
Bruce Momjian <[EMAIL PROTECTED]> writes:
> Agreed.  What I am wondering is with our system where every update gets
> a new row, how would this help us?  I know we try to keep an update on
> the same row as the original, but is there any significant performance
> benefit to doing that which would offset the compaction advantage?

Because Oracle uses overwrite-in-place (undoing from an UNDO log on
transaction abort), while we always write a whole new row, it would take
much larger PCTFREE wastage to get a useful benefit in PG than it does
in Oracle.  That wastage translates directly into increased I/O costs,
so I'm a bit dubious that we should assume there is a win to be had here
just because Oracle offers the feature.

regards, tom lane

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


[PERFORM] Why those queries do not utilize indexes?

2004-08-27 Thread Artimenko Igor
Hi everybody!

Here is my queries:

1. explain SELECT * FROM messageinfo WHERE user_id = CAST( 2 AS BIGINT ) and 
msgstatus = CAST(
0 AS smallint );

2. explain SELECT * FROM messageinfo WHERE messageinfo.user_id = 2::int8 and 
msgstatus =
0::smallint;

In both cases Explain command shows:
1. Sequential search and very high cost if set enable_seqscan to on;
Seq scan on messageinfo ( cost=0.00..24371.30, rows =36802 )

2. Index scan but even bigger cost if set enable_seqscan to off;
Index “messagesStatus” on messageinfo ( Cost=0.00..27220.72, rows=36802 )

messageinfo table has 200 records which meet this criteria and 662420 in total:

CREATE TABLE messageinfo
(
  user_id int8 NOT NULL,
  msgstatus int2 NOT NULL DEFAULT (0)::smallint,
  receivedtime timestamp NOT NULL DEFAULT now(),
  …
  msgread bool DEFAULT false,
  …
  CONSTRAINT "$1" FOREIGN KEY (user_id) REFERENCES users (id) ON UPDATE CASCADE ON 
DELETE CASCADE,
  ) 
WITH OIDS;

CREATE INDEX msgstatus
  ON messageinfo
  USING btree
  (user_id, msgstatus);

CREATE INDEX "messagesStatus"
  ON messageinfo
  USING btree
  (msgstatus);

CREATE INDEX msgread
  ON messageinfo
  USING btree
  (user_id, msgread);

CREATE INDEX "receivedTime"
  ON messageinfo
  USING btree
  (receivedtime);


MY QUESTIONS ARE:

1.  Should I afraid of high cost indexes? Or query will still be very efficient?

2.  Postgres does not use the index I need. For my data sets it’s always msgstatus 
index is
narrowest compare with ‘messagesStatus’ one. Is any way to “enforce” to use a 
particular index?
What’s the logic when Postgres chooses one index compare with the other.

3.  I can change db structure to utilize Postgres specifics if you can tell them 
to me.

4.  Also, originally I had “messagesStatus” index having 2 components ( 
“msgstatus”, “user_id” ).
But query SELECT * FROM messageinfo WHERE msgstatus = 0 did not utilize indexes in 
this case. It
only worked if both index components are in WHERE part. So I have to remove 2-nd 
component
“user_id” from messagesStatus index even I wanted it. Is any way that where clause has 
only 1-st
component but index is utilized? 

Igor Artimenko



__
Do you Yahoo!?
Yahoo! Mail Address AutoComplete - You start. We finish.
http://promotions.yahoo.com/new_mail 

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [PERFORM] Equivalent praxis to CLUSTERED INDEX?

2004-08-27 Thread Greg Stark
Bruce Momjian <[EMAIL PROTECTED]> writes:

> Agreed.  What I am wondering is with our system where every update gets
> a new row, how would this help us?  I know we try to keep an update on
> the same row as the original, but is there any significant performance
> benefit to doing that which would offset the compaction advantage?

Hm. Posit a system where all transactions are short updates executed in
autocommit mode. 

In such a system as soon as a transaction commits it would take a very short
time before the previous record was a dead tuple.

If every backend kept a small list of tuples it had marked deleted and
whenever it was idle checked to see if they were dead yet, it might avoid much
of the need for vacuum. And in such a circumstance I think you wouldn't need
more than a pctfree of 50% even on a busy table. Every tuple would need about
one extra slot.

This would only be a reasonable idea if a) if the list of potential dead
tuples is short and if it overflows it just forgets them leaving them for
vacuum to deal with. and b) It only checks the potentially dead tuples when
the backend is otherwise idle.

Even so it would be less efficient than a batch vacuum, and it would be taking
up i/o bandwidth (to maintain indexes even if the heap buffer is in ram), even
if that backend is idle it doesn't mean other backends couldn't have used that
i/o bandwidth.

But I think it would deal with a lot of the complaints about vacuum and it
would make it more feasible to use a pctfree parameter to make clustering more
effective.

-- 
greg


---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [PERFORM] Equivalent praxis to CLUSTERED INDEX?

2004-08-27 Thread Greg Stark

Bruce Momjian <[EMAIL PROTECTED]> writes:

> but is there any significant performance benefit to doing that which would
> offset the compaction advantage?

Just as a side comment. Setting PCTFREE 0 PCTUSED 100 on tables that have no
updates on them has an astonishingly big effect on speed. So the penalty for
leaving some space free really is substantial.

I think the other poster is right. Oracle really needs pctfree because of the
way it handles updates. Postgres doesn't really need as much because it
doesn't try to squeeze the new tuple in the space the old one took up. If it
doesn't fit on the page the worst that happens is it has to store it on some
other page, whereas oracle has to do its strange row chaining thing.

-- 
greg


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] Equivalent praxis to CLUSTERED INDEX?

2004-08-27 Thread Bruce Momjian
Greg Stark wrote:
> 
> Bruce Momjian <[EMAIL PROTECTED]> writes:
> 
> > but is there any significant performance benefit to doing that which would
> > offset the compaction advantage?
> 
> Just as a side comment. Setting PCTFREE 0 PCTUSED 100 on tables that have no
> updates on them has an astonishingly big effect on speed. So the penalty for
> leaving some space free really is substantial.
> 
> I think the other poster is right. Oracle really needs pctfree because of the
> way it handles updates. Postgres doesn't really need as much because it
> doesn't try to squeeze the new tuple in the space the old one took up. If it
> doesn't fit on the page the worst that happens is it has to store it on some
> other page, whereas oracle has to do its strange row chaining thing.

Oracle also does that chain thing so moving updates to different pages
might have more of an impact than it does on PostgreSQL.  We have chains
too but just for locking.  Not sure on Oracle.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PERFORM] Why those queries do not utilize indexes?

2004-08-27 Thread Dennis Bjorklund
On Fri, 27 Aug 2004, Artimenko Igor wrote:

> 1. Sequential search and very high cost if set enable_seqscan to on;
> Seq scan on messageinfo ( cost=0.00..24371.30, rows =36802 )
> 
> 2. Index scan but even bigger cost if set enable_seqscan to off;
> Index “messagesStatus” on messageinfo ( Cost=0.00..27220.72, rows=36802 )

So pg thinks that a sequential scan will be a little bit faster (The cost 
is a little bit smaller). If you compare the actual runtimes maybe you 
will see that pg was right. In this case the cost is almost the same so 
the runtime is probably almost the same.

When you have more data pg will start to use the index since then it will 
be faster to use an index compared to a seq. scan.

-- 
/Dennis Björklund


---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] Anyone familiar with Apple Xserve RAID

2004-08-27 Thread Vivek Khera
> "JB" == Josh Berkus <[EMAIL PROTECTED]> writes:

JB> Guys,
>> the XServe/XRaid comes with FibreChannel

JB> I stand corrected.  That should help things some; it makes it more
JB> of a small tradeoff between performance and storage size for the
JB> drives.


it is fibre channel to the host.  the internals are still IDE drives
with possibly multiple controllers inside the enclosure.

-- 
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Vivek Khera, Ph.D.Khera Communications, Inc.
Internet: [EMAIL PROTECTED]   Rockville, MD  +1-301-869-4449 x806
AIM: vivekkhera Y!: vivek_khera   http://www.khera.org/~vivek/

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] Why those queries do not utilize indexes?

2004-08-27 Thread Artimenko Igor
I could force Postgres to use the best index by removing condition "msgstatus = CAST( 
0 AS
smallint );" from WHERE clause & set enable_seqscan to off;
Total runtime in this case dropped from 1883 ms ( sequential reads ) to 1.598 ms ( 
best index ).

But unfortunatelly It does not resolve my problem. I can not remove above condition. I 
need to
find a way to use whole condition "WHERE user_id = CAST( 2 AS BIGINT ) and 
msgstatus = CAST( 0
AS smallint );" and still utilyze index.  

Yes you are right. Using "messagesStatus" index is even worse for my data set then 
sequential
scan.

Igor Artimenko

--- Dennis Bjorklund <[EMAIL PROTECTED]> wrote:

> On Fri, 27 Aug 2004, Artimenko Igor wrote:
> 
> > 1. Sequential search and very high cost if set enable_seqscan to on;
> > Seq scan on messageinfo ( cost=0.00..24371.30, rows =36802 )
> > 
> > 2. Index scan but even bigger cost if set enable_seqscan to off;
> > Index “messagesStatus” on messageinfo ( Cost=0.00..27220.72, rows=36802 )
> 
> So pg thinks that a sequential scan will be a little bit faster (The cost 
> is a little bit smaller). If you compare the actual runtimes maybe you 
> will see that pg was right. In this case the cost is almost the same so 
> the runtime is probably almost the same.
> 
> When you have more data pg will start to use the index since then it will 
> be faster to use an index compared to a seq. scan.
> 
> -- 
> /Dennis Björklund
> 
> 




___
Do you Yahoo!?
Win 1 of 4,000 free domain names from Yahoo! Enter now.
http://promotions.yahoo.com/goldrush

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])


Re: [PERFORM] Equivalent praxis to CLUSTERED INDEX?

2004-08-27 Thread Gaetano Mendola
Tom Lane wrote:
> Bruce Momjian <[EMAIL PROTECTED]> writes:
>
>>Agreed.  What I am wondering is with our system where every update gets
>>a new row, how would this help us?  I know we try to keep an update on
>>the same row as the original, but is there any significant performance
>>benefit to doing that which would offset the compaction advantage?
>
>
> Because Oracle uses overwrite-in-place (undoing from an UNDO log on
> transaction abort), while we always write a whole new row, it would take
> much larger PCTFREE wastage to get a useful benefit in PG than it does
> in Oracle.  That wastage translates directly into increased I/O costs,
> so I'm a bit dubious that we should assume there is a win to be had here
> just because Oracle offers the feature.
Mmmm. Consider this scenario:
ctid   datas
(0,1)   yyy-xxx
(0,2)    EMPTY 
(0,3)    EMPTY 
(0,4)    EMPTY 
(0,5)    EMPTY 
(0,6)   yyy-xxx
(0,7)    EMPTY 
 EMPTY 
(0,11)  yyy-xxx
the row (0,2) --> (0,5) are space available for the (0,1) updates.
This will help a table clustered ( for example ) to mantain his
own correct cluster order.

Regards
Gaetano Mendola






---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] Equivalent praxis to CLUSTERED INDEX?

2004-08-27 Thread Gaetano Mendola
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1
Bruce Momjian wrote:
| Gaetano Mendola wrote:
|
|>Tom Lane wrote:
|>
|> > Bruce Momjian <[EMAIL PROTECTED]> writes:
|> >
|> >>Agreed.  What I am wondering is with our system where every update gets
|> >>a new row, how would this help us?  I know we try to keep an update on
|> >>the same row as the original, but is there any significant performance
|> >>benefit to doing that which would offset the compaction advantage?
|> >
|> >
|> > Because Oracle uses overwrite-in-place (undoing from an UNDO log on
|> > transaction abort), while we always write a whole new row, it would take
|> > much larger PCTFREE wastage to get a useful benefit in PG than it does
|> > in Oracle.  That wastage translates directly into increased I/O costs,
|> > so I'm a bit dubious that we should assume there is a win to be had here
|> > just because Oracle offers the feature.
|>
|>Mmmm. Consider this scenario:
|>
|>ctid   datas
|>(0,1)   yyy-xxx
|>(0,2)    EMPTY 
|>(0,3)    EMPTY 
|>(0,4)    EMPTY 
|>(0,5)    EMPTY 
|>(0,6)   yyy-xxx
|>(0,7)    EMPTY 
|> EMPTY 
|>(0,11)  yyy-xxx
|>
|>
|>the row (0,2) --> (0,5) are space available for the (0,1) updates.
|>This will help a table clustered ( for example ) to mantain his
|>own correct cluster order.
|
|
| Right.  My point was that non-full fill is valuable for us only when
| doing clustering, while for Oracle it is a win even in non-cluster cases
| because of the way they update in place.
Don't you think this will permit also to avoid extra disk seek and cache
invalidation? If you are updating the row (0,1) I think is less expensive
put the new version in (0,2) instead of thousand line far from that point.
Regards
Gaetano Mendola


-BEGIN PGP SIGNATURE-
Version: GnuPG v1.2.4 (MingW32)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org
iD8DBQFBL+kA7UpzwH2SGd4RAp6fAJ9rSs5xmTXsy4acUGcnCRTbEUCwrwCgo/o6
0JPtziuf1E/EGLaqjbPMV44=
=pIgX
-END PGP SIGNATURE-
---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [PERFORM] Equivalent praxis to CLUSTERED INDEX?

2004-08-27 Thread Bruce Momjian
Gaetano Mendola wrote:
> | Right.  My point was that non-full fill is valuable for us only when
> | doing clustering, while for Oracle it is a win even in non-cluster cases
> | because of the way they update in place.
> 
> Don't you think this will permit also to avoid extra disk seek and cache
> invalidation? If you are updating the row (0,1) I think is less expensive
> put the new version in (0,2) instead of thousand line far from that point.

It would, but does that outweigh the decreased I/O by having things more
densely packed?  I would think not.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org