[PERFORM] Vacuum becomes slow
Hi, the time needed for a daily VACUUM on a table with about 28 million records increases from day to day. What's the best way to avoid this? A full vacuum would probably take too much time; are there other ways to keep vacuum performant? The database was updated to postgres-8.0 on Jun 04 this year. Between Jun 07 and Jun 30 the time vacuum needed increased from 683 seconds to 1,663 seconds; the output is posted below. For example, the time for vacuuming the index on a text field (i_ids_user) rose from 123 sec to 668 sec. The increase happens each day, so this is not a problem of the last run only. In the same time the number of records in the table only increased from 27.5 million to 28.9 million; the number of records updated daily is about 700,000 to 1,000,000.

Regards Martin

Tue Jun 7 04:07:17 CEST 2005 Starting
SET VACUUM_MEM=25; VACUUM ANALYZE VERBOSE t_ids

INFO: vacuuming "public.t_ids"
INFO: index "i_ids_score" now contains 4323671 row versions in 12414 pages
DETAIL: 493855 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 1.76s/5.44u sec elapsed 33.22 sec.
INFO: index "i_ids_id" now contains 2752 row versions in 61515 pages
DETAIL: 960203 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 8.09s/24.93u sec elapsed 108.43 sec.
INFO: index "i_ids_user" now contains 2752 row versions in 103172 pages
DETAIL: 960203 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 14.00s/39.65u sec elapsed 123.47 sec.
INFO: "t_ids": removed 960203 row versions in 203369 pages
DETAIL: CPU 22.88s/21.72u sec elapsed 294.22 sec.
INFO: "t_ids": found 960203 removable, 2752 nonremovable row versions in 208912 pages
DETAIL: 0 dead row versions cannot be removed yet.
There were 214149 unused item pointers.
0 pages are entirely empty.
CPU 53.02s/93.76u sec elapsed 643.46 sec.
INFO: vacuuming "pg_toast.pg_toast_224670"
INFO: index "pg_toast_224670_index" now contains 0 row versions in 1 pages
DETAIL: 0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.01 sec.
INFO: "pg_toast_224670": found 0 removable, 0 nonremovable row versions in 0 pages
DETAIL: 0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.01 sec.
INFO: analyzing "public.t_ids"
INFO: "t_ids": scanned 9 of 208912 pages, containing 11846838 live rows and 0 dead rows; 9 rows in sample, 27499407 estimated total rows
VACUUM

Tue Jun 7 04:18:40 CEST 2005 Job finished after 683 seconds

Thu Jun 30 01:23:33 CEST 2005 Starting
SET VACUUM_MEM=25; VACUUM ANALYZE VERBOSE t_ids

INFO: vacuuming "public.t_ids"
INFO: index "i_ids_score" now contains 4460326 row versions in 29867 pages
DETAIL: 419232 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 4.58s/7.72u sec elapsed 368.13 sec.
INFO: index "i_ids_id" now contains 28948643 row versions in 68832 pages
DETAIL: 795700 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 9.08s/25.29u sec elapsed 151.38 sec.
INFO: index "i_ids_user" now contains 28948938 row versions in 131683 pages
DETAIL: 795700 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 20.10s/43.27u sec elapsed 668.00 sec.
INFO: "t_ids": removed 795700 row versions in 206828 pages
DETAIL: CPU 23.35s/23.50u sec elapsed 309.19 sec.
INFO: "t_ids": found 795700 removable, 28948290 nonremovable row versions in 223145 pages
DETAIL: 0 dead row versions cannot be removed yet.
There were 906106 unused item pointers.
0 pages are entirely empty.
CPU 63.10s/101.96u sec elapsed 1592.00 sec.
INFO: vacuuming "pg_toast.pg_toast_224670"
INFO: index "pg_toast_224670_index" now contains 0 row versions in 1 pages
DETAIL: 0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: "pg_toast_224670": found 0 removable, 0 nonremovable row versions in 0 pages
DETAIL: 0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.02 sec.
INFO: analyzing "public.t_ids"
INFO: "t_ids": scanned 9 of 223146 pages, containing 11675055 live rows and 288 dead rows; 9 rows in sample, 28947131 estimated total rows
VACUUM

Thu Jun 30 01:51:16 CEST 2005 Job finished after 1663 seconds
Re: [PERFORM] Reasons and drawbacks for unused item pointers (was: Update using primary key slow)
Tom Lane <[EMAIL PROTECTED]> writes:
> Denis <[EMAIL PROTECTED]> writes:
>> There were 1905028 unused item pointers.
> The "unused item pointers" number seems a bit high, but otherwise that
> looks pretty reasonable.
>
> Is it possible that the particular row you were updating has been
> updated quite a lot of times since the last vacuum? Or even quite
> a few times within a single transaction?

What causes these "unused item pointers", and what impact do they have on performance? If I understood your last posting correctly, more than one update to a single row between two vacuums would result in one or more "unused item pointers". Does this slow down the vacuum process and/or other processes? Until now I could not find an answer to what this number implies.

Regards Martin
Re: [PERFORM] Reasons and drawbacks for unused item pointers
Tom Lane <[EMAIL PROTECTED]> writes:
> Martin Lesser <[EMAIL PROTECTED]> writes:
>> What causes these "unused item pointers", and what impact do they
>> have on performance?
> The direct performance impact is really pretty minimal (and none at
> all on indexscans, AFAIR). The reason Denis' number drew my attention
> was that it implied that the table had gone un-vacuumed for awhile at
> some time in the past. [...] To have 1905028 unused pointers in a
> table with only 5106307 live entries suggests that at some point there
> were 1.9 million (or so) dead but not-yet-vacuumed tuples, which
> suggests insufficient vacuuming.

Does each update of a single row result in an "unused item pointer"? I.e., if I update one row 10 times between VACUUMs of the table, is the result 10 unused pointers? Some rows in some of my tables are updated much more frequently than others, so I'm not sure whether the number of unused pointers implies that I should VACUUM more often than every 24 hours.

Martin
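Roughly, yes: in these releases every UPDATE creates a complete new row version, VACUUM later marks the dead versions' item pointers as unused, and the next VACUUM VERBOSE reports them in its "unused item pointers" line until new tuples reuse the slots. A small sketch to watch this happen (table and column names are made up for the demonstration; the exact INFO lines vary by version):

  CREATE TABLE t_demo (id int PRIMARY KEY, val int);
  INSERT INTO t_demo VALUES (1, 0);

  UPDATE t_demo SET val = val + 1;  -- each UPDATE leaves one dead row version
  UPDATE t_demo SET val = val + 1;
  UPDATE t_demo SET val = val + 1;

  VACUUM VERBOSE t_demo;  -- reports "removed 3 row versions"
  VACUUM VERBOSE t_demo;  -- reports "There were 3 unused item pointers."

So a row updated 10 times between two vacuums contributes 10 dead row versions to the first VACUUM and 10 unused item pointers to the next.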
[PERFORM] Effects of cascading references in foreign keys
What effect do UPDATEs on REFERENCEd TABLEs have when only columns in the referenced table are updated which are not part of the FOREIGN KEY constraint?

I have one "master" table like

  create table t_master (
    m_id serial primary key,
    m_fld1 ...,
    m_fld2 ...,
    ...
  )

The table above is referenced from several (~30) other tables, i.e. like

  create table t_detail (
    d_ebid int REFERENCES t_master (m_id) ON UPDATE CASCADE ON DELETE CASCADE,
    d_fld1 ...,
    d_fld2 ...,
    ...
  )

All tables which reference t_master have appropriate indexes on the referencing columns; vacuum/analyze is done regularly (daily).

Does an UPDATE of e.g. m_fld1 in t_master cause a 'lookup' in all tables which have a cascading update rule, or is this 'lookup' only triggered if the referenced column in t_master is explicitly updated? After removing some detail tables which are no longer needed we see improved performance, so at the moment it _looks_ like each update in t_master triggers a 'lookup' in each referencing table even if the referenced column (m_id) is not changed. I've read "If the row is updated, but the referenced column is not actually changed, no action is done." in the docs, but it is not clear to me whether this "no action" really means "null action" and so the improved performance has other reasons.

TIA, Martin
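A quick way to measure this empirically, as a sketch assuming the tables above are populated (\timing is a psql command; from 8.1 on, EXPLAIN ANALYZE additionally reports time spent in triggers):

  \timing on
  BEGIN;
  -- only the non-referenced column m_fld1 is touched; m_id stays the same
  UPDATE t_master SET m_fld1 = m_fld1 WHERE m_id BETWEEN 1 AND 10000;
  ROLLBACK;

Comparing the elapsed time before and after dropping one of the ~30 FOREIGN KEY constraints shows how much of it the RI triggers account for.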
Re: [PERFORM] Effects of cascading references in foreign keys
Bruce Momjian writes:
> Thomas F. O'Connell wrote:
>> It seems like this warrants an item somewhere in the release notes,
>> and I'm not currently seeing it (or a related item) anywhere. Perhaps
>> E.1.3.1 (Performance Improvements)? For some of the more extreme
>> UPDATE scenarios I've seen, this could be a big win.
> Hard to say, perhaps:
>
>   Prevent referential integrity triggers from firing if referenced
>   columns are not changed by an UPDATE
>
> Previously, triggers would fire but do nothing.

And this "firing" has negative effects on performance, at least in versions before 8.1 (we use 8.0.3 in production). One really dirty hack that comes to mind is to put an additional pk_table (with only one field, the pk from the master) between the "master" table and the ~30 detail tables, so that each update in the "master" would in most cases only trigger a lookup in one table. Only if a pk really changed would the CASCADEd trigger force a triggered UPDATE in the detail tables.

After denormalization of two of the largest detail tables into one table, the performance improvement was about 10%, due to the fact that up to 1 million of the roughly 30 million rows in the "master" table are updated daily, which triggered lookups across 190 million rows before denormalization vs. 115 million rows after.
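A sketch of that indirection with hypothetical names (the thin table is called t_master_pk here): only the thin table references t_master, so an UPDATE of payload columns in t_master fires the RI trigger for a single constraint, and the ~30 detail-table triggers can only be reached through an actual pk change cascading into t_master_pk:

  -- thin indirection table: one row per master row, nothing else
  create table t_master_pk (
    p_id int primary key
      references t_master (m_id) on update cascade on delete cascade
  );

  -- detail tables now point at the thin table instead of t_master
  create table t_detail_1 (
    d_ebid int references t_master_pk (p_id)
      on update cascade on delete cascade,
    d_fld1 text  -- payload columns as before
  );

The price is keeping t_master_pk in sync on INSERT/DELETE of master rows (e.g. via triggers) and one extra hop in the integrity chain.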
[PERFORM] Partitioning / constraint exclusion not working with %-operator
I am trying to partition a large table (~120 million rows) into 50 smaller tables, but with the (IMO immutable) %-operator constraint exclusion does not work as expected:

  CREATE TABLE tt_m (id1 int, cont varchar);
  CREATE TABLE tt_0 (check (id1 % 50 = 0)) INHERITS (tt_m);
  CREATE TABLE tt_1 (check (id1 % 50 = 1)) INHERITS (tt_m);
  CREATE RULE ins_tt_0 AS ON INSERT TO tt_m WHERE id1 % 50 = 0
    DO INSTEAD INSERT INTO tt_0 VALUES (new.*);
  CREATE RULE ins_tt_1 AS ON INSERT TO tt_m WHERE id1 % 50 = 1
    DO INSTEAD INSERT INTO tt_1 VALUES (new.*);
  ...
  INSERT INTO tt_m (id1,cont) VALUES (0,'Test1');
  INSERT INTO tt_m (id1,cont) VALUES (1,'Test2');

  EXPLAIN SELECT * FROM tt_m WHERE id1=1;
                              QUERY PLAN
  ---
   Result  (cost=0.00..73.50 rows=18 width=36)
     ->  Append  (cost=0.00..73.50 rows=18 width=36)
           ->  Seq Scan on tt_m  (cost=0.00..24.50 rows=6 width=36)
                 Filter: (id1 = 1)
           ->  Seq Scan on tt_0 tt_m  (cost=0.00..24.50 rows=6 width=36)
                 Filter: (id1 = 1)
           ->  Seq Scan on tt_1 tt_m  (cost=0.00..24.50 rows=6 width=36)
                 Filter: (id1 = 1)
  ...

Only adding an explicit %-call to the query results in the expected plan:

  EXPLAIN SELECT * FROM tt_m WHERE id1=1 AND id1 % 50 = 1;
                              QUERY PLAN
  ---
   Result  (cost=0.00..60.60 rows=2 width=36)
     ->  Append  (cost=0.00..60.60 rows=2 width=36)
           ->  Seq Scan on tt_m  (cost=0.00..30.30 rows=1 width=36)
                 Filter: ((id1 = 1) AND ((id1 % 50) = 1))
           ->  Seq Scan on tt_1 tt_m  (cost=0.00..30.30 rows=1 width=36)
                 Filter: ((id1 = 1) AND ((id1 % 50) = 1))

Did I miss something, and/or how could I force the planner to use constraint exclusion without adding the explicit second condition above?

TIA, Martin
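For comparison, a sketch of the same structure partitioned on ranges, where the planner can prove exclusion from the simple comparison operators it understands, so a plain equality in the WHERE clause is enough (boundaries illustrative):

  CREATE TABLE tt_r (id1 int, cont varchar);
  CREATE TABLE tt_r_0 (CHECK (id1 >= 0       AND id1 < 1000000)) INHERITS (tt_r);
  CREATE TABLE tt_r_1 (CHECK (id1 >= 1000000 AND id1 < 2000000)) INHERITS (tt_r);

  SET constraint_exclusion = on;
  EXPLAIN SELECT * FROM tt_r WHERE id1 = 1;
  -- only tt_r and tt_r_0 appear in the Append; tt_r_1 is excluded

The planner does not evaluate arbitrary expressions like id1 % 50 when comparing a WHERE clause against CHECK constraints, which is why the modulus scheme needs the redundant predicate spelled out.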
Re: [PERFORM] Partitioning / constraint exclusion not working with %-operator
Tom Lane <[EMAIL PROTECTED]> writes:
> It's usually better to use partitioning rules that have something to
> do with the WHERE-clauses you'd be using anyway. For instance, try
> to partition on ranges.

I agree and tried to create new partitioned tables. But now I ran into some other performance-related trouble when inserting (parts of) the old (unpartitioned) table into the new one:

  CREATE TABLE t_unparted (id1 int, cont varchar);
  -- Populate table with 1000 records with id1 from 1 to 1000 and ANALYZE
  CREATE TABLE t_parted (id1 int, cont varchar);
  CREATE TABLE t_parted_000 (check (id1 >= 0 AND id1 < 100)) INHERITS (t_parted);
  CREATE RULE ins_000 AS ON INSERT TO t_parted WHERE id1 >= 0 AND id1 < 100
    DO INSTEAD INSERT INTO t_parted_000 VALUES (new.*);
  -- ... 8 more tables + 8 more rules
  CREATE TABLE t_parted_900 (check (id1 >= 900 AND id1 < 1000)) INHERITS (t_parted);
  CREATE RULE ins_900 AS ON INSERT TO t_parted WHERE id1 >= 900 AND id1 < 1000
    DO INSTEAD INSERT INTO t_parted_900 VALUES (new.*);

And now:

  EXPLAIN INSERT INTO t_parted SELECT * FROM t_parted WHERE id1 >= 0 AND id1 < 100;

   Result  (cost=0.00..170.80 rows=12 width=36)
     ->  Append  (cost=0.00..170.80 rows=12 width=36)
           ->  Seq Scan on t_parted  (cost=0.00..85.40 rows=6 width=36)
                 Filter: ((id1 >= 0) AND (id1 < 100) AND (((id1 >= 0) AND (id1 < 100)) IS NOT TRUE) AND (((id1 >= 100) AND (id1 < 200)) IS NOT TRUE) AND (((id1 >= 200) AND (id1 < 300)) IS NOT TRUE) AND (((id1 >= 300) AND (id1 < 400)) IS NOT TRUE) AND (((id1 >= 400) AND (id1 < 500)) IS NOT TRUE) AND (((id1 >= 500) AND (id1 < 600)) IS NOT TRUE) AND (((id1 >= 600) AND (id1 < 700)) IS NOT TRUE) AND (((id1 >= 700) AND (id1 < 800)) IS NOT TRUE) AND (((id1 >= 800) AND (id1 < 900)) IS NOT TRUE) AND (((id1 >= 900) AND (id1 < 1000)) IS NOT TRUE))
           ->  Seq Scan on t_parted_000 t_parted  (cost=0.00..85.40 rows=6 width=36)
                 Filter: ((id1 >= 0) AND (id1 < 100) AND (((id1 >= 0) AND (id1 < 100)) IS NOT TRUE) AND (((id1 >= 100) AND (id1 < 200)) IS NOT TRUE) AND (((id1 >= 200) AND (id1 < 300)) IS NOT TRUE) AND (((id1 >= 300) AND (id1 < 400)) IS NOT TRUE) AND (((id1 >= 400) AND (id1 < 500)) IS NOT TRUE) AND (((id1 >= 500) AND (id1 < 600)) IS NOT TRUE) AND (((id1 >= 600) AND (id1 < 700)) IS NOT TRUE) AND (((id1 >= 700) AND (id1 < 800)) IS NOT TRUE) AND (((id1 >= 800) AND (id1 < 900)) IS NOT TRUE) AND (((id1 >= 900) AND (id1 < 1000)) IS NOT TRUE))
   Result  (cost=0.00..66.40 rows=12 width=36)
     ->  Append  (cost=0.00..66.40 rows=12 width=36)
           ->  Seq Scan on t_parted  (cost=0.00..33.20 rows=6 width=36)
                 Filter: ((id1 >= 0) AND (id1 < 100) AND (id1 >= 0) AND (id1 < 100))
           ->  Seq Scan on t_parted_000 t_parted  (cost=0.00..33.20 rows=6 width=36)
                 Filter: ((id1 >= 0) AND (id1 < 100) AND (id1 >= 0) AND (id1 < 100))
  ...
   Result  (cost=0.00..33.20 rows=6 width=36)
     ->  Append  (cost=0.00..33.20 rows=6 width=36)
           ->  Seq Scan on t_parted  (cost=0.00..33.20 rows=6 width=36)
                 Filter: ((id1 >= 0) AND (id1 < 100) AND (id1 >= 900) AND (id1 < 1000))
  (58 rows)

The filters the planner appends (one per rule) do not make any sense for this query and cost too much time when the old table is huge. (constraint_exclusion was ON.)

Is there a better way to partition an existing table with a large number of rows (>100 million)?

TIA, Martin
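One workaround, sketched against the schema above: since the rules only exist to route future inserts through the parent, a one-time migration can bypass them and load each child table directly, so the planner never folds the rule conditions into the scans:

  INSERT INTO t_parted_000 SELECT * FROM t_unparted WHERE id1 >= 0 AND id1 < 100;
  -- ... one INSERT ... SELECT per partition ...
  INSERT INTO t_parted_900 SELECT * FROM t_unparted WHERE id1 >= 900 AND id1 < 1000;

Each statement is then a single scan of the old table with a simple filter (or an index range scan, if id1 is indexed), instead of the rule-expanded plans shown above.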