[PERFORM] One tuple per transaction
Hi!

In one of our applications we have a database function which recalculates
COGS (cost of goods sold) for a certain period. This involves deleting a
bunch of rows from one table, inserting them again in the correct order,
and updating them one by one (sometimes one row twice) to reflect the
current state. The problem is that this generates an enormous number of
dead tuples in that table.

If I'm correct, the dead tuples must also be scanned during table and
index scans, so a lot of dead tuples slows down queries considerably,
especially when the table no longer fits into shared buffers. And since
I'm inside a transaction, I can't VACUUM to get rid of those tuples. On
one occasion the page count for a table went from 400 to 22000 by the end.

All this made me wonder: why is a new tuple created after every update?
One tuple per transaction should be enough, because you always commit or
roll back the transaction as a whole. And my observations seem to indicate
that a new index tuple is created after a column update even if that
column is not indexed. One tuple per transaction would save a lot of I/O
bandwidth, so I believe there must be a reason why it isn't implemented
that way. Or was my assumption wrong that dead tuples must be read from
disk?

Tambet

---(end of broadcast)---
TIP 8: explain analyze is your friend
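[Editor's note: the bloat described above can be measured directly. A minimal
sketch, assuming a hypothetical table name `cogs_detail` (not from the original
post); these statements must be run outside the recalculating transaction.]

```sql
-- Page and row-count estimates the planner holds for the table;
-- watching relpages grow across runs confirms the bloat.
SELECT relpages, reltuples
FROM pg_class
WHERE relname = 'cogs_detail';

-- VACUUM VERBOSE reports how many dead row versions were removed and
-- how many pages the table occupies. It cannot run inside a transaction
-- block, which is exactly the limitation the poster describes.
VACUUM VERBOSE cogs_detail;
```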
Re: [PERFORM] One tuple per transaction
Tambet,

> In one of our applications we have a database function, which
> recalculates COGS (cost of goods sold) for a certain period. This
> involves deleting a bunch of rows from one table, inserting them again
> in correct order and updating them one-by-one (sometimes one row twice)
> to reflect the current state. The problem is that this generates an
> enormous amount of tuples in that table.

Sounds like you have an application design problem ... how about
re-writing your function so it's a little more sensible?

--
Josh Berkus
Aglio Database Solutions
San Francisco
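[Editor's note: one way to follow Josh's suggestion is to replace the
delete/insert/update-one-row-at-a-time loop with a single set-based UPDATE,
so each row produces at most one new tuple per transaction. A sketch under
assumed names — `cogs_detail`, `cogs_calc`, and the columns are all
hypothetical, not from the original post.]

```sql
-- Compute all new COGS values in one pass, then touch each row once.
-- Every per-row UPDATE in a loop leaves a dead tuple; a single set-based
-- UPDATE leaves at most one per row, and the IS DISTINCT FROM guard
-- skips rows whose value would not change at all.
UPDATE cogs_detail AS d
SET cogs = c.new_cogs
FROM (
    SELECT id, computed_cogs AS new_cogs
    FROM cogs_calc('2005-01-01', '2005-03-31')  -- hypothetical set-returning function
) AS c
WHERE d.id = c.id
  AND d.cogs IS DISTINCT FROM c.new_cogs;
```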
[PERFORM] Index use and slow queries
Hello,

My version of Postgresql is 7.4.3. I have a simple table with 2 indexes:

             Table "public.tst"
 Column |            Type             |              Modifiers
--------+-----------------------------+-------------------------------------
 tst_id | bigint                      | default nextval('tst_id_seq'::text)
 mmd5   | character varying(32)       | not null
 active | character(1)                | not null
 lud    | timestamp without time zone | default now()
Indexes:
    "tst_idx" unique, btree (mmd5, active)
    "tst_tst_id_key" unique, btree (tst_id)

There are exactly 1,000,000 (one million) rows in the table (tst). There
are no NULLs or empty columns in any row.

I get really fast response times when using the following select
statement (less than 1 second).

maach=# explain select * from tst where mmd5 = '71e1c18cbc708a0bf28fe106e03256c7' and active = 'A';
                                              QUERY PLAN
------------------------------------------------------------------------------------------------------
 Index Scan using tst_idx on tst  (cost=0.00..6.02 rows=1 width=57)
   Index Cond: (((mmd5)::text = '71e1c18cbc708a0bf28fe106e03256c7'::text) AND (active = 'A'::bpchar))
(2 rows)

I get really slow response times when using the following select
statement (about 20 seconds).

maach=# explain select * from tst where tst_id = 639246;
                       QUERY PLAN
---------------------------------------------------------
 Seq Scan on tst  (cost=0.00..23370.00 rows=1 width=57)
   Filter: (tst_id = 639246)
(2 rows)

Why is the second select statement so slow? It should be using the
"tst_tst_id_key" unique, btree (tst_id) index, but instead EXPLAIN says
it's using a Seq Scan. If it were using the index, this select statement
should be as fast, if not faster than, the first one.

When I turned off sequential scans,

maach=# SET ENABLE_SEQSCAN TO OFF;

the slow select statement got even slower.

maach=# explain select * from tst where tst_id = 639246;
                              QUERY PLAN
-----------------------------------------------------------------------
 Seq Scan on tst  (cost=100000000.00..100023370.00 rows=1 width=57)
   Filter: (tst_id = 639246)
(2 rows)

Why do I have to use 2 columns to create a fast/efficient index? I want
the single-column index to be the fastest index for my select statements.
How do I accomplish this?

Thanks,
Tom
Re: [PERFORM] Index use and slow queries
On Sun, 13 Mar 2005 04:40 pm, Tom Pfeifer wrote:
> Hello,
>
> My version of Postgresql is 7.4.3.
> I have a simple table with 2 indexes:
>
>              Table "public.tst"
>  Column |            Type             |              Modifiers
> --------+-----------------------------+-------------------------------------
>  tst_id | bigint                      | default nextval('tst_id_seq'::text)
>  mmd5   | character varying(32)       | not null
>  active | character(1)                | not null
>  lud    | timestamp without time zone | default now()
> Indexes:
>     "tst_idx" unique, btree (mmd5, active)
>     "tst_tst_id_key" unique, btree (tst_id)
>
> There are exactly 1,000,000 (one million) rows in the table (tst).
> There are no NULLs or empty columns in any row.
>
> I get really fast response times when using the following select
> statement (less than 1 second).
> maach=# explain select * from tst where mmd5 = '71e1c18cbc708a0bf28fe106e03256c7' and active = 'A';
>                                               QUERY PLAN
> ------------------------------------------------------------------------------------------------------
>  Index Scan using tst_idx on tst  (cost=0.00..6.02 rows=1 width=57)
>    Index Cond: (((mmd5)::text = '71e1c18cbc708a0bf28fe106e03256c7'::text) AND (active = 'A'::bpchar))
> (2 rows)
>
> I get really slow response times when using the following select
> statement (about 20 seconds).
> maach=# explain select * from tst where tst_id = 639246;

Before 8.0, bigint would not use an index unless you cast it, or quote it.

eg
explain select * from tst where tst_id = 639246::int8;
explain select * from tst where tst_id = '639246';

Hope this helps.

Russell Smith
Re: [PERFORM] Index use and slow queries
Russell Smith <[EMAIL PROTECTED]> writes:
> On Sun, 13 Mar 2005 04:40 pm, Tom Pfeifer wrote:
>> I get really slow response times when using the following select
>> statement (about 20 seconds).
>> maach=# explain select * from tst where tst_id = 639246;

> Before 8.0, bigint would not use an index unless you cast it, or quote it.
> explain select * from tst where tst_id = 639246::int8;
> explain select * from tst where tst_id = '639246';

... or you compare to a value large enough to be int8 naturally, eg

explain select * from tst where tst_id = 123456639246;

The issue here is that (a) 639246 is naturally typed as int4, and
(b) before 8.0 we couldn't use cross-type comparisons such as int8 = int4
with an index. You can find a whole lot of angst about this issue and
related ones if you care to review the last six or eight years of the
pgsql-hackers archives. It was only recently that we found a way to
support cross-type index operations without breaking the fundamental
type-extensibility features of Postgres.

(In hindsight, we spent way too much time fixated on the notion that we
needed to find a way to implicitly convert the non-indexed value to
match the indexed column's type, rather than biting the bullet and
supporting cross-type operations directly with indexes. Oh well,
hindsight is always 20/20.)

			regards, tom lane
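[Editor's note: the workarounds in this thread can be compared side by side on
a pre-8.0 server with EXPLAIN. A sketch against the `tst` table from the
original post; the plan annotations in the comments summarize what the thread
reports, not captured output.]

```sql
-- int4 literal: pre-8.0 cannot match the cross-type int8 = int4
-- comparison to the bigint index, so the planner falls back to a Seq Scan.
EXPLAIN SELECT * FROM tst WHERE tst_id = 639246;

-- Quoted literal: the unknown-typed string is resolved to the column's
-- type (int8), so the index qualifies.
EXPLAIN SELECT * FROM tst WHERE tst_id = '639246';

-- Explicit cast: same effect, the comparison becomes int8 = int8.
EXPLAIN SELECT * FROM tst WHERE tst_id = 639246::int8;
```

On 8.0 and later, all three forms can use the index, since the planner
supports cross-type index comparisons directly.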