[PERFORM] One tuple per transaction

2005-03-12 Thread Tambet Matiisen
Hi!

In one of our applications we have a database function which
recalculates COGS (cost of goods sold) for a certain period. This
involves deleting a bunch of rows from one table, inserting them again
in the correct order, and updating them one by one (sometimes one row
twice) to reflect the current state. The problem is that this generates
an enormous number of dead tuples in that table.

If I'm correct, dead tuples must also be scanned during table and
index scans, so a lot of dead tuples slows down queries considerably,
especially when the table no longer fits into shared buffers. And since
I'm inside a transaction, I can't VACUUM to get rid of those tuples. On
one occasion the page count for a table went from 400 to 22000 by the end.

All this made me wonder: why is a new tuple created after every update?
One tuple per transaction should be enough, because you always commit or
roll back the transaction as a whole. My observations also seem to
indicate that a new index tuple is created after a column update even if
that column is not indexed.

One tuple per transaction would save loads of I/O bandwidth, so I
believe there must be a reason why it isn't implemented that way. Or
were my assumptions wrong that dead tuples must be read from disk?
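For what it's worth, the bloat described above can be observed from the
system catalogs. A minimal sketch, assuming the table is named cogs (a
hypothetical name):

```sql
-- Planner's view of the table size; these figures are refreshed by
-- VACUUM / ANALYZE, so compare before and after the recalculation.
SELECT relpages, reltuples FROM pg_class WHERE relname = 'cogs';

-- Once the transaction has committed, VACUUM VERBOSE reports how many
-- dead (removable) tuple versions had accumulated.
VACUUM VERBOSE cogs;
```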

  Tambet

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [PERFORM] One tuple per transaction

2005-03-12 Thread Josh Berkus
Tambet,

> In one of our applications we have a database function which
> recalculates COGS (cost of goods sold) for a certain period. This
> involves deleting a bunch of rows from one table, inserting them again
> in the correct order, and updating them one by one (sometimes one row
> twice) to reflect the current state. The problem is that this generates
> an enormous number of dead tuples in that table.

Sounds like you have an application design problem ... how about rewriting 
your function so it's a little more sensible?

-- 
Josh Berkus
Aglio Database Solutions
San Francisco



[PERFORM] Index use and slow queries

2005-03-12 Thread Tom Pfeifer

Hello,

My version of Postgresql is 7.4.3.
I have a simple table with 2 indexes:
 Table "public.tst"
 Column |            Type             |              Modifiers
--------+-----------------------------+-------------------------------------
 tst_id | bigint                      | default nextval('tst_id_seq'::text)
 mmd5   | character varying(32)       | not null
 active | character(1)                | not null
 lud    | timestamp without time zone | default now()
Indexes:
    "tst_idx" unique, btree (mmd5, active)
    "tst_tst_id_key" unique, btree (tst_id)

There are exactly 1,000,000 (one million) rows in the table (tst).  There are no NULLs or empty columns in any row.

I get really fast response times when using the following select statement (Less than 1 second).
maach=# explain select * from tst where mmd5 = '71e1c18cbc708a0bf28fe106e03256c7' and active = 'A';
                                             QUERY PLAN
----------------------------------------------------------------------------------------------------
 Index Scan using tst_idx on tst  (cost=0.00..6.02 rows=1 width=57)
   Index Cond: (((mmd5)::text = '71e1c18cbc708a0bf28fe106e03256c7'::text) AND (active = 'A'::bpchar))
(2 rows)

I get really slow response times when using the following select statement (about 20 seconds).
maach=# explain select * from tst where tst_id = 639246;
                       QUERY PLAN
---------------------------------------------------------
 Seq Scan on tst  (cost=0.00..23370.00 rows=1 width=57)
   Filter: (tst_id = 639246)
(2 rows)


Why is the second select statement so slow?  It should be using the "tst_tst_id_key" unique btree index on (tst_id), but EXPLAIN says it's using a Seq Scan.  If it were using the index, this select statement should be at least as fast as the one above.

When I turned off sequential scans with maach=# SET ENABLE_SEQSCAN TO OFF;
the slow select statement got even slower.
maach=# explain select * from tst where tst_id = 639246;
                         QUERY PLAN
-------------------------------------------------------------
 Seq Scan on tst  (cost=1.00..100023370.00 rows=1 width=57)
   Filter: (tst_id = 639246)
(2 rows)


Why do I have to use 2 columns to create a fast/efficient index?  I want the single-column index to be the fastest index for my select statements.  How do I accomplish this?

Thanks,
Tom


Re: [PERFORM] Index use and slow queries

2005-03-12 Thread Russell Smith
On Sun, 13 Mar 2005 04:40 pm, Tom Pfeifer wrote:
> Hello,
> 
> 
> My version of Postgresql is 7.4.3. 
> I have a simple table with 2 indexes: 
>  Table "public.tst"
>  Column |            Type             |              Modifiers
> --------+-----------------------------+-------------------------------------
>  tst_id | bigint                      | default nextval('tst_id_seq'::text)
>  mmd5   | character varying(32)       | not null
>  active | character(1)                | not null
>  lud    | timestamp without time zone | default now()
> Indexes:
>     "tst_idx" unique, btree (mmd5, active)
>     "tst_tst_id_key" unique, btree (tst_id)
> 
> 
> There are exactly 1,000,000 (one million) rows in the table (tst).  There are 
> no NULLs or empty columns in any row.
> 
> 
> I get really fast response times when using the following select statement 
> (Less than 1 second). 
> maach=# explain select * from tst where mmd5 = 
> '71e1c18cbc708a0bf28fe106e03256c7' and active = 'A'; 
>   QUERY PLAN 
> --
>  
>  Index Scan using tst_idx on tst  (cost=0.00..6.02 rows=1 width=57) 
>    Index Cond: (((mmd5)::text = '71e1c18cbc708a0bf28fe106e03256c7'::text) AND 
> (active = 'A'::bpchar)) 
> (2 rows) 
>  
> 
> 
> I get really slow response times when using the following select statement 
> (about 20 seconds). 
> maach=# explain select * from tst where tst_id = 639246; 

Before 8.0, bigint would not use an index unless you cast it, or quote it.

eg
explain select * from tst where tst_id = 639246::int8; 
explain select * from tst where tst_id = '639246'; 

Hope this helps.

Russell Smith



Re: [PERFORM] Index use and slow queries

2005-03-12 Thread Tom Lane
Russell Smith <[EMAIL PROTECTED]> writes:
> On Sun, 13 Mar 2005 04:40 pm, Tom Pfeifer wrote:
>> I get really slow response times when using the following select statement 
>> (about 20 seconds). 
>> maach=# explain select * from tst where tst_id = 639246; 

> Before 8.0, bigint would not use an index unless you cast it, or quote it.

> explain select * from tst where tst_id = 639246::int8; 
> explain select * from tst where tst_id = '639246'; 

... or you compare to a value large enough to be int8 naturally, eg

> explain select * from tst where tst_id = 123456639246;

The issue here is that (a) 639246 is naturally typed as int4, and
(b) before 8.0 we couldn't use cross-type comparisons such as int8 = int4
with an index.

You can find a whole lot of angst about this issue and related ones
if you care to review the last six or eight years of the pgsql-hackers
archives.  It was only recently that we found a way to support
cross-type index operations without breaking the fundamental
type-extensibility features of Postgres.  (In hindsight, we spent way
too much time fixated on the notion that we needed to find a way to
implicitly convert the non-indexed value to match the indexed column's
type, rather than biting the bullet and supporting cross-type operations
directly with indexes.  Oh well, hindsight is always 20/20.)
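Putting the thread's suggestions together, a quick sketch against the
tst table defined earlier; on a pre-8.0 server, checking each with
EXPLAIN should show the difference:

```sql
-- Plain int4 literal: before 8.0 this could not use the bigint index.
EXPLAIN SELECT * FROM tst WHERE tst_id = 639246;

-- Explicit cast to int8: the index becomes usable.
EXPLAIN SELECT * FROM tst WHERE tst_id = 639246::int8;

-- Quoted literal: resolved against the column's type, so also usable.
EXPLAIN SELECT * FROM tst WHERE tst_id = '639246';
```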

regards, tom lane
