Re: [PERFORM] [GENERAL] experiences needed - how does Itanium2/1.5GHz(4MB) compare to AMD and Intel CPUs as far as Postgresql is concerned
On Sat, Mar 25, 2006 at 08:11:22PM +0100, Tomaz Borstnar wrote:
> Hello!
>
> First tried some searching around, but did not find anything useful
> so I gave up and decided to ask here... I am wondering how do a pair of
> 1.5GHz Itanium2 (4MB cache) stack up against a pair of AMD or Intel server
> CPUs as far as postgresql performance is concerned? Is it worthy or not?

-performance would be a better place to ask, so I'm moving this there.

The general consensus is that your best bet CPU-wise is Opterons.

--
Jim C. Nasby, Sr. Engineering Consultant      [EMAIL PROTECTED]
Pervasive Software      http://pervasive.com     work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?
       http://www.postgresql.org/docs/faq
[PERFORM] count(*) performance
Hi,

I guess this is an age-old, hundred-times-answered question, but I didn't find the answer to it yet (neither in the FAQ nor in the mailing list archives).

Question: I have a table with 2.5M rows. count(*) on this table runs for 4 minutes. (dual Opteron, 4 GB RAM, db on a 4-disk RAID10 array (SATA, not SCSI)) Is this normal? How could I make it run faster? Maybe make it run faster the 2nd time? Which parameters should I change in postgresql.conf, and how?

--
Üdvözlettel,
Gábriel Ákos
-=E-Mail: [EMAIL PROTECTED] | Web: http://www.i-logic.hu =-
-=Tel/fax: +3612367353 | Mobil: +36209278894 =-
Re: [PERFORM] count(*) performance
On Mon, Mar 27, 2006 at 03:34:32PM +0200, Gábriel Ákos wrote:
> Hi,
>
> I guess this is an age-old 100times answered question, but I didn't find
> the answer to it yet (neither in the FAQ nor in the mailing list archives).
>
> Question: I have a table with 2.5M rows. count(*) on this table is
> running 4 minutes long. (dual opteron, 4gig ram, db on 4 disk raid10
> array (sata, not scsi)) Is this normal? How could I make it run faster?
> Maybe make it run faster for the 2nd time? Which parameters should I
> change in postgresql.conf and how?

First, count(*) on PostgreSQL tends to be slow because you can't do index covering[1]. But in this case, I'd bet money that if it's taking 4 minutes something else is wrong. Have you been vacuuming that table frequently enough? What does

    SELECT relpages FROM pg_class WHERE relname='tablename';

show?

[1] http://www.pervasive-postgres.com/lp/newsletters/2006/Insights_postgres_Feb.asp#5

--
Jim C. Nasby, Sr. Engineering Consultant      [EMAIL PROTECTED]
Pervasive Software      http://pervasive.com
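[Editorial note: the pg_class check Jim suggests can be taken one step further by comparing relpages against reltuples. A sketch - the table name is a placeholder for the table in question:]

```sql
-- Physical pages vs. live-tuple estimate for one table. A relpages value
-- far larger than what reltuples rows should occupy suggests dead-tuple
-- bloat that VACUUM (or larger FSM settings) would address.
SELECT relname,
       relpages,                          -- 8 KB pages on disk
       reltuples::bigint AS est_rows      -- planner's row estimate
FROM pg_class
WHERE relname = 'tablename';
```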
[PERFORM] Query parallelism
Hello everybody,

I use PostgreSQL 8.1.3 on a dual-processor Xeon, and I would like to know how to enable parallelism for the execution of queries. Indeed, when I analyse the use of the CPUs during a query, the result is that for some minutes one CPU is used while the other is not, and after that it is the contrary. So they are not used at the same time, and I would like to know what I have to do to make the CPUs work together.

Thanks, and sorry for my English,
Re: [PERFORM] Query parallelism
On Mon, Mar 27, 2006 at 04:25:25PM +0200, luchot wrote:
> Hello everybody,
> I use PostgreSQL 8.1.3 on a dual-processor Xeon and I would like to know
> how to enable parallelism for the execution of queries. Indeed, when I
> analyse the use of the CPUs during a query, the result is that for some
> minutes one CPU is used while the other is not, and after that it is the
> contrary. So they are not used at the same time, and I would like to know
> what I have to do to make the CPUs work together.

PostgreSQL has no support for intra-query parallelism at this time. Greenplum's MPP might do what you're looking for.

--
Jim C. Nasby, Sr. Engineering Consultant      [EMAIL PROTECTED]
Pervasive Software      http://pervasive.com
Re: [PERFORM] count(*) performance
Gabriel,

On 3/27/06 5:34 AM, "Gábriel Ákos" <[EMAIL PROTECTED]> wrote:
> Question: I have a table with 2.5M rows. count(*) on this table is
> running 4 minutes long. (dual opteron, 4gig ram, db on 4 disk raid10
> array (sata, not scsi)) Is this normal? How could I make it run faster?
> Maybe make it run faster for the 2nd time? Which parameters should I
> change in postgresql.conf and how?

Before changing anything in your Postgres configuration, you should check your hard drive array performance. All select count(*) does is a sequential scan of your data, and if the table is larger than memory, or if it's the first time you've scanned it, it is limited by your disk speed.

To test your disk speed, use the following commands and report the times here:

    time bash -c "dd if=/dev/zero of=bigfile bs=8k count=500000 && sync"
    time dd if=bigfile of=/dev/null bs=8k

If these are taking a long time, from another session watch the I/O rate with "vmstat 1" for a while and report that here.

- Luke
Re: [PERFORM] count(*) performance
Luke Lonergan wrote:
> To test your disk speed, use the following commands and report the times here:
>
> time bash -c "dd if=/dev/zero of=bigfile bs=8k count=500000 && sync"

[EMAIL PROTECTED]:/fast # time bash -c "dd if=/dev/zero of=bigfile bs=8k count=500000 && sync"
500000+0 records in
500000+0 records out
4096000000 bytes transferred in 45.469404 seconds (90082553 bytes/sec)

real    0m56.880s
user    0m0.112s
sys     0m18.937s

> time dd if=bigfile of=/dev/null bs=8k

[EMAIL PROTECTED]:/fast # time dd if=bigfile of=/dev/null bs=8k
500000+0 records in
500000+0 records out
4096000000 bytes transferred in 53.542147 seconds (76500481 bytes/sec)

real    0m53.544s
user    0m0.048s
sys     0m10.637s

I guess these values aren't that bad :)

--
Üdvözlettel,
Gábriel Ákos
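[Editorial note: the dd figures in this message arrived garbled; the byte count is assumed to be 4096000000 (500000 8 KB blocks), which is the only value consistent with the printed rate of 90082553 bytes/sec. A quick check of that assumption from the elapsed time dd reports:]

```shell
# dd reported the write test finishing in 45.469404 s; recompute the
# bytes/sec figure from an assumed 4096000000-byte transfer.
awk 'BEGIN { printf "%.0f\n", 4096000000 / 45.469404 }'
```

The result matches the 90082553 bytes/sec (about 90 MB/s) in the output above, which is indeed reasonable for a 4-disk SATA RAID10.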
Re: [PERFORM] count(*) performance
Jim C. Nasby wrote:
> But in this case, I'd bet money that if it's taking 4 minutes something
> else is wrong. Have you been vacuuming that table frequently enough?

That gave me an idea. I thought that autovacuum was doing it right, but I issued a vacuum full analyze verbose, and it worked all day. After that I tweaked the memory settings a bit too (more fsm_pages). Now:

staging=# SELECT count(*) from infx.infx_product;
  count
---------
 3284997
(1 row)

Time: 1301.049 ms

As I saw from the output, the database was compressed to 10% of its size :) This table gets quite big changes every 4 hours; let's see how it works. Maybe I'll have to issue full vacuums from cron regularly.

> What's SELECT relpages FROM pg_class WHERE relname='tablename' show?

This went to 10% as well; now it's around 156000 pages.

Regards,
Akos

--
Üdvözlettel,
Gábriel Ákos
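[Editorial note: when only an approximate figure is needed, the planner's statistics give an instant answer with no sequential scan at all. A sketch using the table from this thread; the estimate is only as fresh as the last VACUUM or ANALYZE:]

```sql
-- Approximate row count from catalog statistics; returns in milliseconds
-- regardless of table size, unlike count(*), which must scan every page.
SELECT reltuples::bigint AS approximate_rows
FROM pg_class
WHERE relname = 'infx_product';
```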
Re: [PERFORM] count(*) performance
Gabriel,

On 3/27/06 10:05 AM, "Gábriel Ákos" <[EMAIL PROTECTED]> wrote:
> That gave me an idea. I thought that autovacuum is doing it right, but I
> issued a vacuum full analyze verbose, and it worked all the day.
> After that I've tweaked memory settings a bit too (more fsm_pages)

Oops! I replied about your disk speed before I saw this. The only thing is - you probably don't want to do a "vacuum full"; rather, a simple "vacuum" should be enough.

- Luke
Re: [PERFORM] count(*) performance
Luke Lonergan wrote:
>> That gave me an idea. I thought that autovacuum is doing it right, but I
>> issued a vacuum full analyze verbose, and it worked all the day.
>> After that I've tweaked memory settings a bit too (more fsm_pages)
>
> Oops! I replied about your disk speed before I saw this. The only thing
> is - you probably don't want to do a "vacuum full", but rather a simple
> "vacuum" should be enough.

I thought that too. Autovacuum is running on our system, but it didn't do the trick. Anyway, the issue is solved; thank you all for helping. :)

--
Üdvözlettel,
Gábriel Ákos
[PERFORM] simple join uses indexes, very slow
[PostgreSQL 8.1.0 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 4.0.1]

I have a simple join on two tables that takes way too long. Can you help me understand what's wrong? There are indexes defined on the relevant columns. I just did a fresh vacuum --full --analyze on the two tables. Is there something I'm not seeing?

[CPU is 950MHz AMD, 256MB RAM, 15k rpm SCSI disk]

-- George Young

Table sizes: parameters has 2.1512e+07 tuples, run_opsets has 211745 tuples.

explain analyze SELECT ro.run, ro.opset_num, p.step_num FROM run_opsets ro, parameters p WHERE ro.run = p.run AND ro.opset_num = p.opset_num and ro.run='team9';

                                  QUERY PLAN
 Nested Loop  (cost=2.16..7957.40 rows=6707 width=22) (actual time=14.986..70197.129 rows=43050 loops=1)
   ->  Index Scan using run_opsets_pkey on run_opsets ro  (cost=0.00..128.75 rows=71 width=18) (actual time=0.386..62.959 rows=263 loops=1)
         Index Cond: (run = 'team9'::text)
   ->  Bitmap Heap Scan on parameters p  (cost=2.16..109.93 rows=27 width=22) (actual time=1.591..266.211 rows=164 loops=263)
         Recheck Cond: (('team9'::text = p.run) AND ("outer".opset_num = p.opset_num))
         ->  Bitmap Index Scan on parameters_idx  (cost=0.00..2.16 rows=27 width=0) (actual time=1.153..1.153 rows=164 loops=263)
               Index Cond: (('team9'::text = p.run) AND ("outer".opset_num = p.opset_num))
 Total runtime: 70237.727 ms
(8 rows)

             Table "public.run_opsets"
    Column    |            Type             |        Modifiers
--------------+-----------------------------+-------------------------
 run          | text                        | not null
 opset        | text                        |
 opset_ver    | integer                     |
 opset_num    | integer                     | not null
 status       | opset_status                |
 date_started | timestamp without time zone |
 date_done    | timestamp without time zone |
 work_started | timestamp without time zone |
 lock_user    | text                        | default 'NO-USER'::text
 lock_pid     | integer                     |
 needs_review | text                        |
Indexes:
    "run_opsets_pkey" PRIMARY KEY, btree (run, opset_num) CLUSTER

             Table "public.parameters"
  Column   |   Type    |           Modifiers
-----------+-----------+-------------------------------
 run       | text      | not null
 opset_num | integer   | not null
 opset     | text      | not null
 opset_ver | integer   | not null
 step_num  | integer   | not null
 step      | text      | not null
 step_ver  | integer   | not null
 name      | text      | not null
 value     | text      |
 split     | boolean   | not null default false
 wafers    | text[]    | not null default '{}'::text[]
Indexes:
    "parameters_idx" btree (run, opset_num, step_num, opset, opset_ver, step, step_ver, name, split, wafers)
    "parameters_opset_idx" btree (opset, step, name)
    "parameters_step_idx" btree (step, name)

"Are the gods not just?"
"Oh no, child. What would become of us if they were?" (CSL)
Re: [PERFORM] count(*) performance
Does that mean that even though autovacuum is turned on, you still should do a regular vacuum analyze periodically?

Thanks,

Brendan Duddridge | CTO | 403-277-5591 x24 | [EMAIL PROTECTED]
ClickSpace Interactive Inc.
Suite L100, 239 - 10th Ave. SE
Calgary, AB T2G 0V9
http://www.clickspace.com

On Mar 27, 2006, at 11:14 AM, Luke Lonergan wrote:
> Oops! I replied about your disk speed before I saw this. The only thing
> is - you probably don't want to do a "vacuum full", but rather a simple
> "vacuum" should be enough.
>
> - Luke
Re: [PERFORM] count(*) performance
Gábriel Ákos wrote:
> I thought that too. Autovacuum is running on our system but it didn't
> do the trick. Anyway the issue is solved, thank you all for helping. :)

Yeah, it would be nice if autovacuum had some way of raising a flag to the admin that, given current settings (thresholds, FSM etc...), it's not keeping up with the activity. I don't know how to do this, but I hope someone else has some good ideas.

Matt
Re: [PERFORM] count(*) performance
Brendan Duddridge wrote:
> Does that mean that even though autovacuum is turned on, you still
> should do a regular vacuum analyze periodically?

No, it probably means you have set FSM settings too low, or not tuned the autovacuum parameters to your specific situation.

A bug in the autovacuum daemon is not unexpected however, so if it doesn't work after tuning, let us know.

--
Alvaro Herrera      http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
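[Editorial note: the FSM settings Alvaro refers to live in postgresql.conf. The values below are illustrative only - the right numbers depend on database size and update rate, and a database-wide VACUUM VERBOSE reports the number of page slots actually needed at the end of its output:]

```
# postgresql.conf - free space map sizing (example values, not a recommendation)
max_fsm_pages = 200000      # must cover all pages with reclaimable free space
max_fsm_relations = 1000    # number of tables and indexes tracked in the FSM
```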
Re: [PERFORM] count(*) performance
On 27.03.2006, at 21:20 Uhr, Brendan Duddridge wrote:
> Does that mean that even though autovacuum is turned on, you still
> should do a regular vacuum analyze periodically?

It seems that there are situations where autovacuum does not do a really good job. However, in our application I made a stupid design decision which I want to change as soon as possible. I have a "visit count" column in one of the very large tables, so updates are VERY regular. I've just checked and saw that autovacuum does a great job with that.

Nevertheless, I have set up a cron job to do a standard vacuum every month. I've used vacuum full only once, after I did a bulk update of about 200,000 rows ...

cug

--
PharmaLine, Essen, GERMANY
Software and Database Development
[PERFORM] Large Binary Objects Middleware
Hello to all on the list.

I have developed a product that sits between the database and an application and handles the storage of large binary data. The system is fast, but I feel bad thinking that I have completely reinvented the wheel in this case. You see, the engine just stores the large data in "containers" that live directly on the filesystem instead of going into the database (since some of this list's members told me it would make the database really slow to store the data directly). So now I have a huge dilemma: continue this reinvention, or use large objects directly. The database is holding large amounts of digital video, and I want to put these directly into the database. What performance guidelines would you all give, seeing my position today?

Thanks for all your speed-up tips,
Rodrigo
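[Editorial note: for comparison with the filesystem-container approach, PostgreSQL's built-in large objects allow streamed access via the lo_* API rather than loading a whole video into memory. A minimal sketch; the table layout and file path are hypothetical, and server-side lo_import requires the file to be readable by the server process:]

```sql
-- A table referencing large objects by OID instead of storing bytes inline.
CREATE TABLE video (
    id          serial PRIMARY KEY,
    title       text NOT NULL,
    content_oid oid  NOT NULL    -- handle into pg_largeobject
);

-- Server-side import of a file (hypothetical path, on the *server* host):
INSERT INTO video (title, content_oid)
VALUES ('clip-001', lo_import('/var/media/clip-001.mpg'));
```

On the client side, the libpq lo_open/lo_read functions then read the object in chunks, which is what makes this workable for video-sized data.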
Re: [PERFORM] count(*) performance
This is where "last_vacuumed" (and "last_analyzed") columns in pg_statistic(?) would come in handy. Each time vacuum or analyze finishes, update the row for the specific table that was vacuumed/analyzed with a timestamp in the last_vacuumed/last_analyzed column. No more guessing "maybe I haven't vacuumed/analyzed in a while", and each time a user complains about bad performance, one could ask the user to run a "select s.last_vacuumed, s.last_analyzed from pg_statistic s, pg_attribute a, pg_class c where ..."

It SOUNDS easy to implement, but that has fooled me before... :-)

- Mikael

-----Original Message-----
From: Guido Neitzer
Sent: den 27 mars 2006 21:44
To: Brendan Duddridge
Cc: Postgresql Performance
Subject: Re: [PERFORM] count(*) performance
Re: [PERFORM] count(*) performance
Mikael Carneholm wrote:
> This is where a "last_vacuumed" (and "last_analyzed") column in
> pg_statistic(?) would come in handy. [...]
>
> It SOUNDS easy to implement, but that has fooled me before... :-)

It is fairly easy to implement, however it has been discussed before and decided that it wasn't necessary. What the system cares about is how long it's been since the last vacuum in terms of XIDs, not time. Storing a timestamp would make it more human readable, but I'm not sure the powers that be want to add two new columns to some system table to accommodate this.

Matt
Re: [PERFORM] count(*) performance
"Matthew T. O'Connor" writes:
> It is fairly easy to implement, however it has been discussed before and
> decided that it wasn't necessary. What the system cares about is how
> long it's been since the last vacuum in terms of XIDs, not time.

I think Alvaro is intending to do the latter (store per-table vacuum xid info) for 8.2.

regards, tom lane
Re: [PERFORM] count(*) performance
I think it is definitely necessary from an administration point of view - as an administrator, I want to know:

1) Are there any stats (at all) in a schema?
2) Are there any stats on the table that slow_query_foo is targeting?
3) If I have stats, how recent are they?
4) Could it be that there are a lot of dead tuples lying around (given the amount of traffic I know I have)?

These would be (are always!) the first questions I ask myself when I'm about to identify performance problems in an app; don't know how other people do it though :) Maybe something I'll try to look into this weekend, if I can spare some time.

- Mikael

-----Original Message-----
From: Matthew T. O'Connor
Sent: den 28 mars 2006 00:43
To: Mikael Carneholm
Cc: Postgresql Performance
Subject: Re: [PERFORM] count(*) performance
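[Editorial note: if per-table timestamps of the kind Mikael proposes existed, his check might look roughly like this. The sketch is entirely hypothetical - the catalog and column names below are modeled on his proposal, not on any existing PostgreSQL system table:]

```sql
-- Hypothetical: list tables that have gone longest without maintenance,
-- assuming a per-table stats row carrying last_vacuumed / last_analyzed.
SELECT c.relname, s.last_vacuumed, s.last_analyzed
FROM pg_class c
JOIN vacuum_activity s ON s.relid = c.oid    -- hypothetical catalog
WHERE c.relkind = 'r'
ORDER BY s.last_vacuumed;
```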