Re: [PERFORM] [GENERAL] experiences needed - how does Itanium2/1.5GHz(4MB) compare to AMD and Intel CPUs as far as Postgresql is concerned
On Sat, Mar 25, 2006 at 08:11:22PM +0100, Tomaz Borstnar wrote:
> Hello!
>
> First tried some searching around, but did not find anything useful
> so I gave up and decided to ask here... I am wondering how do a pair of
> 1.5GHz Itanium2 (4MB cache) stack up against a pair of AMD or Intel server
> CPUs as far as postgresql performance is concerned? Is it worthy or not?

-performance would be a better place to ask, so I'm moving this there.

The general consensus is that your best bet CPU-wise is Opterons.

--
Jim C. Nasby, Sr. Engineering Consultant      [EMAIL PROTECTED]
Pervasive Software      http://pervasive.com     work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?
       http://www.postgresql.org/docs/faq
[PERFORM] count(*) performance
Hi,

I guess this is an age-old, hundred-times-answered question, but I didn't find the answer to it yet (neither in the FAQ nor in the mailing list archives).

Question: I have a table with 2.5M rows. count(*) on this table runs for 4 minutes. (dual Opteron, 4 GB RAM, db on a 4-disk RAID10 array (SATA, not SCSI)) Is this normal? How could I make it run faster? Maybe make it run faster the 2nd time? Which parameters should I change in postgresql.conf, and how?

--
Üdvözlettel,
Gábriel Ákos
-=E-Mail: [EMAIL PROTECTED] | Web: http://www.i-logic.hu =-
-=Tel/fax: +3612367353 | Mobil: +36209278894 =-
Re: [PERFORM] count(*) performance
On Mon, Mar 27, 2006 at 03:34:32PM +0200, Gábriel Ákos wrote:
> Hi,
>
> I guess this is an age-old 100times answered question, but I didn't find
> the answer to it yet (neither in the FAQ nor in the mailing list archives).
>
> Question: I have a table with 2.5M rows. count(*) on this table is
> running 4 minutes long. (dual opteron, 4gig ram, db on 4 disk raid10
> array (sata, not scsi)) Is this normal? How could I make it run faster?
> Maybe make it run faster for the 2nd time? Which parameters should I
> change in postgresql.conf and how?

First, count(*) on PostgreSQL tends to be slow because you can't do index covering[1]. But in this case, I'd bet money that if it's taking 4 minutes something else is wrong. Have you been vacuuming that table frequently enough? What does

    SELECT relpages FROM pg_class WHERE relname='tablename';

show?

[1] http://www.pervasive-postgres.com/lp/newsletters/2006/Insights_postgres_Feb.asp#5

--
Jim C. Nasby, Sr. Engineering Consultant      [EMAIL PROTECTED]
Pervasive Software      http://pervasive.com
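[Editorial note: the pg_class check Jim suggests can be taken one step further by comparing relpages against reltuples. A sketch - the table name is a placeholder for the table in question:]

```sql
-- Physical pages vs. live-tuple estimate for one table. A relpages value
-- far larger than what reltuples rows should occupy suggests dead-tuple
-- bloat that VACUUM (or larger FSM settings) would address.
SELECT relname,
       relpages,                          -- 8 KB pages on disk
       reltuples::bigint AS est_rows      -- planner's row estimate
FROM pg_class
WHERE relname = 'tablename';
```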
[PERFORM] Query parallelism
Hello everybody,

I use PostgreSQL 8.1.3 on a dual-processor Xeon, and I would like to know how to enable parallelism for the execution of queries. Indeed, when I analyse the use of the CPUs during a query, the result is that for some minutes one CPU is used while the other is not, and after that it is the contrary. So they are not used at the same time, and I would like to know what I have to do to make the CPUs work together.

Thanks, and sorry for my English,
Re: [PERFORM] Query parallelism
On Mon, Mar 27, 2006 at 04:25:25PM +0200, luchot wrote:
> Hello everybody,
> I use PostgreSQL 8.1.3 on a dual-processor Xeon and I would like to know
> how to enable parallelism for the execution of queries. Indeed, when I
> analyse the use of the CPUs during a query, the result is that for some
> minutes one CPU is used while the other is not, and after that it is the
> contrary. So they are not used at the same time, and I would like to know
> what I have to do to make the CPUs work together.

PostgreSQL has no support for intra-query parallelism at this time. Greenplum's MPP might do what you're looking for.

--
Jim C. Nasby, Sr. Engineering Consultant      [EMAIL PROTECTED]
Pervasive Software      http://pervasive.com
Re: [PERFORM] count(*) performance
Gabriel,

On 3/27/06 5:34 AM, "Gábriel Ákos" <[EMAIL PROTECTED]> wrote:
> Question: I have a table with 2.5M rows. count(*) on this table is
> running 4 minutes long. (dual opteron, 4gig ram, db on 4 disk raid10
> array (sata, not scsi)) Is this normal? How could I make it run faster?
> Maybe make it run faster for the 2nd time? Which parameters should I
> change in postgresql.conf and how?

Before changing anything in your Postgres configuration, you should check your hard drive array performance. All select count(*) does is a sequential scan of your data, and if the table is larger than memory, or if it's the first time you've scanned it, it is limited by your disk speed.

To test your disk speed, use the following commands and report the times here:

    time bash -c "dd if=/dev/zero of=bigfile bs=8k count=500000 && sync"
    time dd if=bigfile of=/dev/null bs=8k

If these are taking a long time, from another session watch the I/O rate with "vmstat 1" for a while and report that here.

- Luke
Re: [PERFORM] count(*) performance
Luke Lonergan wrote:
> To test your disk speed, use the following commands and report the times here:
>
> time bash -c "dd if=/dev/zero of=bigfile bs=8k count=500000 && sync"

[EMAIL PROTECTED]:/fast # time bash -c "dd if=/dev/zero of=bigfile bs=8k count=500000 && sync"
500000+0 records in
500000+0 records out
4096000000 bytes transferred in 45.469404 seconds (90082553 bytes/sec)

real    0m56.880s
user    0m0.112s
sys     0m18.937s

> time dd if=bigfile of=/dev/null bs=8k

[EMAIL PROTECTED]:/fast # time dd if=bigfile of=/dev/null bs=8k
500000+0 records in
500000+0 records out
4096000000 bytes transferred in 53.542147 seconds (76500481 bytes/sec)

real    0m53.544s
user    0m0.048s
sys     0m10.637s

I guess these values aren't that bad :)

--
Üdvözlettel,
Gábriel Ákos
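[Editorial note: the dd figures in this message arrived garbled; the byte count is assumed to be 4096000000 (500000 8 KB blocks), which is the only value consistent with the printed rate of 90082553 bytes/sec. A quick check of that assumption from the elapsed time dd reports:]

```shell
# dd reported the write test finishing in 45.469404 s; recompute the
# bytes/sec figure from an assumed 4096000000-byte transfer.
awk 'BEGIN { printf "%.0f\n", 4096000000 / 45.469404 }'
```

The result matches the 90082553 bytes/sec (about 90 MB/s) in the output above, which is indeed reasonable for a 4-disk SATA RAID10.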
Re: [PERFORM] count(*) performance
Jim C. Nasby wrote:
> But in this case, I'd bet money that if it's taking 4 minutes something
> else is wrong. Have you been vacuuming that table frequently enough?

That gave me an idea. I thought that autovacuum was doing it right, but I issued a vacuum full analyze verbose, and it worked all day. After that I tweaked the memory settings a bit too (more fsm_pages). Now:

staging=# SELECT count(*) from infx.infx_product;
  count
---------
 3284997
(1 row)

Time: 1301.049 ms

As I saw from the output, the database was compressed to 10% of its size :) This table gets quite big changes every 4 hours; let's see how it works. Maybe I'll have to issue full vacuums from cron regularly.

> What's SELECT relpages FROM pg_class WHERE relname='tablename' show?

This went to 10% as well; now it's around 156000 pages.

Regards,
Akos

--
Üdvözlettel,
Gábriel Ákos
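[Editorial note: when only an approximate figure is needed, the planner's statistics give an instant answer with no sequential scan at all. A sketch using the table from this thread; the estimate is only as fresh as the last VACUUM or ANALYZE:]

```sql
-- Approximate row count from catalog statistics; returns in milliseconds
-- regardless of table size, unlike count(*), which must scan every page.
SELECT reltuples::bigint AS approximate_rows
FROM pg_class
WHERE relname = 'infx_product';
```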
Re: [PERFORM] count(*) performance
Gabriel,

On 3/27/06 10:05 AM, "Gábriel Ákos" <[EMAIL PROTECTED]> wrote:
> That gave me an idea. I thought that autovacuum is doing it right, but I
> issued a vacuum full analyze verbose, and it worked all the day.
> After that I've tweaked memory settings a bit too (more fsm_pages)

Oops! I replied about your disk speed before I saw this. The only thing is - you probably don't want to do a "vacuum full"; rather, a simple "vacuum" should be enough.

- Luke
Re: [PERFORM] count(*) performance
Luke Lonergan wrote:
>> That gave me an idea. I thought that autovacuum is doing it right, but I
>> issued a vacuum full analyze verbose, and it worked all the day.
>> After that I've tweaked memory settings a bit too (more fsm_pages)
>
> Oops! I replied about your disk speed before I saw this. The only thing
> is - you probably don't want to do a "vacuum full", but rather a simple
> "vacuum" should be enough.

I thought that too. Autovacuum is running on our system, but it didn't do the trick. Anyway, the issue is solved; thank you all for helping. :)

--
Üdvözlettel,
Gábriel Ákos
[PERFORM] simple join uses indexes, very slow
[PostgreSQL 8.1.0 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 4.0.1]

I have a simple join on two tables that takes way too long. Can you help me understand what's wrong? There are indexes defined on the relevant columns. I just did a fresh vacuum --full --analyze on the two tables. Is there something I'm not seeing?

[CPU is 950MHz AMD, 256MB RAM, 15k rpm SCSI disk]

-- George Young

Table sizes: parameters has 2.1512e+07 tuples, run_opsets has 211745 tuples.

explain analyze SELECT ro.run, ro.opset_num, p.step_num FROM run_opsets ro, parameters p WHERE ro.run = p.run AND ro.opset_num = p.opset_num and ro.run='team9';

                                  QUERY PLAN
 Nested Loop  (cost=2.16..7957.40 rows=6707 width=22) (actual time=14.986..70197.129 rows=43050 loops=1)
   ->  Index Scan using run_opsets_pkey on run_opsets ro  (cost=0.00..128.75 rows=71 width=18) (actual time=0.386..62.959 rows=263 loops=1)
         Index Cond: (run = 'team9'::text)
   ->  Bitmap Heap Scan on parameters p  (cost=2.16..109.93 rows=27 width=22) (actual time=1.591..266.211 rows=164 loops=263)
         Recheck Cond: (('team9'::text = p.run) AND ("outer".opset_num = p.opset_num))
         ->  Bitmap Index Scan on parameters_idx  (cost=0.00..2.16 rows=27 width=0) (actual time=1.153..1.153 rows=164 loops=263)
               Index Cond: (('team9'::text = p.run) AND ("outer".opset_num = p.opset_num))
 Total runtime: 70237.727 ms
(8 rows)

             Table "public.run_opsets"
    Column    |            Type             |        Modifiers
--------------+-----------------------------+-------------------------
 run          | text                        | not null
 opset        | text                        |
 opset_ver    | integer                     |
 opset_num    | integer                     | not null
 status       | opset_status                |
 date_started | timestamp without time zone |
 date_done    | timestamp without time zone |
 work_started | timestamp without time zone |
 lock_user    | text                        | default 'NO-USER'::text
 lock_pid     | integer                     |
 needs_review | text                        |
Indexes:
    "run_opsets_pkey" PRIMARY KEY, btree (run, opset_num) CLUSTER

             Table "public.parameters"
  Column   |   Type    |           Modifiers
-----------+-----------+-------------------------------
 run       | text      | not null
 opset_num | integer   | not null
 opset     | text      | not null
 opset_ver | integer   | not null
 step_num  | integer   | not null
 step      | text      | not null
 step_ver  | integer   | not null
 name      | text      | not null
 value     | text      |
 split     | boolean   | not null default false
 wafers    | text[]    | not null default '{}'::text[]
Indexes:
    "parameters_idx" btree (run, opset_num, step_num, opset, opset_ver, step, step_ver, name, split, wafers)
    "parameters_opset_idx" btree (opset, step, name)
    "parameters_step_idx" btree (step, name)

"Are the gods not just?"
"Oh no, child. What would become of us if they were?" (CSL)
Re: [PERFORM] count(*) performance
Does that mean that even though autovacuum is turned on, you still should do a regular vacuum analyze periodically?

Thanks,

Brendan Duddridge | CTO | 403-277-5591 x24 | [EMAIL PROTECTED]
ClickSpace Interactive Inc.
Suite L100, 239 - 10th Ave. SE
Calgary, AB T2G 0V9
http://www.clickspace.com

On Mar 27, 2006, at 11:14 AM, Luke Lonergan wrote:
> Oops! I replied about your disk speed before I saw this. The only thing
> is - you probably don't want to do a "vacuum full", but rather a simple
> "vacuum" should be enough.
>
> - Luke
Re: [PERFORM] count(*) performance
Gábriel Ákos wrote:
> I thought that too. Autovacuum is running on our system but it didn't
> do the trick. Anyway the issue is solved, thank you all for helping. :)

Yeah, it would be nice if autovacuum had some way of raising a flag to the admin that, given current settings (thresholds, FSM etc...), it's not keeping up with the activity. I don't know how to do this, but I hope someone else has some good ideas.

Matt
Re: [PERFORM] count(*) performance
Brendan Duddridge wrote:
> Does that mean that even though autovacuum is turned on, you still
> should do a regular vacuum analyze periodically?

No, it probably means you have set FSM settings too low, or not tuned the autovacuum parameters to your specific situation.

A bug in the autovacuum daemon is not unexpected however, so if it doesn't work after tuning, let us know.

--
Alvaro Herrera      http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
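[Editorial note: the FSM settings Alvaro refers to live in postgresql.conf. The values below are illustrative only - the right numbers depend on database size and update rate, and a database-wide VACUUM VERBOSE reports the number of page slots actually needed at the end of its output:]

```
# postgresql.conf - free space map sizing (example values, not a recommendation)
max_fsm_pages = 200000      # must cover all pages with reclaimable free space
max_fsm_relations = 1000    # number of tables and indexes tracked in the FSM
```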
Re: [PERFORM] count(*) performance
On 27.03.2006, at 21:20 Uhr, Brendan Duddridge wrote:
> Does that mean that even though autovacuum is turned on, you still
> should do a regular vacuum analyze periodically?

It seems that there are situations where autovacuum does not do a really good job. However, in our application I made a stupid design decision which I want to change as soon as possible. I have a "visit count" column in one of the very large tables, so updates are VERY regular. I've just checked and saw that autovacuum does a great job with that.

Nevertheless, I have set up a cron job to do a standard vacuum every month. I've used vacuum full only once, after I did a bulk update of about 200,000 rows ...

cug

--
PharmaLine, Essen, GERMANY
Software and Database Development
[PERFORM] Large Binary Objects Middleware
Hello to all on the list.

I have developed a product that sits between the database and an application and handles the storage of large binary data. The system is fast, but I feel bad thinking that I have completely reinvented the wheel in this case. You see, the engine just stores the large data in "containers" that live directly on the filesystem instead of going into the database (since some of this list's members told me it would make the database really slow to store the data directly). So now I have a huge dilemma: continue this reinvention, or use large objects directly. The database is holding large amounts of digital video, and I want to put these directly into the database. What performance guidelines would you all give, seeing my position today?

Thanks for all your speed-up tips,
Rodrigo
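[Editorial note: for comparison with the filesystem-container approach, PostgreSQL's built-in large objects allow streamed access via the lo_* API rather than loading a whole video into memory. A minimal sketch; the table layout and file path are hypothetical, and server-side lo_import requires the file to be readable by the server process:]

```sql
-- A table referencing large objects by OID instead of storing bytes inline.
CREATE TABLE video (
    id          serial PRIMARY KEY,
    title       text NOT NULL,
    content_oid oid  NOT NULL    -- handle into pg_largeobject
);

-- Server-side import of a file (hypothetical path, on the *server* host):
INSERT INTO video (title, content_oid)
VALUES ('clip-001', lo_import('/var/media/clip-001.mpg'));
```

On the client side, the libpq lo_open/lo_read functions then read the object in chunks, which is what makes this workable for video-sized data.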
Re: [PERFORM] count(*) performance
This is where "last_vacuumed" (and "last_analyzed") columns in pg_statistic(?) would come in handy. Each time vacuum or analyze finishes, update the row for the specific table that was vacuumed/analyzed with a timestamp in the last_vacuumed/last_analyzed column. No more guessing "maybe I haven't vacuumed/analyzed in a while", and each time a user complains about bad performance, one could ask the user to run a "select s.last_vacuumed, s.last_analyzed from pg_statistic s, pg_attribute a, pg_class c where ..."

It SOUNDS easy to implement, but that has fooled me before... :-)

- Mikael

-----Original Message-----
From: Guido Neitzer
Sent: den 27 mars 2006 21:44
To: Brendan Duddridge
Cc: Postgresql Performance
Subject: Re: [PERFORM] count(*) performance
Re: [PERFORM] count(*) performance
Mikael Carneholm wrote:
> This is where a "last_vacuumed" (and "last_analyzed") column in
> pg_statistic(?) would come in handy. [...]
>
> It SOUNDS easy to implement, but that has fooled me before... :-)

It is fairly easy to implement, however it has been discussed before and decided that it wasn't necessary. What the system cares about is how long it's been since the last vacuum in terms of XIDs, not time. Storing a timestamp would make it more human readable, but I'm not sure the powers that be want to add two new columns to some system table to accommodate this.

Matt
Re: [PERFORM] count(*) performance
"Matthew T. O'Connor" writes:
> It is fairly easy to implement, however it has been discussed before and
> decided that it wasn't necessary. What the system cares about is how
> long it's been since the last vacuum in terms of XIDs, not time.

I think Alvaro is intending to do the latter (store per-table vacuum xid info) for 8.2.

regards, tom lane
Re: [PERFORM] count(*) performance
I think it is definitely necessary from an administration point of view - as an administrator, I want to know:

1) Are there any stats (at all) in a schema?
2) Are there any stats on the table that slow_query_foo is targeting?
3) If I have stats, how recent are they?
4) Could it be that there are a lot of dead tuples lying around (given the amount of traffic I know I have)?

These would be (are always!) the first questions I ask myself when I'm about to identify performance problems in an app; don't know how other people do it though :) Maybe something I'll try to look into this weekend, if I can spare some time.

- Mikael

-----Original Message-----
From: Matthew T. O'Connor
Sent: den 28 mars 2006 00:43
To: Mikael Carneholm
Cc: Postgresql Performance
Subject: Re: [PERFORM] count(*) performance
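[Editorial note: if per-table timestamps of the kind Mikael proposes existed, his check might look roughly like this. The sketch is entirely hypothetical - the catalog and column names below are modeled on his proposal, not on any existing PostgreSQL system table:]

```sql
-- Hypothetical: list tables that have gone longest without maintenance,
-- assuming a per-table stats row carrying last_vacuumed / last_analyzed.
SELECT c.relname, s.last_vacuumed, s.last_analyzed
FROM pg_class c
JOIN vacuum_activity s ON s.relid = c.oid    -- hypothetical catalog
WHERE c.relkind = 'r'
ORDER BY s.last_vacuumed;
```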