Specs:
7.3.6 machine Dual Athlon MP 2 GIG of ram, 4 Drive IDE (3ware) RAID 10 OS FC1 with 2.4 kernel
7.4.6 machine Dual Opteron MP (64bit PostgreSQL), 2 Gig of Ram 10 Drive RAID 10 with 128 Meg battery backed cache (3WARE). OS FC3 x86_64 with 2.6 kernel
Essentials parameters:
7.3.6:
shared_buffers = 8192
wal_buffers = 2048
sort_mem = 4096 checkpoint_segments = 25
effective_cache_size = 65536
random_page_cost = 1.5
statistics_target = 150
7.4.6 Same as above except 8192 sort mem and 50 checkpoint segments
Both are running fsync with open_sync
Both have been vacuumed and analyze repeatedly while trying to figure this out.
Explain Analyzes:
7.3.6 (old)
Aggregate (cost=320.49..324.89 rows=7 width=338) (actual time=630.21..630.21 rows=1 loops=1) -> Group (cost=320.49..324.71 rows=70 width=338) (actual time=447.98..623.91 rows=8845 loops=1) -> Sort (cost=320.49..320.67 rows=70 width=338) (actual time=447.95..460.77 rows=8845 loops=1) Sort Key: p.post_id, t.topic_id, t.topic_title, t.topic_status, t.topic_replies, t.topic_time, t.topic_type, t.topic_vote, t.topic_last_post_id, f.forum_name, f.forum_status, f.forum_id, f.auth_view, f.auth_read, f.auth_post, f.auth_reply, f.auth_edit, f.auth_delete, f.auth_sticky, f.auth_announce, f.auth_pollcreate, f.auth_vote, f.auth_attachments -> Nested Loop (cost=0.00..318.33 rows=70 width=338) (actual time=0.41..109.19 rows=8845 loops=1) -> Nested Loop (cost=0.00..17.55 rows=1 width=330) (actual time=0.11..0.34 rows=1 loops=1) -> Nested Loop (cost=0.00..6.75 rows=1 width=291) (actual time=0.08..0.12 rows=1 loops=1) -> Index Scan using nuke_bbposts_pkey on nuke_bbposts p (cost=0.00..3.21 rows=1 width=8) (actual time=0.04..0.05 rows=1 loops=1) Index Cond: (post_id = 352888) -> Index Scan using nuke_bbtopics_pkey on nuke_bbtopics t (cost=0.00..3.53 rows=1 width=283) (actual time=0.02..0.05 rows=1 loops=1) Index Cond: (t.topic_id = "outer".topic_id) -> Index Scan using nuke_bbforums_pkey on nuke_bbforums f (cost=0.00..10.78 rows=1 width=39) (actual time=0.02..0.21 rows=1 loops=1) Index Cond: (f.forum_id = "outer".forum_id) -> Index Scan using topic_id_nuke_bbposts_index on nuke_bbposts p2 (cost=0.00..299.33 rows=117 width=8) (actual time=0.29..31.68 rows=8845 loops=1) Index Cond: (p2.topic_id = "outer".topic_id) Filter: (post_id <= 352888) Total runtime: 633.72 msec (17 rows)
7.4.6 (new)
GroupAggregate (cost=209.11..213.73 rows=71 width=328) (actual time=3701.837..3701.837 rows=1 loops=1) -> Sort (cost=209.11..209.29 rows=71 width=328) (actual time=2725.518..2728.590 rows=8845 loops=1) Sort Key: p.post_id, t.topic_id, t.topic_title, t.topic_status, t.topic_replies, t.topic_time, t.topic_type, t.topic_vote, t.topic_last_post_id, f.forum_name, f.forum_status, f.forum_id, f.auth_view, f.auth_read, f.auth_post, f.auth_reply, f.auth_edit, f.auth_delete, f.auth_sticky, f.auth_announce, f.auth_pollcreate, f.auth_vote, f.auth_attachments -> Nested Loop (cost=0.00..206.93 rows=71 width=328) (actual time=0.621..190.112 rows=8845 loops=1) -> Nested Loop (cost=0.00..9.04 rows=1 width=328) (actual time=0.347..0.365 rows=1 loops=1) -> Nested Loop (cost=0.00..6.04 rows=1 width=291) (actual time=0.298..0.307 rows=1 loops=1) -> Index Scan using nuke_bbposts_pkey on nuke_bbposts p (cost=0.00..3.01 rows=1 width=8) (actual time=0.209..0.212 rows=1 loops=1) Index Cond: (post_id = 352888) -> Index Scan using nuke_bbtopics_pkey on nuke_bbtopics t (cost=0.00..3.01 rows=1 width=283) (actual time=0.069..0.073 rows=1 loops=1) Index Cond: (t.topic_id = "outer".topic_id) -> Index Scan using nuke_bbforums_pkey on nuke_bbforums f (cost=0.00..2.99 rows=1 width=39) (actual time=0.028..0.035 rows=1 loops=1) Index Cond: (f.forum_id = "outer".forum_id) -> Index Scan using topic_id_nuke_bbposts_index on nuke_bbposts p2 (cost=0.00..196.46 rows=114 width=8) (actual time=0.256..95.501 rows=8845 loops=1) Index Cond: (p2.topic_id = "outer".topic_id) Filter: (post_id <= 352888) Total runtime: 3728.376 ms (16 rows)
If you look at the second line in each explain it is the sort that is causing the grief. On 7.3.6 it only takes say 447ms (on an completely unused machine), on the Opteron it takes 2725.518.
The query on the opteron even after a fresh restart of apache and PostgreSQL takes at least 1100 ms.
Other 7.4.6 information:
[EMAIL PROTECTED] contrib]# mpstat Linux 2.6.10-1.770_FC3smp (www.radioparadise.com) 03/19/2005
10:05:59 AM CPU %user %nice %system %iowait %irq %soft %idle intr/s
10:05:59 AM all 4.48 0.00 0.79 2.96 0.01 0.04 91.72 1112.02
[EMAIL PROTECTED] contrib]# vmstat
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
r b swpd free buff cache si so bi bo in cs us sy id wa
1 0 3560 230524 31196 1599572 0 0 10 189 3 3 4 1 92 3
[EMAIL PROTECTED] contrib]# iostat Linux 2.6.10-1.770_FC3smp (www.radioparadise.com) 03/19/2005
avg-cpu: %user %nice %sys %iowait %idle 4.48 0.00 0.84 2.96 91.72
Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn sda 29.72 38.28 757.12 49022777 969575882
The database is identical in the sense of schema (direct dump from 7.3.6 to 7.4.6).
Both were initalized with initdb --no-locale .
Any ideas?
Sincerely,
Joshua D. Drake
-- Command Prompt, Inc., home of Mammoth PostgreSQL - S/ODBC and S/JDBC Postgresql support, programming shared hosting and dedicated hosting. +1-503-667-4564 - [EMAIL PROTECTED] - http://www.commandprompt.com PostgreSQL Replicator -- production quality replication for PostgreSQL
begin:vcard fn:Joshua Drake n:Drake;Joshua org:Command Prompt, Inc. adr:;;PO Box 215 ;Cascade Locks;OR;97014;US email;internet:[EMAIL PROTECTED] title:Consultant tel;work:503-667-4564 tel;fax:503-210-0334 x-mozilla-html:FALSE url:http://www.commandprompt.com version:2.1 end:vcard
---------------------------(end of broadcast)--------------------------- TIP 8: explain analyze is your friend