Re: [PERFORM] Need suggestions on kernel settings for dedicated FreeBSD/Postgresql machine

2009-08-09 Thread Culley Harrelson
I will definitely look into this.  I suspect I need to tune my kernel
settings first though...

culley

On Sat, Aug 8, 2009 at 8:40 PM, Robert Haas wrote:
> On Fri, Aug 7, 2009 at 5:24 PM, Culley Harrelson wrote:
>> Hi Everyone,
>>
>> I manage a FreeBSD server that is dedicated to postgresql.  The
>> machine has 4 gigs of ram and there is a single database powering a
>> web application that is hosted on a neighboring machine.  The web
>> application is mostly reading the database but there are considerable
>> writes and I don't want to tune the machine exclusively for writes.  I
>> realize more information would be needed to optimally tune the machine
>> but I am seeking advice on making some sane kernel settings for a
>> general purpose database on a dedicated system.  Currently I have:
>>
>> $ cat /etc/sysctl.conf
>>
>> kern.ipc.shmmax=268435456
>> kern.ipc.shmall=65536
>>
>> and
>>
>> $ cat /boot/loader.conf
>> kern.ipc.semmni="256"
>> kern.ipc.semmns="512"
>> kern.ipc.semmnu="256"
>>
>> In postgresql.conf I have:
>>
>> max_connections = 180
>> shared_buffers = 28MB
>>
>> I would like to increase this to 256 connections and make sure the
>> kernel settings are giving postgresql enough breathing room.
>> I suspect my settings are conservative and since the machine is
>> dedicated to postgresql I would like to give it more resources if they
>> could be used.  Any suggestions?
>
> This might be worth a look, for starters.
>
> http://pgfoundry.org/projects/pgtune/
>
> ...Robert
>
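
As a rough sketch of the arithmetic (formulas per the PostgreSQL
kernel-resources documentation; the values below are assumptions for a
4 GB dedicated box, not tested settings): PostgreSQL wants on the order
of ceil(max_connections / 16) semaphore sets of 17 semaphores each,
plus a few extra for autovacuum workers, so 256 connections need about
16 sets and 272 semaphores, which the semmni=256 / semmns=512 above
already cover. Shared memory is the tighter limit: kern.ipc.shmmax is
in bytes and kern.ipc.shmall is in 4 kB pages, so growing
shared_buffers toward 1 GB would need something like:

$ cat /etc/sysctl.conf
kern.ipc.shmmax=1610612736   # 1.5 GB in bytes, headroom for ~1 GB shared_buffers
kern.ipc.shmall=393216       # shmmax / 4096 (pages)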



Re: [PERFORM] ORDER BY ... LIMIT and JOIN

2009-08-09 Thread Fizu
On Sat, Aug 8, 2009 at 2:09 PM, Michael Andreen wrote:
> The planner is expecting one user with country_id = 1, but instead there are
> 57309. Have you analyzed recently? Maybe increasing the statistics target will
> help.
>
> /Michael
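
For reference, a sketch of that suggestion (the target value 100 is an
arbitrary illustration; the table and column names are taken from the
query below):

ALTER TABLE "user" ALTER COLUMN country_id SET STATISTICS 100;
ANALYZE "user";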


Even just after running ANALYZE on user and ranking, it still takes
this long to order by an indexed field.

# explain analyze SELECT * FROM "ranking" INNER JOIN "user" ON ("ranking"."username" = "user"."username") WHERE "user"."country_id" = 5 ORDER BY "ranking"."ranking" ASC LIMIT 100;

                                                                      QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=15340.13..15340.38 rows=100 width=178) (actual time=4955.795..4955.865 rows=100 loops=1)
   ->  Sort  (cost=15340.13..15343.69 rows=1425 width=178) (actual time=4955.794..4955.820 rows=100 loops=1)
         Sort Key: ranking.ranking
         Sort Method:  top-N heapsort  Memory: 56kB
         ->  Nested Loop  (cost=0.00..15285.67 rows=1425 width=178) (actual time=20.951..4952.337 rows=1972 loops=1)
               ->  Index Scan using country_ranking_user_idx on "user"  (cost=0.00..4807.25 rows=1710 width=143) (actual time=20.923..4898.931 rows=1972 loops=1)
                     Index Cond: (country_id = 5)
               ->  Index Scan using ranking_tmp_pkey on ranking  (cost=0.00..6.12 rows=1 width=35) (actual time=0.024..0.025 rows=1 loops=1972)
                     Index Cond: ((ranking.username)::text = ("user".username)::text)
 Total runtime: 4955.974 ms
(10 rows)

# explain analyze SELECT * FROM "ranking" INNER JOIN "user" ON ("ranking"."username" = "user"."username") ORDER BY "ranking"."ranking" ASC LIMIT 100;

                                                           QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.00..136.78 rows=100 width=178) (actual time=0.058..1.870 rows=100 loops=1)
   ->  Nested Loop  (cost=0.00..3116910.51 rows=2278849 width=178) (actual time=0.056..1.818 rows=100 loops=1)
         ->  Index Scan using idxrank_6224 on ranking  (cost=0.00..71682.17 rows=2278849 width=35) (actual time=0.022..0.065 rows=100 loops=1)
         ->  Index Scan using user_pkey on "user"  (cost=0.00..1.32 rows=1 width=143) (actual time=0.015..0.016 rows=1 loops=100)
               Index Cond: (("user".username)::text = (ranking.username)::text)
 Total runtime: 1.946 ms
(6 rows)


Thank you!
M



Re: [PERFORM] ORDER BY ... LIMIT and JOIN

2009-08-09 Thread Michael Andreen
On Sunday 09 August 2009 21:26:08 Fizu wrote:
> ->  Index Scan using country_ranking_user_idx on "user"  (cost=0.00..4807.25 rows=1710 width=143) (actual time=20.923..4898.931 rows=1972 loops=1)
>       Index Cond: (country_id = 5)

The statistics look good now, but almost all the time is still spent on 
fetching the users with country_id = 5; the actual ordering is only a tiny 
part of the full cost. Why that takes so long probably depends on your 
hardware in relation to the database size. I guess the database doesn't fit 
in RAM? What settings have you changed?

Clustering users on country_ranking_user_idx would probably help in this 
specific case, but whether it is a good idea depends on what other queries 
need to be fast. If the table or its indexes are bloated, then clustering on 
any index, or a REINDEX, might do it; see the sketch below.
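
A minimal sketch of the clustering variant (8.4-style syntax is assumed;
on 8.3 and earlier it is written CLUSTER country_ranking_user_idx ON
"user". Note that CLUSTER takes an exclusive lock and should be re-run
periodically as the table changes):

CLUSTER "user" USING country_ranking_user_idx;
ANALYZE "user";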

/Michael



Re: [PERFORM] ORDER BY ... LIMIT and JOIN

2009-08-09 Thread Robert Haas
On Sun, Aug 9, 2009 at 3:26 PM, Fizu wrote:
> ->  Index Scan using country_ranking_user_idx on "user"  (cost=0.00..4807.25 rows=1710 width=143) (actual time=20.923..4898.931 rows=1972 loops=1)
>       Index Cond: (country_id = 5)

An index scan that picks up 1972 rows is taking 5 seconds?  I think
there must be something wrong with this index.  Is it possible that,
since you apparently weren't analyzing this database, you didn't
vacuum it either?  If so, you should probably do a VACUUM FULL on your
database and then a database-wide REINDEX, but at a minimum you should
try reindexing this particular index.
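
A minimal sketch of both options (standard PostgreSQL maintenance
commands; both take exclusive locks, so run them in a maintenance
window, and "yourdb" is a placeholder for the actual database name):

REINDEX INDEX country_ranking_user_idx;   -- minimal: rebuild only the suspect index

VACUUM FULL;                              -- heavier: compact every table
REINDEX DATABASE yourdb;                  -- then rebuild all indexes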

...Robert



Re: [PERFORM] Bottleneck?

2009-08-09 Thread Ip Wing Kin John
Hi Scott,

Thanks for your suggestion. I followed it and disabled nestloop, which
gave a substantial improvement: the query takes 51s now. I have
attached the new query plan in another file.

What I want to ask is: is there any other way to hint the planner to
use a merge join rather than a nested loop by modifying my SQL? I did
try sorting my second inner join by the join condition, but the
planner still prefers a nested loop. I am afraid that changing the
system-wide configuration will have side effects on my other queries.
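
For what it is worth, enable_nestloop does not have to be changed
system-wide; a sketch of scoping it to a single transaction (the query
itself is abbreviated here):

BEGIN;
SET LOCAL enable_nestloop = off;  -- reverts automatically at COMMIT/ROLLBACK
SELECT ... ;                      -- the full query below
COMMIT;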

Here is my SQL.

select *
from dummymediastatus rec
INNER JOIN (
    SELECT volumeGUID, MAX(startDatetime) AS msdt
    FROM dummymediastatus
    INNER JOIN (
        select * from getcurrentguids(1249281281666,'hardware.volume',null,null)
    ) AS cfg ON ( cfg.getcurrentguids = volumeGUID )
    WHERE startDatetime <= 1249281281666
    GROUP BY volumeGUID
) AS rec2 ON ( rec.volumeGUID = rec2.volumeGUID AND rec.startDatetime = rec2.msdt )
where ( ( 1>0 and 1>0 ) and rec.acsGUID in ( SELECT * FROM resolve('acs0') ) )
order by rec.startDatetime DESC, rec.id DESC;

thanks




On Thu, Aug 6, 2009 at 5:03 PM, Scott Marlowe wrote:
> OK, two things.  First, the row estimate starts going way off around
> the time it gets to the hash aggregate / nested loop, which seems to
> be making the planner use a bad plan for this many rows.  You can try
> issuing
>
> set enable_nestloop = off;
>
> before running the query and see if that makes it any faster.
>
> Secondly, the first time you run this query you are reading the 1.8G
> table sequentially, and at about 55MB/s, which isn't gonna get faster
> without more / faster drives under your machine.
>
> On Thu, Aug 6, 2009 at 12:50 AM, Ip Wing Kin John wrote:
>> Here u go. Both in the same file.
>>
>> On Thu, Aug 6, 2009 at 4:48 PM, Scott Marlowe wrote:
>>> Much better... Looks like I got the second one...
>>>
>>> Can I get the first one too?  Thx.
>>>
>>> On Thu, Aug 6, 2009 at 12:46 AM, Ip Wing Kin John wrote:
>>>> Hope you can get it this time.
>>>>
>>>> John
>>>>
>>>> On Thu, Aug 6, 2009 at 4:34 PM, Scott Marlowe wrote:
>>>>> Sorry man, it's not coming through.  Try it this time addressed
>>>>> just to me.
>>>>>
>>>>> On Thu, Aug 6, 2009 at 12:23 AM, Ip Wing Kin John wrote:
>>>>>> Hi scott
>>>>>>
>>>>>> I attached the query plan with this email. The top one is the first
>>>>>> run after I restarted my machine. And the bottom one is the second
>>>>>> run.
>>>>>>
>>>>>> I am using PostgreSQL 8.3 on Solaris 10.
>>>>>>
>>>>>> cheers
>>>>>>
>>>>>> On Thu, Aug 6, 2009 at 4:15 PM, Scott Marlowe wrote:
>>>>>>> On Wed, Aug 5, 2009 at 11:21 PM,  wrote:
>>>>>>>> Sorry post again.
>>>>>>>
>>>>>>> Nope, still mangled.  Can you attach it?
>>>>>>
>>>>>> --
>>>>>> John
>>>>>
>>>>> --
>>>>> When fascism comes to America, it will be intolerance sold as diversity.
>>>>
>>>> --
>>>> John
>>>
>>> --
>>> When fascism comes to America, it will be intolerance sold as diversity.
>>
>> --
>> John
>
> --
> When fascism comes to America, it will be intolerance sold as diversity.



-- 
John


                                                                      QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=710118.74..710118.75 rows=3 width=567) (actual time=51563.029..51580.020 rows=80963 loops=1)
   Sort Key: rec.startdatetime, rec.id
   Sort Method:  quicksort  Memory: 43163kB
   ->  Hash Join  (cost=360922.21..710118.72 rows=3 width=567) (actual time=43014.876..51132.786 rows=80963 loops=1)
         Hash Cond: ((rec.acsguid)::text = (resolve.resolve)::text)
         ->  Merge Join  (cost=360655.21..709851.67 rows=3 width=567) (actual time=42458.324..50434.884 rows=80963 loops=1)
               Merge Cond: (rec.startdatetime = (max(dummymediastatus.startdatetime)))
               Join Filter: ((rec.volumeguid)::text = (dummymediastatus.volumeguid)::text)
               ->  Index Scan using index_dummymediastatus_startdatetime on dummymediastatus rec  (cost=0.00..339020.12 rows=4000362 width=414) (actual time=41.617..6324.895 rows=352 loops=1)
               ->  Sort  (cost=360655.21..360664.23 rows=3608 width=153) (actual time=42416.687..42453.669 rows=81934 loops=1)
                     Sort Key: (max(dummymediastatus.startdatetime))
                     Sort Method:  quicksort  Memory: 5174kB
                     ->  HashAggregate  (cost=360360.86..360405.96 rows=3608 width=16) (actual time=42257.696..42309.261 rows=8 loops=1)
                           ->  Hash Join  (cost=335135.05..354817.67 rows=1108637 width=16) (actual time=37252.925..39518.267 rows=400 loops=1)
                                 Hash Cond: ((getcurrentguids.getcurrentguids)::text = (dummymed