date:20060809

Re: [PERFORM] most bang for buck with ~ $20,000

2006-08-09 Thread Arjen van der Meijden

We were in a similar situation with a similar budget. But we had two 
requirements, no "deprecated" scsi while the successor SAS is available 
and preferrably only 3 or 4U of rack space. And it had to have 
reasonable amounts of disks (at least 12).


The two options we finally choose between where a Dell 1U 1950 with two 
woodcrests 5160 (I don't think the older dempsey 50x0's are a good idea) 
and 16GB of memory combined with a PowerVault MD1000 external storage 
SAS JBOD unit, with 15 36GB 15k rpm disks and from HP a similar 
configured DL360G5 (also 1U) combined with two MSA50 SFF SAS JBOD 
enclosures with 20 36GB 10k rpm SFF disks.


Both enclosures offer has SAS-connectivity (serial attached scsi), i.e. 
the "next generation scsi". Which is supposed to be the successor to 
scsi, but unfortunately its not yet as widely available.


The Dell MD1000 is 3U high and can be fitted with 15 3.5" disks, the 
MSA50 is 1U and can be fitted with 10 2.5" disks.
In terms of performance you'll likely need two MSA50's to be up to par 
with one MD1000. The SFF disks are about as expensive as the 15k 3.5" 
disks... so its mostly interesting for packing a lot of I/O in a small 
enclosure. HP is going to offer a 3.5" SAS-enclosure (MSA60) but that 
one won't be available until Q1 2007 or something like that.
As said Promise and Adaptec also offer SAS enclosures, both are 2U and 
can be fitted with 12 disks. There are more available, but they are 
generally quite bit hard to find.


Good luck with your search.

Best regards,

Arjen


Kenji Morishige wrote:

I have unlimited rack space, so 2U is not the issue. The boxes are stored in
our lab for internal software tools.  I'm going to research those boxes you
mention.  Regarding the JBOD enclosures, are these generally just 2U or 4U
units with SCSI interface connectors?  I didn't see these types of boxes
availble on Dell website, I'll look again.
-Kenji

On Wed, Aug 09, 2006 at 07:35:22AM +0200, Arjen van der Meijden wrote:

With such a budget you should easily be able to get something like:
- A 1U high-performance server (for instance the Dell 1950 with 2x 
Woodcrest 5160, 16GB of FB-Dimm memory, one 5i and one 5e perc raid 
controller and some disks internally)
- An external SAS direct attached disks storage enclosure full with 15k 
rpm 36GB disks (for instance the MD1000, with 15x 36GB 15k disks)


Going for the dell-solution would set you back "only" (including 
savings) about $13-$14k. HP offers a similar solutions (a HP DL360G5 or 
a DL380G5/DL385 with two MSA50's for instance) which also fit in your 
budget afaik. The other players tend to be (a bit) more expensive, force 
you to go with Fibre Channel or "ancient" SCSI external storage ;)


If you'd like to have a product by a generic vendor, have a look at the 
Adaptec JS50 SAS Jbod enclosure or Promise's Vtrak 300 (both offer 12 
sas/sata bays in 2U) for storage.


If you're limited to only 2U of rack space, its a bit more difficult to 
get maximum I/O in your budget (you have basically space for about 8 or 
12 3.5" disks (with generic suppliers) or 16 2.5" sff disks (with HP)).
But you should still be able to have two top-off-the-line x86 cpu's (amd 
opteron 285 or intel woorcrest 5160) and 16GB of memory (even FB Dimm, 
which is pretty expensive).


Best regards,

Arjen van der Meijden


On 8-8-2006 22:43, Kenji Morishige wrote:

I've asked for some help here a few months ago and got some really helpfull
answers regarding RAID controllers and server configuration.  Up until
recently I've been running PostgreSQL on a two year old Dual Xeon 3.06Ghz
machine with a single channel RAID controller (previously Adaptec 2200S, 
but
now changed to LSI MegaRAID). The 2U unit is from a generic vendor using 
what
I believe is a SuperMicro motherboard.  In the last week after upgrading 
the

RAID controller, the machine has had disk failure and some other issues. I
would like to build a very reliable dedicated postgreSQL server that has 
the
ultimate possible performance and reliabily for around $20,000.  The data 
set

size is only currently about 4GB, but is increasing by approximately 50MB
daily.  The server also requires about 500 connections and I have been
monitoring about 100-200 queries per second at the moment.  I am planning 
to

run FreeBSD 6.1 if possible, but I am open to any other suggestions if it
improves performance.

I am considering a setup such as this:
 - At least dual cpu (possibly with 2 cores each)
 - 4GB of RAM
 - 2 disk RAID 1 array for root disk
 - 4 disk RAID 1+0 array for PGDATA
 - 2 disk RAID 1 array for pg_xlog

Does anyone know a vendor that might be able provide such setup?  Any
critique in this design? I'm thinking having a 2 channel RAID controller to
seperate the PGDATA, root and pg_xlog.

Sincerely,
Kenji

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster





---(end of broadcast)-

[PERFORM] unsubscribe

2006-08-09 Thread Gourish Singbal

-- Forwarded message --From: Gourish Singbal <[EMAIL PROTECTED]>Date: Aug 9, 2006 12:24 PM
Subject: unsubscribeTo: "pgsql-admin@postgresql.org" 
-- Best, 
Gourish Singbal -- Best,Gourish Singbal

[PERFORM] Unsubscribe

2006-08-09 Thread Werner vd Merwe

Unsubscribe

-- 
No virus found in this outgoing message.
Checked by AVG Free Edition.
Version: 7.1.394 / Virus Database: 268.10.8/413 - Release Date: 2006/08/08
 


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster

Re: [PERFORM] Hardware upgraded but performance still ain't good enough

2006-08-09 Thread Stephen Frost

* David Lang ([EMAIL PROTECTED]) wrote:
> there's a huge difference between 'works on debian' and 'supported on 
> debian'. I do use debian extensivly, (along with slackware on my personal 
> machines), so i am comfortable getting things to work. but 'supported' 
> means that when you run into a problem you can call for help without being 
> told 'sorry, switch distros, then call us back'.

Have you ever actually had that happen?  I havn't and I've called
support for a number of different issues for various commercial
software.  In the end it might boil down to some distribution-specific
issue that they're not willing to fix but honestly that's pretty rare.

> even many of the companies that offer support for postgres have this 
> problem. the explination is always that they can't test every distro out 
> there so they pick a few and support those (this is one of the reasons why 

My experience has been that unless it's pretty clearly some
distro-specific issue (which doesn't happen all that often, but it's
good to be familiar with what would probably be a distro-specific issue
and what wouldn't), the support folks are willing to help debug it.

Thanks,

Stephen

signature.asc
Description: Digital signature

Re: [PERFORM] shared_buffer optimization

2006-08-09 Thread Ruben Rubio

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

So ...
I have tried different values. The best one for one day sentences seems
to be 24576

IO in vmstat has the lowest values
"id" (idle) has the biggest values.

I have created an script that executes all day sentences to try that.

By the way, could u explain a little bit this?

"The "rule of thumb" is to set shared buffers to the lesser of 1
 and 15% of system memory.  In your case, that would be the lesser of
 1 and 78643, which is 1."

Im sorry but im not understanding it at all.

Thanks in advance.


Christopher Browne wrote:
> Quoth [EMAIL PROTECTED] (Ruben Rubio):
>> Hi, I have a question with shared_buffer.
>>
>> Ok, I have a server with 4GB of RAM
>> -
>> # cat /proc/meminfo
>> MemTotal:  4086484 kB
>> [...]
>> -
>>
>> So, if I want to, for example, shared_buffer to take 3 GB of RAM then
>> shared_buffer would be 393216 (3 * 1024 * 1024 / 8)
>>
>> Postmaster dont start.
>> Error: FATAL:  shmat(id=360448) failed: Invalid argument
>>
>>
>> I can set a less value, but not higher than 3 GB.
>>
>> Am I doing something wrong?
>> Any idea?
> 
> Yes, you're trying to set the value way too high.
> 
> The "rule of thumb" is to set shared buffers to the lesser of 1
> and 15% of system memory.  In your case, that would be the lesser of
> 1 and 78643, which is 1.
> 
> I'm not aware of any actual evidence having emerged that it is of any
> value to set shared buffers higher than 1.

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.2.2 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFE2b3kIo1XmbAXRboRAh12AKCodmhmXZWamrG7MnAf9mhVfubjgwCfa75v
7bgmSzq4F7XpBoEkSpyDqnE=
=3lMc
-END PGP SIGNATURE-

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings

Re: [PERFORM] Optimizing queries

2006-08-09 Thread Patrice Beliveau


Tom Lane wrote:

Patrice Beliveau <[EMAIL PROTECTED]> writes:
  

SELECT * FROM TABLE
WHERE TABLE.COLUMN1=something
AND TABLE.COLUMN2=somethingelse
AND function(TABLE.COLUMN3,TABLE.COLUMN4) > 0;



  
I find out that the function process every row even if the row should be 
rejected as per the first or the second condition.

... I'm using version 8.1.3



PG 8.1 will not reorder WHERE clauses for a single table unless it has
some specific reason to do so (and AFAICT no version back to 7.0 or so
has done so either...)  So there's something you are not telling us that
is relevant.  Let's see the exact table schema (psql \d output is good),
the exact query, and EXPLAIN output for that query.

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


  

Hi,

here is my query, and the query plan that result

explain select * from (
  select * from sales_order_delivery
  where sales_order_id in (
  select sales_order_id from sales_order
  where closed=false
  )
) as a where outstandingorder(sales_order_id, sales_order_item, 
date_due) > 0;



 QUERY PLAN
--
Hash IN Join  (cost=498.89..8348.38 rows=34612 width=262)
  Hash Cond: (("outer".sales_order_id)::text = 
("inner".sales_order_id)::text)
  ->  Seq Scan on sales_order_delivery  (cost=0.00..6465.03 rows=69223 
width=262)
Filter: (outstandingorder((sales_order_id)::text, 
(sales_order_item)::text, date_due) > 0::double precision)

  ->  Hash  (cost=484.90..484.90 rows=5595 width=32)
->  Seq Scan on sales_order  (cost=0.00..484.90 rows=5595 width=32)
  Filter: (NOT closed)
(7 rows)


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq

Re: [PERFORM] Optimizing queries

2006-08-09 Thread Ruben Rubio

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

If subquerys are not working I think you should try to create a view
with the subquery.

Maybe it will work.

Patrice Beliveau wrote:
> Tom Lane wrote:
>> Patrice Beliveau <[EMAIL PROTECTED]> writes:
>>  
> SELECT * FROM TABLE
> WHERE TABLE.COLUMN1=something
> AND TABLE.COLUMN2=somethingelse
> AND function(TABLE.COLUMN3,TABLE.COLUMN4) > 0;
> 
>>
>>  
>>> I find out that the function process every row even if the row should
>>> be rejected as per the first or the second condition.
>>> ... I'm using version 8.1.3
>>> 
>>
>> PG 8.1 will not reorder WHERE clauses for a single table unless it has
>> some specific reason to do so (and AFAICT no version back to 7.0 or so
>> has done so either...)  So there's something you are not telling us that
>> is relevant.  Let's see the exact table schema (psql \d output is good),
>> the exact query, and EXPLAIN output for that query.
>>
>> regards, tom lane
>>
>> ---(end of broadcast)---
>> TIP 9: In versions below 8.0, the planner will ignore your desire to
>>choose an index scan if your joining column's datatypes do not
>>match
>>
>>
>>   
> Hi,
> 
> here is my query, and the query plan that result
> 
> explain select * from (
>   select * from sales_order_delivery
>   where sales_order_id in (
>   select sales_order_id from sales_order
>   where closed=false
>   )
> ) as a where outstandingorder(sales_order_id, sales_order_item,
> date_due) > 0;
> 
> 
>  QUERY PLAN
> --
> 
> Hash IN Join  (cost=498.89..8348.38 rows=34612 width=262)
>   Hash Cond: (("outer".sales_order_id)::text =
> ("inner".sales_order_id)::text)
>   ->  Seq Scan on sales_order_delivery  (cost=0.00..6465.03 rows=69223
> width=262)
> Filter: (outstandingorder((sales_order_id)::text,
> (sales_order_item)::text, date_due) > 0::double precision)
>   ->  Hash  (cost=484.90..484.90 rows=5595 width=32)
> ->  Seq Scan on sales_order  (cost=0.00..484.90 rows=5595 width=32)
>   Filter: (NOT closed)
> (7 rows)
> 
> 
> ---(end of broadcast)---
> TIP 3: Have you checked our extensive FAQ?
> 
>   http://www.postgresql.org/docs/faq
> 
> 

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.2.2 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFE2dMTIo1XmbAXRboRAhbIAJwJGZ+ITP0gl38A3qROrzIeNbTtUwCcDOIW
eZ9NJqjL+58gyMfO95jwZSw=
=4Zxj
-END PGP SIGNATURE-

---(end of broadcast)---
TIP 6: explain analyze is your friend

Re: [PERFORM] Hardware upgraded but performance still ain't good

2006-08-09 Thread Joshua D. Drake


Alex Turner wrote:

First off - very few third party tools support debian.  Debian is a sure
fire way to have an unsupported system.  Use RedHat or SuSe (flame me all
you want, it doesn't make it less true).


*cough* BS *cough*

Linux is Linux. It doesn't matter what trademark you put on top of it. 
As long as they are running a current version of Linux (e.g; kernel 2.6) 
they should be fine.


With Debian that may or may not be the case and that could be an issue.
To get the best luck, I would suggest (if you want to stay with a Debian 
base) Ubuntu Dapper LTS.


Sincerely,

Joshua D. Drake




--

   === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
   Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/



---(end of broadcast)---
TIP 6: explain analyze is your friend

Re: [PERFORM] Hardware upgraded but performance still ain't good

2006-08-09 Thread Joshua D. Drake




Have you ever actually had that happen?  I havn't and I've called
support for a number of different issues for various commercial
software.  In the end it might boil down to some distribution-specific
issue that they're not willing to fix but honestly that's pretty rare.


Very rare, if you are using a reputable vendor.



even many of the companies that offer support for postgres have this 
problem. the explination is always that they can't test every distro out 
there so they pick a few and support those (this is one of the reasons why 


Ahh and which companies would these be? As a representative of the most 
prominent one in the  US I can tell you that you are not speaking from a 
knowledgeable position.


Sincerely,

Joshua D. Drake


--

   === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
   Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/



---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match

Re: [PERFORM] unsubscribe

2006-08-09 Thread Bill Moran

In response to "Gourish Singbal" <[EMAIL PROTECTED]>:

If you look in the mail headers:

List-Unsubscribe: 



-- 
Bill Moran
Collaborative Fusion Inc.

---(end of broadcast)---
TIP 6: explain analyze is your friend

Re: [PERFORM] Hardware upgraded but performance still ain't good enough

2006-08-09 Thread Merlin Moncure


On 8/7/06, Alvaro Nunes Melo <[EMAIL PROTECTED]> wrote:

"we recently upgraded our dual Xeon Dell to a brand new Sun v40z with 4
opterons, 16GB of memory and MegaRAID with enough disks. OS is Debian
Sarge amd64, PostgreSQL is 8.0.3." on
(http://archives.postgresql.org/pgsql-performance/2005-07/msg00431.php)


well, if you spend three months optimizing your application or buy a
10k$ server to get the same result, which is cheaper? :)


The database size stands around 10 GB. The new server has a better
performance than the old one, but sometimes it still stucks. We tried to
use a HP proprietary tool to monitor the server, and find out what is
the bottleneck, but it's been difficult to install it on Debian. The


I'm not familiar with the hp tool, but I suspect you are not missing
much.  If you are looking for a free distro, you might have some luck
with centos.  most redhat binary rpms will install on it.


tool is only certified for SuSe and RedHat. So we tried to use some
Linux tools to see what's going on, like vmstat and iostat. Are this
tools (vm and iostat) enough? Should we use something else? Is there any
specifical material about finding bottlenecks in Linux/PostgreSQL
machines? Is our disks design proper?


those are pretty broad questions, so you will only get broad answers.
you might want to consider hooking up with some commercial support
(I've heard good things about commandprompt) or providing more
detailed information so that you can get some help from this list,
including:
iostat/vmstat reports
explain analyze
information from top

nicely summarized at the time the problems occur.
regards,
merlin


I really apologize for my lack of knowledge in this area, and for the
excessive number of questions in a single e-mail.

Best regards,
Alvaro


---(end of broadcast)---
TIP 6: explain analyze is your friend

Re: [PERFORM] Optimizing queries

2006-08-09 Thread Patrice Beliveau

I've create a view, same query plan (some number vary a bit, but nothing 
significant) and same result, closed sales_order are processed


Ruben Rubio wrote:

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

If subquerys are not working I think you should try to create a view
with the subquery.

Maybe it will work.

Patrice Beliveau wrote:
  

Tom Lane wrote:


Patrice Beliveau <[EMAIL PROTECTED]> writes:
 
  

SELECT * FROM TABLE
WHERE TABLE.COLUMN1=something
AND TABLE.COLUMN2=somethingelse
AND function(TABLE.COLUMN3,TABLE.COLUMN4) > 0;


 
  

I find out that the function process every row even if the row should
be rejected as per the first or the second condition.
... I'm using version 8.1.3



PG 8.1 will not reorder WHERE clauses for a single table unless it has
some specific reason to do so (and AFAICT no version back to 7.0 or so
has done so either...)  So there's something you are not telling us that
is relevant.  Let's see the exact table schema (psql \d output is good),
the exact query, and EXPLAIN output for that query.

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


  
  

Hi,

here is my query, and the query plan that result

explain select * from (
  select * from sales_order_delivery
  where sales_order_id in (
  select sales_order_id from sales_order
  where closed=false
  )
) as a where outstandingorder(sales_order_id, sales_order_item,
date_due) > 0;


 QUERY PLAN
--

Hash IN Join  (cost=498.89..8348.38 rows=34612 width=262)
  Hash Cond: (("outer".sales_order_id)::text =
("inner".sales_order_id)::text)
  ->  Seq Scan on sales_order_delivery  (cost=0.00..6465.03 rows=69223
width=262)
Filter: (outstandingorder((sales_order_id)::text,
(sales_order_item)::text, date_due) > 0::double precision)
  ->  Hash  (cost=484.90..484.90 rows=5595 width=32)
->  Seq Scan on sales_order  (cost=0.00..484.90 rows=5595 width=32)
  Filter: (NOT closed)
(7 rows)


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq





-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.2.2 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFE2dMTIo1XmbAXRboRAhbIAJwJGZ+ITP0gl38A3qROrzIeNbTtUwCcDOIW
eZ9NJqjL+58gyMfO95jwZSw=
=4Zxj
-END PGP SIGNATURE-


  



---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match

[PERFORM] 3-table query optimization

2006-08-09 Thread Michal Taborsky - Internet Mall


Hello everyone.

My (simplified) database structure is:

a) table product (15 rows)
 product_id BIGINT PRIMARY KEY
 title TEXT
 ...

b) table action (5000 rows)
 action_id BIGINT PRIMARY KEY
 product_id BIGINT, FK to product
 shop_group_id INTEGER (there are about 5 groups, distributed about evenly)

c) table product_program (50 rows)
 program_id BIGINT (there are about 50 unique)
 product_id BIGINT, FK to product


I need to query products, which are in action table of specific group 
and in product_program for a specific program_id. The query is taking 
too long to my liking My query is:


SELECT product.product_id
  FROM action
  JOIN product ON (product.product_id=action.product_id)
 WHERE action.shop_group_id=1
   AND EXISTS (SELECT 1
 FROM catalog.product_program
 WHERE product_id=product.product_id
   AND product_program.program_id =1104322
   )


QUERY PLAN
Nested Loop  (cost=0.00..18073.81 rows=1220 width=8) (actual 
time=10.153..2705.891 rows=636 loops=1)
  ->  Seq Scan on "action"  (cost=0.00..135.74 rows=2439 width=8) 
(actual time=8.108..36.684 rows=2406 loops=1)

Filter: (shop_group_id = 1)
  ->  Index Scan using product_pkey on product  (cost=0.00..7.34 rows=1 
width=8) (actual time=1.031..1.097 rows=0 loops=2406)
Index Cond: ((product.product_id)::bigint = 
("outer".product_id)::bigint)

Filter: (subplan)
SubPlan
  ->  Index Scan using product_program_pkey on product_program 
 (cost=0.00..4.33 rows=1 width=0) (actual time=0.455..0.455 rows=0 
loops=2406)
Index Cond: (((program_id)::bigint = 1104322) AND 
((product_id)::bigint = ($0)::bigint))

Total runtime: 2708.575 ms



I also tried this:

SELECT product.product_id
  FROM action
  JOIN product ON (product.product_id=action.product_id)
  JOIN catalog.product_program ON (
product_program.product_id=product.product_id AND
product_program.program_id =1104322)
 WHERE action.shop_group_id=1


With about the same results (a bit better, but for different groups it 
was vice versa):


QUERY PLAN
Nested Loop  (cost=141.84..3494.91 rows=139 width=8) (actual 
time=118.584..1295.303 rows=636 loops=1)
  ->  Hash Join  (cost=141.84..2729.11 rows=253 width=16) (actual 
time=118.483..231.103 rows=636 loops=1)
Hash Cond: (("outer".product_id)::bigint = 
("inner".product_id)::bigint)
->  Index Scan using product_program_pkey on product_program 
(cost=0.00..2470.04 rows=7647 width=8) (actual time=0.047..73.514 
rows=7468 loops=1)

  Index Cond: ((program_id)::bigint = 1104322)
->  Hash  (cost=135.74..135.74 rows=2439 width=8) (actual 
time=118.114..118.114 rows=0 loops=1)
  ->  Seq Scan on "action"  (cost=0.00..135.74 rows=2439 
width=8) (actual time=0.019..106.864 rows=2406 loops=1)

Filter: (shop_group_id = 1)
  ->  Index Scan using product_pkey on product  (cost=0.00..3.01 rows=1 
width=8) (actual time=1.300..1.655 rows=1 loops=636)
Index Cond: (("outer".product_id)::bigint = 
(product.product_id)::bigint)



Any ideas if this is really the best I can expect, or is there something 
amiss there and my query is wrong for this type of task? My gut feeling 
tells me, that this kind of query should be a lot faster. The hardware 
is Dual Xeon with enough of RAM and other operations run just fine.


Thank you.

--
Michal Táborský

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly

Re: [PERFORM] Optimizing queries

2006-08-09 Thread Tom Lane

Patrice Beliveau <[EMAIL PROTECTED]> writes:
> Tom Lane wrote:
>> PG 8.1 will not reorder WHERE clauses for a single table unless it has
>> some specific reason to do so (and AFAICT no version back to 7.0 or so
>> has done so either...)  So there's something you are not telling us that
>> is relevant.

> here is my query, and the query plan that result

> explain select * from (
>select * from sales_order_delivery
>where sales_order_id in (
>select sales_order_id from sales_order
>where closed=false
>)
>  ) as a where outstandingorder(sales_order_id, sales_order_item, 
> date_due) > 0;

So this isn't a simple query, but a join.  PG will generally push
single-table restrictions down to the individual tables in order to
reduce the number of rows that have to be processed at the join.
In this case that's not a win, but the planner doesn't know enough
about the outstandingorder() function to realize that.

I think what you need is an "optimization fence" to prevent the subquery
from being flattened:

explain select * from (
   select * from sales_order_delivery
   where sales_order_id in (
   select sales_order_id from sales_order
   where closed=false
   )
   OFFSET 0
 ) as a where outstandingorder(sales_order_id, sales_order_item, 
date_due) > 0;

Any LIMIT or OFFSET in a subquery prevents WHERE conditions from being
pushed down past it (since that might change the results).  OFFSET 0 is
otherwise a no-op, so that's what people usually use.

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org

Re: [PERFORM] 3-table query optimization

2006-08-09 Thread Tom Lane

Michal Taborsky - Internet Mall <[EMAIL PROTECTED]> writes:
> SELECT product.product_id
>FROM action
>JOIN product ON (product.product_id=action.product_id)
>   WHERE action.shop_group_id=1
> AND EXISTS (SELECT 1
>   FROM catalog.product_program
>   WHERE product_id=product.product_id
> AND product_program.program_id =1104322
> )

Try converting the EXISTS subquery to an IN.

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster

Re: [PERFORM] most bang for buck with ~ $20,000

2006-08-09 Thread Scott Marlowe

On Tue, 2006-08-08 at 17:53, Thomas F. O'Connell wrote:
> On Aug 8, 2006, at 5:28 PM, Joshua D. Drake wrote:
> 
> > Thomas F. O'Connell wrote:
> >> On Aug 8, 2006, at 4:49 PM, Joshua D. Drake wrote:
>  I am considering a setup such as this:
>    - At least dual cpu (possibly with 2 cores each)
>    - 4GB of RAM
>    - 2 disk RAID 1 array for root disk
>    - 4 disk RAID 1+0 array for PGDATA
>    - 2 disk RAID 1 array for pg_xlog
>  Does anyone know a vendor that might be able provide such setup?
> >> Wouldn't it be preferable to put WAL on a multi-disk RAID 10 if  
> >> you had the opportunity? This gives you the redundancy of RAID 1  
> >> but approaches the performance of RAID 0, especially as you add  
> >> disks to the array. In benchmarking, I've seen consistent success  
> >> with this approach.
> >
> > WALL is written in order so RAID 1 is usually fine. We also don't  
> > need journaling for WAL so the speed is even faster.
> 
> In which case, which is theoretically better (since I don't have a  
> convenient test bed at the moment) for WAL in a write-heavy  
> environment? More disks in a RAID 10 (which should theoretically  
> improve write throughput in general, to a point) or a 2-disk RAID 1?  
> Does it become a price/performance question, or is there virtually no  
> benefit to throwing more disks at RAID 10 for WAL if you turn off  
> journaling on the filesystem?

Actually, the BIGGEST win comes when you've got battery backed cache on
your RAID controller.  In fact, I'd spend money on a separate RAID
controller for xlog with its own cache hitting a simple mirror set
before I'd spring for more drives on pg_xlog.  The battery backed cache
on the pg_xlog likely wouldn't need to be big, just there and set to
write-back.

Then put all the rest of your cash into disks on a big RAID 10 config,
and as big of a battery backed cache as you can afford for it and memory
for the machine.

---(end of broadcast)---
TIP 6: explain analyze is your friend

Re: [PERFORM] Hardware upgraded but performance still ain't good

2006-08-09 Thread David Lang


On Wed, 9 Aug 2006, Stephen Frost wrote:


* David Lang ([EMAIL PROTECTED]) wrote:

there's a huge difference between 'works on debian' and 'supported on
debian'. I do use debian extensivly, (along with slackware on my personal
machines), so i am comfortable getting things to work. but 'supported'
means that when you run into a problem you can call for help without being
told 'sorry, switch distros, then call us back'.


Have you ever actually had that happen?  I havn't and I've called
support for a number of different issues for various commercial
software.  In the end it might boil down to some distribution-specific
issue that they're not willing to fix but honestly that's pretty rare.


unfortunantly I have, repeatedly with different products.

if you can manage to get past the first couple of levels of support to 
people who really understand things rather then just useing checklists you 
are more likly to get help, but even there I've run into people who seem 
eager to take the easy way out by assuming that it must be a distro thing 
rather then anything with their product (even in cases where it ended up 
being a simple config thing)


David Lang

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster

Re: [PERFORM] Hardware upgraded but performance still ain't good

2006-08-09 Thread David Lang


On Wed, 9 Aug 2006, Joshua D. Drake wrote:

even many of the companies that offer support for postgres have this 
problem. the explination is always that they can't test every distro out 
there so they pick a few and support those (this is one of the reasons why 


Ahh and which companies would these be? As a representative of the most 
prominent one in the  US I can tell you that you are not speaking from a 
knowledgeable position.


note I said many, not all. I am aware that your company does not fall into 
this catagory.


David Lang

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq

[PERFORM] Dell PowerEdge 2950 performance

2006-08-09 Thread Bucky Jordan









Hello,

 

I’ve recently been tasked with scalability/performance
testing of a Dell PowerEdge 2950. This is the one with the new Intel Woodcrest
Xeons. Since I haven’t seen any info on this box posted to the list, I
figured people might be interested in the results, and maybe in return share a
few tips on performance tweaks.

 

After doing some reading on the performance list, I realize
that there’s a preference for Opteron; however, the goal of these experiments
is to see what I can get the 2950 to do. I will also be comparing performance vs.
a 1850 at some point, if there’s any interest I can post those numbers
too.

 

Here’s the hardware:

2xDual Core 3.0 Ghz CPU (Xeon 5160- 1333Mhz FSB, 4 MB shared
cache per socket)

8 GB RAM (DDR2, fully buffered, Dual Ranked, 667 Mhz)

6x300 10k RPM SAS drives

Perc 5i w/256 MB battery backed cache

 

The target application:

Mostly OLAP (large bulk loads, then lots of reporting,
possibly moving to real-time loads in the future). All of this will be run on
FreeBSD 6.1 amd64. (If I have some extra time, I might be able to run a few
tests on linux just for comparison’s sake)

 

Test strategy:

Make sure the RAID is giving reasonable performance:

bonnie++ -d /u/bonnie -s 1000:8k

time bash -c "(dd if=/dev/zero of=bigfile count=125000
bs=8k && sync)"

 

Now, I realize that the above are overly simple, and not
indicative of overall performance, however here’s what I’m seeing:

Single 10K 300 GB drive - ~75 Mb/s on both tests, more or
less

RAID 10, 6 disks (3 sets of mirrored pairs) - ~117 Mb/s

 

The RAID 10 numbers look way off to me, so my next step is
to go test some different RAID configs. I’m going to look at a mirrored
pair, and a striped pair first, just to make sure the setup is sane. Then, RAID
5 x 6 disks, and mirrored pair + raid 10 with 4. Possibly software raid,
however I’m not very familiar with this on FreeBSD.

 

Once I get the RAID giving me reasonable results (I would
think that a raid 10 with 6 10k drives should be able to push >200 MB/s
sustained IO…no?) I will move on to other more DB specific tests. 

 

A few questions:

1) Does anyone have other suggestions for testing raw IO for
the RAID?

 

2) What is reasonable IO (bonnie++, dd) for 4 or 6 disks-
RAID 10?

 

3) For DB tests, I would like to compare performance on the
different RAID configs and vs. the 1850. Maybe to assist also in some basic
postgresql.conf and OS tuning (but that will be saved mostly for when I start
application level testing). I realize that benchmarks don’t necessarily
map to application performance, but it helps me establish a baseline for the
hardware. I’m currently running pgbench, but would like something with a
few more features (but hopefully without too much setup time). I’ve heard
mention of the OSDL’s DBT tests, and I’m specifically interested in
DBT-2 and DBT-3. Any suggestions here?

 

Here’s some initial numbers from pgbench (-s 50 –c
10 –t 100). Please keep in mind that these are default installs of
FreeBSD 6.1 and Postgres 8.1.4- NO tuning yet.

1850: run1: 121 tps, run2: 132 tps, run3: 229 tps

2950: run1: 178 tps, run2: 201 tps, run3:259 tps

 

Obviously neither PG nor FreeBSD are taking advantage of all
the hardware available in either case.

 

I will post the additional RAID numbers shortly…

 

Thanks,

 

Bucky

Re: [PERFORM] most bang for buck with ~ $20,000

2006-08-09 Thread Merlin Moncure

On 8/9/06, Kenji Morishige <[EMAIL PROTECTED]> wrote:

I have unlimited rack space, so 2U is not the issue. The boxes are stored in
our lab for internal software tools.  I'm going to research those boxes you
mention.  Regarding the JBOD enclosures, are these generally just 2U or 4U
units with SCSI interface connectors?  I didn't see these types of boxes
availble on Dell website, I'll look again.
-Kenji

On Wed, Aug 09, 2006 at 07:35:22AM +0200, Arjen van der Meijden wrote:
> With such a budget you should easily be able to get something like:
> - A 1U high-performance server (for instance the Dell 1950 with 2x
> Woodcrest 5160, 16GB of FB-Dimm memory, one 5i and one 5e perc raid
> controller and some disks internally)
> - An external SAS direct attached disks storage enclosure full with 15k
> rpm 36GB disks (for instance the MD1000, with 15x 36GB 15k disks)
>
> Going for the dell-solution would set you back "only" (including
> savings) about $13-$14k. HP offers a similar solutions (a HP DL360G5 or
> a DL380G5/DL385 with two MSA50's for instance) which also fit in your
> budget afaik. The other players tend to be (a bit) more expensive, force
> you to go with Fibre Channel or "ancient" SCSI external storage ;)
>
> If you'd like to have a product by a generic vendor, have a look at the
> Adaptec JS50 SAS Jbod enclosure or Promise's Vtrak 300 (both offer 12
> sas/sata bays in 2U) for storage.

I am really curious about the Adaptec SAS product to see what it can
do.  If you don't know what SAS is, it is Sata Attached SCSI.  SAS
cables use 4 sata lanes (3gb/sec each) bonded together in a single
cable.  The raid is handled with via the raid controller or the o/s in
a software configuration.

SAS is the evolution of SCSI and I think will ultimately replace scsi
in enterprise setups becuase it is faster, cheaper, and more flexible.
SAS enclosures generally accept sata or sas drives in mix/match
configurations.  so, you get to choose between cheap, large, 7200 rpm
sata drives or small, expensive sas 10k or 15k rpm drives  *in the
same enclosure*.  You also get a compromise drive in the form of the
raptor which is 10k rpm sata drive.

You could buy a 2u 12 drive SAS encloure (3000$), 12 150g raptors
(3000$) and spend another grand on cables/controller and have a
hellishly performing raid system for the money assuming sas performs
like it does on paper.  note that i would not be trying this with my
own money unless I was guaranteed a money back rma for a 30 day
evaluation period.

that would leave you with 12 grand or so to pick up a quad (8 core)
opeteron if you bought it right.

regards,
merlin

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org

Re: [PERFORM] Hardware upgraded but performance still ain't good

2006-08-09 Thread Joshua D. Drake



Ahh and which companies would these be? As a representative of the 
most prominent one in the  US I can tell you that you are not speaking 
from a knowledgeable position.


note I said many, not all. I am aware that your company does not fall 
into this catagory.


I know, but I am curious as to *what* companies. Any reputable 
PostgreSQL company is going to support Linux as a whole except maybe 
some fringe distros like Gentoo or RedFlag. Not to mention FreeBSD and 
Solaris.


Sincerely,

Joshua D. Drake




David Lang




--

   === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
   Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/



---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match

Re: [PERFORM] Hardware upgraded but performance still ain't good

2006-08-09 Thread David Lang


On Wed, 9 Aug 2006, Joshua D. Drake wrote:

Ahh and which companies would these be? As a representative of the most 
prominent one in the  US I can tell you that you are not speaking from a 
knowledgeable position.


note I said many, not all. I am aware that your company does not fall into 
this catagory.


I know, but I am curious as to *what* companies. Any reputable PostgreSQL 
company is going to support Linux as a whole except maybe some fringe distros 
like Gentoo or RedFlag. Not to mention FreeBSD and Solaris.


I'm not going to name names in public, but I will point out that different 
companies definitions of what constatutes 'fringe distros' are different. 
For some any linux other then RedHat Enterprise or SuSE is a fringe distro 
(with SuSE being a relativly recent addition, for a while RedHat were 
frequently the only supported distro versions)


and please note, when I'm talking about support, it's not just postgresql 
support, but also hardware/driver support that can run into these problems


David Lang

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org

Re: [PERFORM] Hardware upgraded but performance still ain't good

2006-08-09 Thread Scott Marlowe

On Wed, 2006-08-09 at 11:37, David Lang wrote:
> On Wed, 9 Aug 2006, Joshua D. Drake wrote:
> 
> >>> Ahh and which companies would these be? As a representative of the most 
> >>> prominent one in the  US I can tell you that you are not speaking from a 
> >>> knowledgeable position.
> >> 
> >> note I said many, not all. I am aware that your company does not fall into 
> >> this catagory.
> >
> > I know, but I am curious as to *what* companies. Any reputable PostgreSQL 
> > company is going to support Linux as a whole except maybe some fringe 
> > distros 
> > like Gentoo or RedFlag. Not to mention FreeBSD and Solaris.
> 
> I'm not going to name names in public, but I will point out that different 
> companies definitions of what constatutes 'fringe distros' are different. 
> For some any linux other then RedHat Enterprise or SuSE is a fringe distro 
> (with SuSE being a relativly recent addition, for a while RedHat were 
> frequently the only supported distro versions)
> 
> and please note, when I'm talking about support, it's not just postgresql 
> support, but also hardware/driver support that can run into these problems

I've run into this as well.  Generally speaking, the larger the company,
the more likely you are to get the "we don't support that" line.

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster

Re: [PERFORM] Hardware upgraded but performance still ain't good

2006-08-09 Thread Steve Atkins



On Aug 9, 2006, at 5:47 AM, Joshua D. Drake wrote:


Alex Turner wrote:
First off - very few third party tools support debian.  Debian is  
a sure
fire way to have an unsupported system.  Use RedHat or SuSe (flame  
me all

you want, it doesn't make it less true).


*cough* BS *cough*

Linux is Linux. It doesn't matter what trademark you put on top of  
it. As long as they are running a current version of Linux (e.g;  
kernel 2.6) they should be fine.


That's really not the case, at least to the degree that makes a
difference between "supported" and "unsupported".


With Debian that may or may not be the case and that could be an  
issue.
To get the best luck, I would suggest (if you want to stay with a  
Debian base) Ubuntu Dapper LTS.


Different Linux distributions include different shared libraries, put
different things in different places and generally break applications
in a variety of different ways (SELinux would be one example of
that commonly seen here).

If I don't QA my application on it, it isn't supported. I can't  
necessarily

replicate problems on Linux distributions I don't have installed in
my QA lab, so I can't guarantee to fix problems that are specific
to that distribution. I can't even be sure that it will install and run
correctly without doing basic QA of the installation process on
that distribution.

And in my case that's just for user space applications. It's got to
be even worse for hardware drivers.

Our usual phrase is "We support RedHat versions *mumble*
only. We expect our application to run correctly on any Linux
distribution, though you may have to install additional shared
libraries."

I'm quite happy with customers running Debian, SuSe or what
have you, as long as they have access to a sysadmin who's
comfortable with that distribution. (I'd probably deny support to
anyone running Gentoo, though :) )

We've never had big problems with people running our apps on
"unsupported" problems, but those users have had to do some
more diagnosis of problems themselves, and we've been less
able to support them than we can users who use the same
distribution we QA on.

(It's not just Linux, either. We "support" Windows XP, but we run
just fine on 2000 and 95/98.)

Cheers,
  Steve


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq

Re: [PERFORM] Hardware upgraded but performance still ain't good

2006-08-09 Thread Joshua D. Drake



and please note, when I'm talking about support, it's not just postgresql 
support, but also hardware/driver support that can run into these problems


I've run into this as well.  Generally speaking, the larger the company,
the more likely you are to get the "we don't support that" line.



/me *chuckles* and whispers to himself.. no wonder were winning.

Sincerely,

Joshua D. Drake


--

   === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
   Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/



---(end of broadcast)---
TIP 6: explain analyze is your friend

Re: [PERFORM] Postgresql Performance on an HP DL385 and

2006-08-09 Thread Jim C. Nasby

On Tue, Aug 08, 2006 at 10:45:07PM -0700, Steve Poe wrote:
> Luke,
> 
> I thought so. In my test, I tried to be fair/equal since my Sun box has two
> 4-disc arrays each on their own channel. So, I just used one of them which
> should be a little slower than the 6-disc with 192MB cache.
> 
> Incidently, the two internal SCSI drives, which are on the 6i adapter,
> generated a TPS of 18.
 
You should try putting pg_xlog on the 6 drive array with the data. My
(limited) experience with such a config is that on a good controller
with writeback caching enabled it won't hurt you, and if the internal
drives aren't caching writes it'll probably help you a lot.

> I thought this server would impressive from notes I've read in the group.
> This is why I thought I might be doing something wrong. I stumped which way
> to take this. There is no obvious fault but something isn't right.
> 
> Steve
> 
> On 8/8/06, Luke Lonergan <[EMAIL PROTECTED]> wrote:
> >
> >Steve,
> >
> >> Sun box with 4-disc array (4GB RAM. 4 167GB 10K SCSI RAID10
> >> LSI MegaRAID 128MB). This is after 8 runs.
> >>
> >> dbserver-dual-opteron-centos,08/08/06,Tuesday,20,us,12,2,5
> >> dbserver-dual-opteron-centos,08/08/06,Tuesday,20,sy,59,50,53
> >> dbserver-dual-opteron-centos,08/08/06,Tuesday,20,wa,1,0,0
> >> dbserver-dual-opteron-centos,08/08/06,Tuesday,20,id,45,26,38
> >>
> >> Average TPS is 75
> >>
> >> HP box with 8GB RAM. six disc array RAID10 on SmartArray 642
> >> with 192MB RAM. After 8 runs, I see:
> >>
> >> intown-vetstar-amd64,08/09/06,Tuesday,23,us,31,0,3
> >> intown-vetstar-amd64,08/09/06,Tuesday,23,sy,16,0,1
> >> intown-vetstar-amd64,08/09/06,Tuesday,23,wa,99,6,50
> >> intown-vetstar-amd64,08/09/06,Tuesday,23,id,78,0,42
> >>
> >> Average TPS is 31.
> >
> >Note that the I/O wait (wa) on the HP box high, low and average are all
> >*much* higher than on the Sun box.  The average I/O wait was 50% of one
> >CPU, which is huge.  By comparison there was virtually no I/O wait on
> >the Sun machine.
> >
> >This is indicating that your HP machine is indeed I/O bound and
> >furthermore is tying up a PG process waiting for the disk to return.
> >
> >- Luke
> >
> >

-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software  http://pervasive.comwork: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly

Re: [PERFORM] Performance with 2 AMD/Opteron 2.6Ghz and 8gig

2006-08-09 Thread Jim C. Nasby

On Mon, Aug 07, 2006 at 10:20:02PM +0200, Steinar H. Gunderson wrote:
> On Mon, Aug 07, 2006 at 04:02:52PM -0400, Alex Turner wrote:
> > Although I for one have yet to see a controller that actualy does this (I
> > believe software RAID on linux doesn't either).
> 
> Linux' software RAID does. See earlier threads for demonstrations.

The real question: will it balance within a single thread?

Cheaper raid setups will balance individual requests between devices,
but good ones should be able to service a single request from both
devices (assuming it's reading more than whatever the stripe size is).
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software  http://pervasive.comwork: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings

Re: [PERFORM] Postgresql Performance on an HP DL385 and

2006-08-09 Thread Steve Poe

Jim,I'll give it a try. However, I did not see anywhere in the BIOS configuration of the 642 RAID adapter to enable writeback. It may have been mislabled cache accelerator where you can give a percentage to read/write. That aspect did not change the performance like the LSI MegaRAID adapter does.
SteveOn 8/9/06, Jim C. Nasby <[EMAIL PROTECTED]> wrote:
On Tue, Aug 08, 2006 at 10:45:07PM -0700, Steve Poe wrote:> Luke,>> I thought so. In my test, I tried to be fair/equal since my Sun box has two> 4-disc arrays each on their own channel. So, I just used one of them which
> should be a little slower than the 6-disc with 192MB cache.>> Incidently, the two internal SCSI drives, which are on the 6i adapter,> generated a TPS of 18.You should try putting pg_xlog on the 6 drive array with the data. My
(limited) experience with such a config is that on a good controllerwith writeback caching enabled it won't hurt you, and if the internaldrives aren't caching writes it'll probably help you a lot.> I thought this server would impressive from notes I've read in the group.
> This is why I thought I might be doing something wrong. I stumped which way> to take this. There is no obvious fault but something isn't right.>> Steve>> On 8/8/06, Luke Lonergan <
[EMAIL PROTECTED]> wrote:> >> >Steve,> >> >> Sun box with 4-disc array (4GB RAM. 4 167GB 10K SCSI RAID10> >> LSI MegaRAID 128MB). This is after 8 runs.
> >>> >> dbserver-dual-opteron-centos,08/08/06,Tuesday,20,us,12,2,5> >> dbserver-dual-opteron-centos,08/08/06,Tuesday,20,sy,59,50,53> >> dbserver-dual-opteron-centos,08/08/06,Tuesday,20,wa,1,0,0
> >> dbserver-dual-opteron-centos,08/08/06,Tuesday,20,id,45,26,38> >>> >> Average TPS is 75> >>> >> HP box with 8GB RAM. six disc array RAID10 on SmartArray 642
> >> with 192MB RAM. After 8 runs, I see:> >>> >> intown-vetstar-amd64,08/09/06,Tuesday,23,us,31,0,3> >> intown-vetstar-amd64,08/09/06,Tuesday,23,sy,16,0,1> >> intown-vetstar-amd64,08/09/06,Tuesday,23,wa,99,6,50
> >> intown-vetstar-amd64,08/09/06,Tuesday,23,id,78,0,42> >>> >> Average TPS is 31.> >> >Note that the I/O wait (wa) on the HP box high, low and average are all
> >*much* higher than on the Sun box.  The average I/O wait was 50% of one> >CPU, which is huge.  By comparison there was virtually no I/O wait on> >the Sun machine.> >> >This is indicating that your HP machine is indeed I/O bound and
> >furthermore is tying up a PG process waiting for the disk to return.> >> >- Luke> >> >--Jim C. Nasby, Sr. Engineering Consultant  
[EMAIL PROTECTED]Pervasive Software  http://pervasive.comwork: 512-231-6117vcard: http://jim.nasby.net/pervasive.vcf
   cell: 512-569-9461

Re: [PERFORM] vacuuming

2006-08-09 Thread Jim C. Nasby

On Tue, Aug 08, 2006 at 10:39:56AM +0300, Eugeny N Dzhurinsky wrote:
> Hello
> 
> I have pg_autovacuum running with the arguments: 
> 
> pg_autovacuum -D -s 120 -v 1
 
It's been a while since I looked at the pg_autovac settings, but I know
that it's threasholds were way, way to high. They were set to something
like 2, when 0.2 is a better idea.

> the database is postgresql 8.0.0
> 
> Sometimes load average on server raises to 20 and it is almost impossible to
> login via SSH
> 
> When I'm logging in finally, I see there is cpu usage: 6% and iowait 95%
> 
> ps ax | grep post gives me 
> 
> postgres: postgres db [local] VACUUM
> 
> Is there some solution to avoid such cases?

Have you turned on vacuum_cost_delay?
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software  http://pervasive.comwork: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster

Re: [PERFORM] shared_buffer optimization

2006-08-09 Thread Jim C. Nasby

On Tue, Aug 08, 2006 at 08:20:01AM -0400, Christopher Browne wrote:
> I'm not aware of any actual evidence having emerged that it is of any
> value to set shared buffers higher than 1.

http://flightaware.com

They saw a large increase in how many concurrent connections they could
handle when they bumped shared_buffers up from ~10% to 50% of memory.
Back then they had 4G of memory. They're up to 12G right now, but
haven't bumped shared_buffers up.

Every single piece of advice I've seen on shared_buffers comes from the
7.x era, when our buffer management was extremely simplistic. IMO all of
that knowledge was made obsolete when 8.0 came out, and our handling of
shared_buffers has improved ever further since then. This is definately
an area that could use a lot more testing.
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software  http://pervasive.comwork: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org

Re: [PERFORM] most bang for buck with ~ $20,000

2006-08-09 Thread Jim C. Nasby

On Wed, Aug 09, 2006 at 10:15:27AM -0500, Scott Marlowe wrote:
> Actually, the BIGGEST win comes when you've got battery backed cache on
> your RAID controller.  In fact, I'd spend money on a separate RAID
> controller for xlog with its own cache hitting a simple mirror set
> before I'd spring for more drives on pg_xlog.  The battery backed cache
> on the pg_xlog likely wouldn't need to be big, just there and set to
> write-back.
> 
> Then put all the rest of your cash into disks on a big RAID 10 config,
> and as big of a battery backed cache as you can afford for it and memory
> for the machine.

Actually, my (limited) testing has show than on a good battery-backed
controller, there's no penalty to leaving pg_xlog in with the rest of
PGDATA. This means that the OP could pile all 8 drives into a RAID10,
which would almost certainly do better than 6+2.

Note that some controllers (such as 3ware) need to periodically test the
life of the BBU, and they disable write caching when they do so, which
would tank performance.
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software  http://pervasive.comwork: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq

Re: [PERFORM] Postgresql Performance on an HP DL385 and

2006-08-09 Thread Scott Marlowe

On Wed, 2006-08-09 at 16:11, Steve Poe wrote:
> Jim,
> 
> I'll give it a try. However, I did not see anywhere in the BIOS
> configuration of the 642 RAID adapter to enable writeback. It may have
> been mislabled cache accelerator where you can give a percentage to
> read/write. That aspect did not change the performance like the LSI
> MegaRAID adapter does. 

Nope, that's not the same thing.

Does your raid controller have batter backed cache, or plain or regular
cache?  write back is unsafe without battery backup.

The default is write through (i.e. the card waits for the data to get
written out before acking an fsync).  In write back, the card's driver
writes the data to the bb cache, then returns on an fsync while the
cache gets written out at leisure.  In the event of a loss of power, the
cache is flushed on restart.  

---(end of broadcast)---
TIP 6: explain analyze is your friend

Re: [PERFORM] most bang for buck with ~ $20,000

2006-08-09 Thread Scott Marlowe

On Wed, 2006-08-09 at 16:35, Jim C. Nasby wrote:
> On Wed, Aug 09, 2006 at 10:15:27AM -0500, Scott Marlowe wrote:
> > Actually, the BIGGEST win comes when you've got battery backed cache on
> > your RAID controller.  In fact, I'd spend money on a separate RAID
> > controller for xlog with its own cache hitting a simple mirror set
> > before I'd spring for more drives on pg_xlog.  The battery backed cache
> > on the pg_xlog likely wouldn't need to be big, just there and set to
> > write-back.
> > 
> > Then put all the rest of your cash into disks on a big RAID 10 config,
> > and as big of a battery backed cache as you can afford for it and memory
> > for the machine.
> 
> Actually, my (limited) testing has show than on a good battery-backed
> controller, there's no penalty to leaving pg_xlog in with the rest of
> PGDATA. This means that the OP could pile all 8 drives into a RAID10,
> which would almost certainly do better than 6+2.

I've seen a few posts that said that before.  I wonder if there's a
point where the single RAID array / controller would get saturated and a
second one would help.  I think most of the testing I've seen so far has
been multiple RAID arrays under the same controller, hasn't it?

> Note that some controllers (such as 3ware) need to periodically test the
> life of the BBU, and they disable write caching when they do so, which
> would tank performance.

ugh, that's a scary thing.  Can you at least schedule it?

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly

Re: [PERFORM] most bang for buck with ~ $20,000

2006-08-09 Thread Jim C. Nasby

On Wed, Aug 09, 2006 at 04:50:30PM -0500, Scott Marlowe wrote:
> On Wed, 2006-08-09 at 16:35, Jim C. Nasby wrote:
> > On Wed, Aug 09, 2006 at 10:15:27AM -0500, Scott Marlowe wrote:
> > > Actually, the BIGGEST win comes when you've got battery backed cache on
> > > your RAID controller.  In fact, I'd spend money on a separate RAID
> > > controller for xlog with its own cache hitting a simple mirror set
> > > before I'd spring for more drives on pg_xlog.  The battery backed cache
> > > on the pg_xlog likely wouldn't need to be big, just there and set to
> > > write-back.
> > > 
> > > Then put all the rest of your cash into disks on a big RAID 10 config,
> > > and as big of a battery backed cache as you can afford for it and memory
> > > for the machine.
> > 
> > Actually, my (limited) testing has show than on a good battery-backed
> > controller, there's no penalty to leaving pg_xlog in with the rest of
> > PGDATA. This means that the OP could pile all 8 drives into a RAID10,
> > which would almost certainly do better than 6+2.
> 
> I've seen a few posts that said that before.  I wonder if there's a
> point where the single RAID array / controller would get saturated and a
> second one would help.  I think most of the testing I've seen so far has
> been multiple RAID arrays under the same controller, hasn't it?
 
Yeah. I've had one client try it so far, but it was a pretty small array
(8 drives, IIRC).

I suspect that by the time you get to a size where you're saturating a
controller, you're looking at enough drives where having two extra
(instead of dedicating them to pg_xlog) won't make much difference.

> > Note that some controllers (such as 3ware) need to periodically test the
> > life of the BBU, and they disable write caching when they do so, which
> > would tank performance.
> 
> ugh, that's a scary thing.  Can you at least schedule it?

Yeah, it's not automatic at all. Which itself is somewhat scarry
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software  http://pervasive.comwork: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq

Re: [PERFORM] [BUGS] BUG #2567: High IOWAIT

2006-08-09 Thread Jim C. Nasby

This isn't a bug; moving to pgsql-performance.

On Tue, Aug 08, 2006 at 08:42:02AM +, kumarselvan wrote:
> i have installed the postgres as mentioned in the Install file. it is a 4
> cpu 8 GB Ram Machine installed with Linux Enterprise version 3. when i am
> running a load which will perfrom 40 inserts persecond on 2 tables and 10
> updates per 10seconds on differnt table IOWait on avg going upto 70% due to
> which i am not able to increase the load. Is there is any other way to
> install the postgres on multiprocessor machine.. can any one help me on
> this...

You haven't given us nearly enough information. What kind of hardware is
this? RAID? What changes have you made to postgresql.conf?
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software  http://pervasive.comwork: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461

---(end of broadcast)---
TIP 6: explain analyze is your friend

Re: [PERFORM] shared_buffer optimization

2006-08-09 Thread Tom Lane

"Jim C. Nasby" <[EMAIL PROTECTED]> writes:
> Every single piece of advice I've seen on shared_buffers comes from the
> 7.x era, when our buffer management was extremely simplistic. IMO all of
> that knowledge was made obsolete when 8.0 came out, and our handling of
> shared_buffers has improved ever further since then. This is definately
> an area that could use a lot more testing.

Actually I think it was probably 8.1 that made the significant
difference there, by getting rid of the single point of contention
for shared-buffer management.  I concur that 7.x-era rules of thumb
may well be obsolete --- we need some credible scaling tests ...

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly

Re: [PERFORM] Postgresql Performance on an HP DL385 and

2006-08-09 Thread Steve Poe

 I believe it does, I'll need to check.Thanks for the correction.

Steve On 8/9/06, Scott Marlowe <[EMAIL PROTECTED]> wrote:
On Wed, 2006-08-09 at 16:11, Steve Poe wrote:> Jim,>> I'll give it a try. However, I did not see anywhere in the BIOS> configuration of the 642 RAID adapter to enable writeback. It may have
> been mislabled cache accelerator where you can give a percentage to> read/write. That aspect did not change the performance like the LSI> MegaRAID adapter does.Nope, that's not the same thing.
Does your raid controller have batter backed cache, or plain or regularcache?  write back is unsafe without battery backup.The default is write through (i.e. the card waits for the data to getwritten out before acking an fsync).  In write back, the card's driver
writes the data to the bb cache, then returns on an fsync while thecache gets written out at leisure.  In the event of a loss of power, thecache is flushed on restart.

Re: [PERFORM] Postgresql Performance on an HP DL385 and

2006-08-09 Thread Steve Poe

Scott,Do you know how to activate the writeback on the RAID controller from HP?SteveOn 8/9/06, Scott Marlowe <
[EMAIL PROTECTED]> wrote:On Wed, 2006-08-09 at 16:11, Steve Poe wrote:
> Jim,>> I'll give it a try. However, I did not see anywhere in the BIOS> configuration of the 642 RAID adapter to enable writeback. It may have> been mislabled cache accelerator where you can give a percentage to
> read/write. That aspect did not change the performance like the LSI> MegaRAID adapter does.Nope, that's not the same thing.Does your raid controller have batter backed cache, or plain or regular
cache?  write back is unsafe without battery backup.The default is write through (i.e. the card waits for the data to getwritten out before acking an fsync).  In write back, the card's driverwrites the data to the bb cache, then returns on an fsync while the
cache gets written out at leisure.  In the event of a loss of power, thecache is flushed on restart.

Re: [PERFORM] Postgresql Performance on an HP DL385 and

2006-08-09 Thread Steve Poe

Jim,

I tried as you suggested and my performance dropped by 50%. I went from
a 32 TPS to 16. Oh well.

Steve

On Wed, 2006-08-09 at 16:05 -0500, Jim C. Nasby wrote:
> On Tue, Aug 08, 2006 at 10:45:07PM -0700, Steve Poe wrote:
> > Luke,
> > 
> > I thought so. In my test, I tried to be fair/equal since my Sun box has two
> > 4-disc arrays each on their own channel. So, I just used one of them which
> > should be a little slower than the 6-disc with 192MB cache.
> > 
> > Incidently, the two internal SCSI drives, which are on the 6i adapter,
> > generated a TPS of 18.
>  
> You should try putting pg_xlog on the 6 drive array with the data. My
> (limited) experience with such a config is that on a good controller
> with writeback caching enabled it won't hurt you, and if the internal
> drives aren't caching writes it'll probably help you a lot.
> 
> > I thought this server would impressive from notes I've read in the group.
> > This is why I thought I might be doing something wrong. I stumped which way
> > to take this. There is no obvious fault but something isn't right.
> > 
> > Steve
> > 
> > On 8/8/06, Luke Lonergan <[EMAIL PROTECTED]> wrote:
> > >
> > >Steve,
> > >
> > >> Sun box with 4-disc array (4GB RAM. 4 167GB 10K SCSI RAID10
> > >> LSI MegaRAID 128MB). This is after 8 runs.
> > >>
> > >> dbserver-dual-opteron-centos,08/08/06,Tuesday,20,us,12,2,5
> > >> dbserver-dual-opteron-centos,08/08/06,Tuesday,20,sy,59,50,53
> > >> dbserver-dual-opteron-centos,08/08/06,Tuesday,20,wa,1,0,0
> > >> dbserver-dual-opteron-centos,08/08/06,Tuesday,20,id,45,26,38
> > >>
> > >> Average TPS is 75
> > >>
> > >> HP box with 8GB RAM. six disc array RAID10 on SmartArray 642
> > >> with 192MB RAM. After 8 runs, I see:
> > >>
> > >> intown-vetstar-amd64,08/09/06,Tuesday,23,us,31,0,3
> > >> intown-vetstar-amd64,08/09/06,Tuesday,23,sy,16,0,1
> > >> intown-vetstar-amd64,08/09/06,Tuesday,23,wa,99,6,50
> > >> intown-vetstar-amd64,08/09/06,Tuesday,23,id,78,0,42
> > >>
> > >> Average TPS is 31.
> > >
> > >Note that the I/O wait (wa) on the HP box high, low and average are all
> > >*much* higher than on the Sun box.  The average I/O wait was 50% of one
> > >CPU, which is huge.  By comparison there was virtually no I/O wait on
> > >the Sun machine.
> > >
> > >This is indicating that your HP machine is indeed I/O bound and
> > >furthermore is tying up a PG process waiting for the disk to return.
> > >
> > >- Luke
> > >
> > >
> 


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org

[PERFORM] Beginner optimization questions, esp. regarding Tsearch2 configuration

2006-08-09 Thread Carl Youngblood


I'm trying to optimize a resume search engine that is using Tsearch2
indexes.  It's running on a dual-opteron 165 system with 4GB of ram
and a raid1 3Gb/sec SATA array.  Each text entry is about 2-3K of
text, and there are about 23,000 rows in the search table, with a goal
of reaching about 100,000 rows eventually.

I'm running Ubuntu 6.06 amd64 server edition.  The raid array is a
software-based linux array with LVM on top of it and the file system
for the database mount point is XFS.  The only optimization I've done
so far is to put the following in /etc/sysctl.conf:

kernel.shmall = 2097152
kernel.shmmax = 2147483648
kernel.shmmni = 4096
kernel.sem = 250 32000 100 128
fs.file-max = 65536

And in postgresql.conf I set the following parameters:

shared_buffers = 131072
work_mem = 65536
max_stack_depth = 4096
max_fsm_pages = 4
max_fsm_relations = 2000

These probably aren't ideal but I was hoping they would perform a
little better than the defaults.  I got the following results from a
pgbench script I picked up off the web:

CHECKPOINT
= sync ==
10 concurrent users...
transaction type: TPC-B (sort of)
scaling factor: 10
number of clients: 10
number of transactions per client: 100
number of transactions actually processed: 1000/1000
tps = 632.146016 (including connections establishing)
tps = 710.474526 (excluding connections establishing)

Once again I don't know if these results are good or not for my hardware.

I have a couple of questions:

- Does anyone have some good advice for optimizing postgres for
tsearch2 queries?
- I noticed that there are six different postmaster daemons running.
Only one of them is taking up a lot of RAM (1076m virtual and 584m
resident).  The second one is using 181m resident while the others are
less than 20m each.  Is it normal to have multiple postmaster
processes?  Even the biggest process doesn't seem to be using near as
much RAM as I have on this machine.  Is that bad?  What percentage of
my physical memory should I expect postgres to use for itself?  How
can I encourage it to cache more query results in memory?

Thanks in advance for your time.

Carl Youngblood

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match

40 matches

Mail list logo