Re: [PERFORM] How to allocate 8 disks

2008-03-03 Thread Matthew

On Sat, 1 Mar 2008, Craig James wrote:
Right, I do understand that, but reliability is not a top priority in this 
system.  The database will be replicated, and can be reproduced from the raw 
data.


So what you're saying is:

1. Reliability is not important.
2. There's zero write traffic once the database is set up.

If this is true, then RAID-0 is the way to go. I think Greg's options are 
good. Either:


2 discs RAID 1: OS
6 discs RAID 0: database + WAL

which is what we're using here (except with more discs), or:

8 discs RAID 10: everything

However, if reliability *really* isn't an issue, and you can accept 
reinstalling the system if you lose a disc, then there's a third option:


8 discs RAID 0: Everything
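
(As a concrete illustration of the first layout using Linux software RAID
-- the device names are assumptions, and a hardware controller would do
the same thing from its BIOS:

mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda /dev/sdb   # OS
mdadm --create /dev/md1 --level=0 --raid-devices=6 /dev/sd[c-h]        # db + WAL
)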

Matthew

--
Heat is work, and work's a curse. All the heat in the universe, it's
going to cool down, because it can't increase, then there'll be no
more work, and there'll be perfect peace.  -- Michael Flanders



[PERFORM] which is more important? freq of checkpoints or the duration of them?

2008-03-03 Thread Douglas J Hunley
Subject about says it all. Should I be more concerned about checkpoints
happening 'frequently' or lasting 'longer'? In other words, is it ok to
checkpoint, say, every 5 minutes if each one only lasts a second or three,
or better to have checkpoints every 10 minutes that last half a minute?
Stupid examples probably, but you get my point I hope :)
-- 
Douglas J Hunley (doug at hunley.homeip.net) - Linux User #174778
http://doug.hunley.homeip.net



Re: [PERFORM] How to choose a disc array for Postgresql?

2008-03-03 Thread Vivek Khera


On Mar 2, 2008, at 11:02 PM, Steve Poe wrote:


It seems the RAID card manufacturers have more to do with failures
than the drives themselves. Have you found a RAID card you did not
have to drop to U160?


The only array for which I've had to drop to U160 on an LSI card is
the Dell array.  I think the backplane is not fully U320 compliant,
but I have no real proof.  I had the same Seagate drives, which I
*know* work at U320 with an LSI card.


It seems only the Dell-branded LSI cards work with the Dell-branded
arrays at U320 -- at least they report that they're running at U320.


Because I'm leery of Adaptec, and it's hard to get decent arrays at
decent prices for the LSI cards, I've moved to using external RAID
enclosures attached via LSI Fibre Channel cards.


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your Subscription:
http://mail.postgresql.org/mj/mj_wwwusr?domain=postgresql.org&extra=pgsql-performance


Re: [PERFORM] How to choose a disc array for Postgresql?

2008-03-03 Thread Vivek Khera


On Mar 2, 2008, at 11:23 PM, Scott Marlowe wrote:


And I've never had any of the problems you list with LSI cards.  The
only issue I've seen is mediocre RAID-10 performance on their cards


I don't fault the LSI card.  The 320-2X is by far one of the fastest
cards I've ever used, and the most stable under FreeBSD.  The only
time I've had issues with the LSI cards is with Dell-branded disk
enclosures.


As for using U320 vs. SAS, I guess the decision would be based on  
cost.  The last systems I bought with big disks were over a year ago,  
so I don't know the pricing anymore.





Re: [PERFORM] How to allocate 8 disks

2008-03-03 Thread Mark Mielke

Matthew wrote:

On Sat, 1 Mar 2008, Craig James wrote:
Right, I do understand that, but reliability is not a top priority in 
this system.  The database will be replicated, and can be reproduced 
from the raw data.


So what you're saying is:

1. Reliability is not important.
2. There's zero write traffic once the database is set up.

If this is true, then RAID-0 is the way to go. I think Greg's options 
are good. Either:


2 discs RAID 1: OS
6 discs RAID 0: database + WAL

which is what we're using here (except with more discs), or:

8 discs RAID 10: everything


Has anybody been able to prove to themselves that RAID 0 vs RAID 1+0 is 
faster for these sorts of loads? My understanding is that RAID 1+0 *can* 
reduce latency for reads, but that it relies on random access, whereas 
RAID 0 performs best for sequential scans? Does PostgreSQL ever do 
enough random access to make RAID 1+0 shine?


Curious.

Thanks,
mark

--
Mark Mielke <[EMAIL PROTECTED]>




Re: [PERFORM] How to allocate 8 disks

2008-03-03 Thread Craig James

Matthew wrote:

On Sat, 1 Mar 2008, Craig James wrote:
Right, I do understand that, but reliability is not a top priority in 
this system.  The database will be replicated, and can be reproduced 
from the raw data.


So what you're saying is:

1. Reliability is not important.
2. There's zero write traffic once the database is set up.


Well, I actually didn't say either of those things, but I appreciate the 
feedback.  RAID 0 is an interesting suggestion, but given our constraints, it's 
not an option.  Reliability is important, but not as important as, say, a 
banking system.

And as far as zero write traffic, I don't know where that came from.  It's a 
"hitlist" based system, where complex search results are saved for the user in 
tables, and the write traffic can be quite high.

If this is true, then RAID-0 is the way to go. I think Greg's options 
are good. Either:


2 discs RAID 1: OS
6 discs RAID 0: database + WAL

which is what we're using here (except with more discs), or:

8 discs RAID 10: everything


Right now, an 8-disk RAID 10 is looking like the best choice.  The Dell Perc 6i 
has configurations that include a battery-backed cache, so performance should 
be quite good.

However, if reliability *really* isn't an issue, and you can accept 
reinstalling the system if you lose a disc, then there's a third option:


8 discs RAID 0: Everything


I imagine the MTBF on a system like this would be < 1 year, which is out of the 
question, even with a backup system that can take over.  A failure completely 
wipes the system, OS and everything, so you're guaranteed that once or twice a 
year, you have to rebuild your system from the ground up.  I'd rather spend that 
time at the beach!

Craig



Re: [PERFORM] How to allocate 8 disks

2008-03-03 Thread Matthew

On Mon, 3 Mar 2008, Mark Mielke wrote:
Has anybody been able to prove to themselves that RAID 0 vs RAID 1+0 is 
faster for these sorts of loads? My understanding is that RAID 1+0 *can* 
reduce latency for reads, but that it relies on random access, whereas RAID 0 
performs best for sequential scans? Does PostgreSQL ever do enough random 
access to make RAID 1+0 shine?


Theoretically the performance of RAID 0 and RAID 10 should be identical 
for reads, both seeks and throughput, assuming you have a sensible 
readahead and a good controller. For writes, RAID 10 needs to write to 
multiple drives, so is slower. Whether this is true in reality is another 
matter, as all sorts of factors come in, not least how good your 
controller is at managing the arrangement.


Matthew

--
The only secure computer is one that's unplugged, locked in a safe,
and buried 20 feet under the ground in a secret location...and i'm not
even too sure about that one. --Dennis Hughes, FBI



Re: [PERFORM] How to choose a disc array for Postgresql?

2008-03-03 Thread Vivek Khera


On Mar 3, 2008, at 12:16 AM, Greg Smith wrote:

I've collected up many of the past list comments on this subject and
put a summary at http://www.postgresqldocs.org/index.php/SCSI_vs._IDE/SATA_Disks


I'll add a recommendation of Partners Data Systems
(http://www.partnersdata.com/) as a great vendor of SATA RAID subsystems
(the 16-disk units I have are based on an Areca controller and have dual
FC output).





Re: [PERFORM] How to allocate 8 disks

2008-03-03 Thread Mark Mielke

Matthew wrote:

On Mon, 3 Mar 2008, Mark Mielke wrote:
Has anybody been able to prove to themselves that RAID 0 vs RAID 1+0 
is faster for these sorts of loads? My understanding is that RAID 1+0 
*can* reduce latency for reads, but that it relies on random access, 
whereas RAID 0 performs best for sequential scans? Does PostgreSQL 
ever do enough random access to make RAID 1+0 shine?
Theoretically the performance of RAID 0 and RAID 10 should be 
identical for reads, both seeks and throughput, assuming you have a 
sensible readahead and a good controller. For writes, RAID 10 needs to 
write to multiple drives, so is slower. Whether this is true in 
reality is another matter, as all sorts of factors come in, not least 
how good your controller is at managing the arrangement.


I don't think your statement that they should be identical is true - 
RAID 1+0 can satisfy any given read from at least two drives. A good 
controller can satisfy half the reads from one side of the array, and 
half the reads from the other side of the array, where the first set 
does not have to wait for the second set before continuing. By 
contrast, sequential reads of a RAID 1+0 system are almost always HALF 
the speed of sequential reads of a RAID 0 system with the same number 
of drives. The hardware read-ahead on the RAID 1+0 system is being 
wasted: even if you did leap from one side of the array to the other, 
each side ends up "skipping" the data served by the other side, making 
any caching ineffective.
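
(A rough back-of-the-envelope version of that claim, assuming ~80 MB/s
per drive and a controller that doesn't interleave mirror reads:

  8 discs RAID 0:  8 stripe members read in parallel ~ 8 x 80 = 640 MB/s
  8 discs RAID 10: 4 stripe members, mirrors idle    ~ 4 x 80 = 320 MB/s

For random reads, both layouts have all 8 spindles available to seek,
which is why the theoretical gap shows up mainly in sequential scans.)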


The question I have is not whether RAID 1+0 vs RAID 0 show different 
characteristics. I know they do based upon my own analysis. My question 
is whether PostgreSQL disk access patterns for certain loads ever 
benefit from RAID 1+0, or whether RAID 1+0 is always a bad choice for 
performance-only (completely ignore reliability) loads.


Cheers,
mark

--
Mark Mielke <[EMAIL PROTECTED]>




Re: [PERFORM] which is more important? freq of checkpoints or the duration of them?

2008-03-03 Thread Scott Marlowe
On Mon, Mar 3, 2008 at 8:25 AM, Douglas J Hunley <[EMAIL PROTECTED]> wrote:
> Subject about says it all. Should I be more concerned about checkpoints
>  happening 'frequently' or lasting 'longer'? In other words, is it ok to
>  checkpoint, say, every 5 minutes if each one only lasts a second or three,
>  or better to have checkpoints every 10 minutes that last half a minute?
>  Stupid examples probably, but you get my point I hope :)

The answer is, of course, it depends.

If you do a lot of batch processing where you move a lot of data in a
stream into the database, then fewer, larger checkpoints are probably a
win.

Or is this a transactional system that has to run transactions in
under x seconds?  Then more, smaller checkpoints might make sense.

And then, you might be better off tuning the bgwriter.  If tuned
properly, it will keep far enough ahead of your checkpoints that they
have very little work left to do when they happen.  That comes with a
price -- some small percentage of peak performance -- in exchange for
smoother behaviour.
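
(A sketch of the 8.1/8.2 bgwriter knobs involved -- the values below are
illustrative starting points, not recommendations:

bgwriter_delay = 200            # ms between bgwriter rounds
bgwriter_lru_percent = 10.0     # % of buffers scanned ahead of backends
bgwriter_lru_maxpages = 200     # cap on dirty pages written per round
bgwriter_all_percent = 5.0      # % of the whole pool scanned per round
bgwriter_all_maxpages = 600
)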



[PERFORM] Performance problems deleting data

2008-03-03 Thread Rafael Martinez
Hello

---
Postgresql version: 8.1.10
4GB RAM
2x HP 72GB 10K SAS RAID1/smartarray
---

I have a colleague who is having some performance problems from time to
time when deleting some rows from a table.

We found out that the database having this problem had a severe bloat
problem in many tables and indexes (they were running only autovacuum)
and some misconfiguration in postgresql.conf.

What we did to fix the situation was:

1) Stop the application accessing the database.
2) Change these parameters in postgresql.conf:
-
shared_buffers = 108157
work_mem = 16384
maintenance_work_mem = 262144

max_fsm_pages = 80

wal_buffers = 64
checkpoint_segments = 128

random_page_cost = 2.0
effective_cache_size = 255479

default_statistics_target = 400
-
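
(For reference, 8.1 expresses several of these in 8 kB pages or kB:
shared_buffers 108157 pages ~ 845 MB, effective_cache_size 255479 pages
~ 2 GB, work_mem 16384 kB = 16 MB, maintenance_work_mem 262144 kB = 256 MB.)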

3) Update /etc/sysctl.conf with new values for kernel.shmmax and
kernel.shmall
4) Run 'VACUUM FULL VERBOSE'
5) Run 'REINDEX DATABASE '
6) Run 'ANALYZE VERBOSE'
7) Define a 'VACUUM VERBOSE ANALYZE' in crontab
8) Start the application.
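
(A hypothetical crontab entry for step 7 -- the schedule and log path
are assumptions; only the database name 'manage' comes from the psql
session below:

30 3 * * * psql -d manage -c 'VACUUM VERBOSE ANALYZE' >> /var/log/pg_vacuum.log 2>&1
)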

These changes helped a lot: the size of the database went down from 7GB
to 1GB and most of the deletes work as they are supposed to. But from
time to time a single deletion takes a long time to finish. The output
from EXPLAIN ANALYZE doesn't show anything wrong, as far as I can see.

The definition of the table 'module' is:
-
manage=# \d module
Table "public.module"
  Column   |            Type             |                         Modifiers
-----------+-----------------------------+------------------------------------------------------------
 moduleid  | integer                     | not null default nextval('module_moduleid_seq'::regclass)
 deviceid  | integer                     | not null
 netboxid  | integer                     | not null
 module    | integer                     | not null
 model     | character varying           |
 descr     | character varying           |
 up        | character(1)                | not null default 'y'::bpchar
 downsince | timestamp without time zone |
Indexes:
"module_pkey" PRIMARY KEY, btree (moduleid)
"module_deviceid_key" UNIQUE, btree (deviceid)
"module_netboxid_key" UNIQUE, btree (netboxid, module)
Check constraints:
"module_up" CHECK (up = 'y'::bpchar OR up = 'n'::bpchar)
Foreign-key constraints:
"$1" FOREIGN KEY (deviceid) REFERENCES device(deviceid) ON UPDATE
CASCADE ON DELETE CASCADE
"$2" FOREIGN KEY (netboxid) REFERENCES netbox(netboxid) ON UPDATE
CASCADE ON DELETE CASCADE
Rules:
close_alerthist_modules AS
ON DELETE TO module DO  UPDATE alerthist SET end_time = now()
  WHERE (alerthist.eventtypeid::text = 'moduleState'::text OR
alerthist.eventtypeid::text = 'linkState'::text) AND alerthist.end_time
= 'infinity'::timestamp without time zone AND alerthist.deviceid =
old.deviceid
-


manage=# EXPLAIN ANALYZE DELETE FROM module WHERE deviceid='7298';
QUERY PLAN
-
 Nested Loop  (cost=0.00..14.63 rows=1 width=67) (actual
time=2.365..2.365 rows=0 loops=1)
   ->  Index Scan using alerthist_end_time_btree on alerthist
(cost=0.00..10.65 rows=1 width=67) (actual time=2.363..2.363 rows=0 loops=1)
 Index Cond: (end_time = 'infinity'::timestamp without time zone)
 Filter: ((((eventtypeid)::text = 'moduleState'::text) OR
((eventtypeid)::text = 'linkState'::text)) AND (7298 = deviceid))
   ->  Index Scan using module_deviceid_key on module  (cost=0.00..3.96
rows=1 width=4) (never executed)
 Index Cond: (deviceid = 7298)
 Total runtime: 2.546 ms

 Index Scan using module_deviceid_key on module  (cost=0.00..3.96 rows=1
width=6) (actual time=0.060..0.061 rows=1 loops=1)
   Index Cond: (deviceid = 7298)
 Trigger for constraint $1: time=3.422 calls=1
 Trigger for constraint $1: time=0.603 calls=1
 Total runtime: 2462558.813 ms
(13 rows)
-

Any ideas why it is taking 2462558.813 ms to finish when the total time
for the deletion is 2.546 ms + 3.422 ms + 0.603 ms?

The deletion of a row in the 'module' table involves several
deletions/updates in many other tables in the database related by
foreign keys (with ON DELETE CASCADE) and triggers.

I suppose that an open transaction in one of the tables not directly
related to 'module' could block the deletion without it showing up in
EXPLAIN ANALYZE? The two 'Trigger for constraint' lines in the EXPLAIN
ANALYZE output only show the two tables that reference 'module' via a
foreign key, but if those tables in turn have to wait on other tables,
that would not show up anywhere except in pg_locks?

Thanks in advance
regards
-- 
 Rafael Martinez, <[EMAIL PROTECTED]>
 Center for Information Technology Services
 Uni

Re: [PERFORM] How to allocate 8 disks

2008-03-03 Thread Scott Marlowe
On Mon, Mar 3, 2008 at 8:48 AM, Mark Mielke <[EMAIL PROTECTED]> wrote:
> Matthew wrote:
>  > On Sat, 1 Mar 2008, Craig James wrote:
>  >> Right, I do understand that, but reliability is not a top priority in
>  >> this system.  The database will be replicated, and can be reproduced
>  >> from the raw data.
>  >
>  > So what you're saying is:
>  >
>  > 1. Reliability is not important.
>  > 2. There's zero write traffic once the database is set up.
>  >
>  > If this is true, then RAID-0 is the way to go. I think Greg's options
>  > are good. Either:
>  >
>  > 2 discs RAID 1: OS
>  > 6 discs RAID 0: database + WAL
>  >
>  > which is what we're using here (except with more discs), or:
>  >
>  > 8 discs RAID 10: everything
>
>  Has anybody been able to prove to themselves that RAID 0 vs RAID 1+0 is
>  faster for these sorts of loads? My understanding is that RAID 1+0 *can*
>  reduce latency for reads, but that it relies on random access, whereas
>  RAID 0 performs best for sequential scans? Does PostgreSQL ever do
>  enough random access to make RAID 1+0 shine?

RAID 1+0 has certain theoretical advantages in parallel access
scenarios that straight RAID-0 wouldn't have.  I.e. if you used n>2
disks in a mirror and built a RAID-0 out of those types of mirrors,
then you could theoretically have n users reading data on the same
"drive"  (the raid-1 underneath the raid-0) at the same time where
RAID-0 would only have the one disk to read from.  The effects of this
advantage are dulled by caching, depending on how much of the data set
you can cache.  With a system that can cache its whole data set in
memory (not uncommon for transactional systems) or at least a large
percentage, the n>2 RAID-1 sets aren't that big of an advantage.

RAID-0 of n drives should behave pretty similarly to RAID-10 with 2n
drives for most types of access.  I.e. no better or worse for
sequential or random access, if the number of drives is equivalent.



Re: [PERFORM] which is more important? freq of checkpoints or the duration of them?

2008-03-03 Thread Greg Smith

On Mon, 3 Mar 2008, Douglas J Hunley wrote:

In other words, is it ok to checkpoint, say, every 5 minutes if each 
one only lasts a second or three, or better to have checkpoints every 
10 minutes that last half a minute?


When checkpoints do too much work at once they will block clients for a 
significant period of time near the end--anywhere from 2 to 8 seconds 
isn't unusual.  Every client on the system will just hang, then they all 
start responding again in a batch when the checkpoint is finished.


With that as the problematic case, if you can keep the duration of the 
checkpoint processing minimal by having them happen more frequently, then 
that's the better approach.  You can't push that interval too small though 
or your system will be continuously checkpointing.


In cases where checkpoints hurt no matter how often you do them, it 
makes sense to have them as infrequently as possible, so at least you 
minimize the number of times the disruption happens.
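
(To make the trade-off concrete: in 8.1/8.2 the interval is governed by
whichever of these two limits is hit first -- values here are
illustrative only:

checkpoint_timeout = 300     # seconds: checkpoint at least every 5 minutes
checkpoint_segments = 16     # ...or after ~16 x 16 MB = 256 MB of WAL
)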


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD



Re: [PERFORM] which is more important? freq of checkpoints or the duration of them?

2008-03-03 Thread Greg Smith

On Mon, 3 Mar 2008, Chris Browne wrote:

Now, if you have reasonable settings (I'm not sure how well its tuning 
is documented :-(), checkpoint "flushes" should be able to be short, 
however infrequent they may be. In effect, the "oops, the database got 
blocked by checkpoint flushing" issue should now be gone...


Ah, if only it were true.  The background writer can be made to work 
fairly well in circa early 8.1 setups where the shared_buffers cache is 
small.  But on more current systems where there's a lot of memory 
involved, you can't get a tuning aggressive enough to make checkpoint 
spikes small without wasting a bunch of I/O writing buffers that will just 
get dirty again before the checkpoint.  Since the kinds of systems that 
have nasty checkpoint spikes are also I/O bound in general, there is no 
good way to resolve that struggle using the code in 8.1 and 8.2.


The checkpoint_completion_target tunable and related code in 8.3 is the 
first approach to this issue that has a good foundation even with larger 
buffer caches.  You can band-aid some cases well enough to improve things 
with the background writer in earlier versions, but it's certainly not 
guaranteed that it's possible even if you spend lots of time fiddling with 
the settings.
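
(A minimal 8.3 sketch of that approach -- the values are illustrative;
checkpoint_completion_target = 0.9 spreads each checkpoint's writes
across 90% of the interval to the next one:

checkpoint_segments = 32
checkpoint_timeout = 5min
checkpoint_completion_target = 0.9
)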


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD



Re: [PERFORM] Performance problems deleting data

2008-03-03 Thread Tom Lane
Rafael Martinez <[EMAIL PROTECTED]> writes:
> manage=# EXPLAIN ANALYZE DELETE FROM module WHERE deviceid='7298';
> QUERY PLAN
> -
>  Nested Loop  (cost=0.00..14.63 rows=1 width=67) (actual
> time=2.365..2.365 rows=0 loops=1)
>->  Index Scan using alerthist_end_time_btree on alerthist
> (cost=0.00..10.65 rows=1 width=67) (actual time=2.363..2.363 rows=0 loops=1)
>  Index Cond: (end_time = 'infinity'::timestamp without time zone)
>  Filter: ((((eventtypeid)::text = 'moduleState'::text) OR
> ((eventtypeid)::text = 'linkState'::text)) AND (7298 = deviceid))
>->  Index Scan using module_deviceid_key on module  (cost=0.00..3.96
> rows=1 width=4) (never executed)
>  Index Cond: (deviceid = 7298)
>  Total runtime: 2.546 ms

>  Index Scan using module_deviceid_key on module  (cost=0.00..3.96 rows=1
> width=6) (actual time=0.060..0.061 rows=1 loops=1)
>Index Cond: (deviceid = 7298)
>  Trigger for constraint $1: time=3.422 calls=1
>  Trigger for constraint $1: time=0.603 calls=1
>  Total runtime: 2462558.813 ms
> (13 rows)
> -

> Any ideas why it is taking 2462558.813 ms to finish when the total time
> for the deletion is 2.546 ms + 3.422 ms + 0.603ms?

That's just bizarre.  So far as I can see from the 8.1 EXPLAIN code,
the only place the extra time could be spent is in ExecutorStart,
ExecutorEnd, or the top level of ExecutorRun, none of which should
take any noticeable amount of runtime in a trivial query like this.

The only thing I can think of is that ExecutorStart would have been
where we'd acquire RowExclusiveLock on "module", while the previous
rule-generated query would only take AccessShareLock.  So if for
instance some other transaction had ShareLock (perhaps from CREATE
INDEX) and just sat a long time before committing, perhaps this
would be explainable.  I'm not too sure about that explanation
though because I think the parser should have already taken
RowExclusiveLock when it was doing parse analysis.

Is the problem repeatable?  Is the delay consistent?  What do
you see in pg_locks while it's delaying?  Also watch "vmstat 1"
output --- is it consuming CPU and/or I/O?
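
A minimal sketch of such a query, assuming 8.1's
pg_stat_activity.procpid column, to spot who is waiting and on what:

SELECT l.pid, l.relation::regclass, l.mode, l.granted, a.current_query
FROM pg_locks l JOIN pg_stat_activity a ON a.procpid = l.pid
WHERE NOT l.granted;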

regards, tom lane



Re: [PERFORM] which is more important? freq of checkpoints or the duration of them?

2008-03-03 Thread Chris Browne
[EMAIL PROTECTED] (Douglas J Hunley) writes:
> Subject about says it all. Should I be more concerned about checkpoints 
> happening 'frequently' or lasting 'longer'? In other words, is it ok to 
> checkpoint, say, every 5 minutes if each one only lasts a second or three, 
> or better to have checkpoints every 10 minutes that last half a minute? 
> Stupid examples probably, but you get my point I hope :)

Well, with the (new-in-8.1) background writer, you should be able to
have whatever combination might appear attractive, as the point of the
background writer is to push out dirty pages.

Pre-8.1, your choice would be either to:
a) Flush frequently, and so have the checkpoints be of short duration, or
b) Flush infrequently, so that the checkpoint flushes would have a long
   duration.

Now, if you have reasonable settings (I'm not sure how well its tuning
is documented :-(), checkpoint "flushes" should be able to be short,
however infrequent they may be.

In effect, the "oops, the database got blocked by checkpoint flushing"
issue should now be gone...

The issue that then remains is whether to checkpoint often, in which
case crash recovery will tend to be be quicker, or whether to
checkpoint seldom, in which case crash recovery will have fewer
checkpoints to choose from, and hence will run somewhat longer.

If your systems don't crash much, and recovery time isn't a big deal,
then this probably doesn't much matter...
-- 
(reverse (concatenate 'string "ofni.sesabatadxunil" "@" "enworbbc"))
http://linuxfinances.info/info/sap.html
"I don't plan to maintain it, just to install it." -- Richard M. Stallman



Re: [PERFORM] How to choose a disc array for Postgresql?

2008-03-03 Thread Ron Mayer

Greg Smith wrote:

On Sat, 1 Mar 2008, Steve Poe wrote:

SATA over SCSI?


I've collected up many of the past list comments on this subject and put 
a summary at 
http://www.postgresqldocs.org/index.php/SCSI_vs._IDE/SATA_Disks


Should this section:

 ATA Disks... Always default to the write cache enabled... turn it off

be amended to say that if you have an OS that supports write
barriers (Linux kernels newer than early 2005) you shouldn't worry
about this?

And perhaps the SCSI section should also be amended to say that
the same 2.6 kernels that fail to send the IDE FLUSH CACHE
command also fail to send the SCSI SYNCHRONIZE CACHE command,
so you should go through the same cache-disabling hoops there?


References from the Linux SATA driver guy and lwn here:
http://hardware.slashdot.org/comments.pl?sid=149349&cid=12519114
http://lwn.net/Articles/77074/
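
(For anyone following along, disabling an ATA drive's write cache is
typically done with hdparm -- the device name is an assumption, and the
setting may need reapplying at boot:

hdparm -W 0 /dev/sda    # -W 0 turns the drive's write cache off
hdparm -W 1 /dev/sda    # ...and this turns it back on
)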



[PERFORM] Performance tuning on FreeBSD

2008-03-03 Thread alan bryan
I've got a new server and am myself new to tuning postgres.

Server is an 8 core Xeon 2.33GHz, 8GB RAM, RAID 10 on a 3ware 9550SX-4LP w/ BBU.

It's serving as the DB for a fairly write intensive (maybe 25-30%) Web
application in PHP.  We are not using persistent connections, thus the
high max connections.

I've done the following so far:

> cat /boot/loader.conf
kern.ipc.semmni=256
kern.ipc.semmns=512
kern.ipc.semmnu=256

> cat /etc/sysctl.conf
kern.ipc.shmall=393216
kern.ipc.shmmax=1610612736
kern.ipc.semmap=256
kern.ipc.shm_use_phys=1

postgresql.conf settings (changed from Default):
max_connections = 180
shared_buffers = 1024MB
maintenance_work_mem = 128MB
wal_buffers = 1024kB

I then set up a test database for running pgbench with scaling factor
100. I then ran:
> pgbench -c 100 -t 1000 testdb
and got:
starting vacuum...end.
transaction type: TPC-B (sort of)
scaling factor: 100
number of clients: 100
number of transactions per client: 1000
number of transactions actually processed: 100000/100000
tps = 557.095867 (including connections establishing)
tps = 558.013714 (excluding connections establishing)

Just for testing, I tried turning off fsync and got:
> pgbench -c 100 -t 1000 testdb
starting vacuum...end.
transaction type: TPC-B (sort of)
scaling factor: 100
number of clients: 100
number of transactions per client: 1000
number of transactions actually processed: 100000/100000
tps = 4014.075114 (including connections establishing)
tps = 4061.662041 (excluding connections establishing)

Do these numbers sound in line with what I should be seeing?  What else
can I do to try to get better performance in the more general sense
(knowing that specifics are tied to real-world data and testing)?  Any
hints for FreeBSD-specific tuning would be helpful.



Re: [PERFORM] Performance tuning on FreeBSD

2008-03-03 Thread Bill Moran
"alan bryan" <[EMAIL PROTECTED]> wrote:
>
> I've got a new server and am myself new to tuning postgres.
> 
> Server is an 8 core Xeon 2.33GHz, 8GB RAM, RAID 10 on a 3ware 9550SX-4LP w/ 
> BBU.
> 
> It's serving as the DB for a fairly write intensive (maybe 25-30%) Web
> application in PHP.  We are not using persistent connections, thus the
> high max connections.
> 
> I've done the following so far:
> 
> > cat /boot/loader.conf
> kern.ipc.semmni=256
> kern.ipc.semmns=512
> kern.ipc.semmnu=256
> 
> > cat /etc/sysctl.conf
> kern.ipc.shmall=393216
> kern.ipc.shmmax=1610612736

I would just set this to 2G (which is the max).  It doesn't really hurt
anything if you don't use it all.
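
A sketch of that change -- shmall is the same limit expressed in pages,
assuming the 4 kB page size:

kern.ipc.shmmax=2147483648    # 2 GB, the maximum
kern.ipc.shmall=524288        # 2147483648 / 4096-byte pages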

> kern.ipc.semmap=256
> kern.ipc.shm_use_phys=1
> 
> postgresql.conf settings (changed from Default):
> max_connections = 180
> shared_buffers = 1024MB

Why not 2G, which would be 25% of total memory?

> maintenance_work_mem = 128MB
> wal_buffers = 1024kB
> 
> I then set up a test database for running pgbench with scaling factor
> 100. I then ran:
> > pgbench -c 100 -t 1000 testdb
> and got:
> starting vacuum...end.
> transaction type: TPC-B (sort of)
> scaling factor: 100
> number of clients: 100
> number of transactions per client: 1000
> number of transactions actually processed: 100000/100000
> tps = 557.095867 (including connections establishing)
> tps = 558.013714 (excluding connections establishing)
> 
> Just for testing, I tried turning off fsync and got:
> > pgbench -c 100 -t 1000 testdb
> starting vacuum...end.
> transaction type: TPC-B (sort of)
> scaling factor: 100
> number of clients: 100
> number of transactions per client: 1000
> number of transactions actually processed: 100000/100000
> tps = 4014.075114 (including connections establishing)
> tps = 4061.662041 (excluding connections establishing)
> 
> Do these numbers sound in line with what I should be seeing?  What else
> can I do to try to get better performance in the more general sense
> (knowing that specifics are tied to real-world data and testing)?  Any
> hints for FreeBSD-specific tuning would be helpful.

Are you running FreeBSD 7?  If performance is of the utmost importance,
then you need to be running the 7.X branch.

Based on your pgbench results, I'm guessing you didn't get battery-backed
cache on your systems?  That makes a big difference no matter what OS
you're using.

Besides that, I can't think of any FreeBSD-specific things to do.  Basically,
general tuning advice applies to FreeBSD as well as to most other OSes.

-- 
Bill Moran
Collaborative Fusion Inc.

[EMAIL PROTECTED]
Phone: 412-422-3463x4023



Re: [PERFORM] Performance tuning on FreeBSD

2008-03-03 Thread alan bryan
On Mon, Mar 3, 2008 at 4:26 PM, Bill Moran
<[EMAIL PROTECTED]> wrote:

>  > > cat /boot/loader.conf
>  > kern.ipc.semmni=256
>  > kern.ipc.semmns=512
>  > kern.ipc.semmnu=256
>  >
>  > > cat /etc/sysctl.conf
>  > kern.ipc.shmall=393216
>  > kern.ipc.shmmax=1610612736
>
>  I would just set this to 2G (which is the max).  It doesn't really hurt
>  anything if you don't use it all.

I'll try that and report back.


>  > kern.ipc.semmap=256
>  > kern.ipc.shm_use_phys=1
>  >
>  > postgresql.conf settings (changed from Default):
>  > max_connections = 180
>  > shared_buffers = 1024MB
>
>  Why not 2G, which would be 25% of total memory?


Ditto - I'll report back.



>  Are you running FreeBSD 7?  If performance is of the utmost importance,
>  then you need to be running the 7.X branch.
>
>  Based on your pgbench results, I'm guessing you didn't get battery-backed
>  cache on your systems?  That makes a big difference no matter what OS
>  you're using.
>
>  Besides that, I can't think of any FreeBSD-specific things to do.  Basically,
>  general tuning advice applies to FreeBSD as well as to most other OS.

Yes, FreeBSD 7.0-RELEASE.  I tried both the 4BSD and ULE schedulers and
didn't see much difference with this test.
I do have the battery for the 3ware and it is enabled.  I'll do some
bonnie++ benchmarks and make sure the disk is near where it should be.
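
(A sketch of such a bonnie++ run -- the data directory, user, and the
twice-RAM file size are assumptions:

bonnie++ -d /usr/local/pgsql -s 16g -n 0 -u pgsql

-s 16g keeps the test file at twice RAM so caching doesn't dominate the
result, and -n 0 skips the small-file creation tests.)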

Should turning off fsync make things roughly 8x-10x faster?  Or is
that indicative of something not being correct or tuned quite right in
the rest of the system?  I'll have to run in production with fsync on
but was just testing to see how much of an effect it had.

Thanks,
Alan



Re: [PERFORM] Performance tuning on FreeBSD

2008-03-03 Thread Greg Smith

On Mon, 3 Mar 2008, alan bryan wrote:


pgbench -c 100 -t 1000 testdb

tps = 558.013714 (excluding connections establishing)

Just for testing, I tried turning off fsync and got:
tps = 4061.662041 (excluding connections establishing)


This is odd.  ~500 is what I expect from this test when there is no write 
cache to accelerate fsync, while ~4000 is normal for your class of 
hardware when you have such a cache.  Since you say your 3Ware card is 
setup with a cache and a BBU, that's suspicious--you should be able to get 
around 4000 with fsync on.  Any chance you have the card set to 
write-through instead of write-back?  That's the only thing that comes to 
mind that would cause this.
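
(For anyone checking this on a 3ware card, tw_cli can show and change the
unit's cache setting -- the controller and unit numbers here are
assumptions:

tw_cli /c0/u0 show            # look at the Cache field in the output
tw_cli /c0/u0 set cache=on
)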


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD
