Re: [PERFORM] db server load

2009-01-12 Thread Stefano Nichele

Hi Scott,
as you know since the other thread, I performed some tests:

-bash-3.1$ pgbench -c 50 -t 1000
starting vacuum...end.
transaction type: TPC-B (sort of)
scaling factor: 100
number of clients: 50
number of transactions per client: 1000
number of transactions actually processed: 50000/50000
tps = 377.351354 (including connections establishing)
tps = 377.788377 (excluding connections establishing)

Some vmstat samplings in the meantime:

procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
 r  b   swpd   free   buff   cache   si   so   bi    bo    in    cs us sy id wa st
 0  4     92 127880   8252 3294512    0    0  458 12399  2441 14903 22  9 34 35  0
11 49     92 125336   8288 3297016    0    0  392 11071  2390 11568 17  7 51 24  0
 0  2     92 124548   8304 3297764    0    0  126  8249  2291  3829  5  3 64 28  0
 0  1     92 127268   7796 3295672    0    0  493 11387  2323 14221 23  9 47 21  0
 0  2     92 127256   7848 3295492    0    0  501 10654  2215 14599 24  9 42 24  0
 0  2     92 125772   7892 3295656    0    0   34  7541  2311   327  0  1 59 40  0
 0  1     92 127188   7952 3294084    0    0  537 11039  2274 15460 23 10 43 24  0
 7  4     92 123816   7996 3298620    0    0  253  8946  2284  7310 11  5 52 32  0
 0  2     92 126652   8536 3294220    0    0  440  9563  2307  9036 13  6 56 25  0
 0 10     92 125268   8584 3296116    0    0  426 10696  2285 11034 20  9 39 32  0
 0  2     92 124168   8604 3297252    0    0  104  8385  2319  4162  3  3 40 54  0
 0  8     92 123780   8648 3296456    0    0  542 11498  2298 16613 25 10 16 48  0



-bash-3.1$ pgbench -t 1 -c 50 -S
starting vacuum...end.
transaction type: SELECT only
scaling factor: 100
number of clients: 50
number of transactions per client: 1
number of transactions actually processed: 50/50
tps = 8571.573651 (including connections establishing)
tps = 8594.357138 (excluding connections establishing)


(next test is with scaling factor 1)

-bash-3.1$ pgbench -t 2 -c 8 -S pgbench
starting vacuum...end.
transaction type: SELECT only
scaling factor: 1
number of clients: 8
number of transactions per client: 2
number of transactions actually processed: 16/16
tps = 11695.895318 (including connections establishing)
tps = 11715.603720 (excluding connections establishing)

Any comments?

I can give you also some details about database usage of my application:
- number of active connections: about 60
- number of idle connections: about 60

Here are some numbers from an old pgFouine report of mine:
- query peak: 378 queries/s
- select: 53.1%, insert: 3.8%, update: 2.2%, delete: 2.8%

The application is basically a web application and the db size is 37 GB.

How would you classify the load? Small/medium/high?

Cheers,
ste

Scott Marlowe wrote:

On Fri, Dec 12, 2008 at 3:07 AM, Stefano Nichele wrote:

Hi All,
I would like to ask you how many connections a db server can handle. I
know the question is not so easy, and actually I don't want to know a
"number" but something like:
- up to 100 connections: small load, low entry server is enough
- up to 200 connections: the db server starts to sweat, you need a dedicated
medium server
- up to 300 connections: hard work, dedicated server
- up to 500 connections: hard work, dedicated high level server

I would like just to understand when we can talk about small/medium/high
load.



Well, it's of course more than just how many connections you have.
What percentage of the connections are idle?  Are you running small
tight multi-statement transactions, or huge reporting queries?  The db
server we have at work routinely has 100+ connections, but of those,
there are only a dozen or so actively running, and they are small and
transactional in nature.  The machine handling this is very
overpowered, with 8 opteron cores and 12 disks in a RAID-10 for data
and 2 in another RAID-10 for everything else (pg_xlog, logging, etc)
under a very nice hardware RAID card with battery backed cache.  We've
tested it to much higher loads and it's held up quite well.

With the current load, and handling a hundred or so transactions per
second, the top of top looks like this:

top - 14:40:55 up 123 days,  2:24,  1 user,  load average: 1.08, 0.97, 1.04
Tasks: 284 total,   1 running, 283 sleeping,   0 stopped,   0 zombie
Cpu0  :  2.8%us,  0.4%sy,  0.0%ni, 96.7%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu1  :  2.5%us,  0.3%sy,  0.0%ni, 97.2%id,  0.1%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu2  :  2.5%us,  0.2%sy,  0.0%ni, 97.1%id,  0.1%wa,  0.1%hi,  0.0%si,  0.0%st
Cpu3  : 10.0%us,  0.7%sy,  0.0%ni, 89.0%id,  0.1%wa,  0.0%hi,  0.2%si,  0.0%st
Cpu4  : 

Re: [PERFORM] understanding postgres issues/bottlenecks

2009-01-12 Thread Matthew Wakeling

On Sun, 11 Jan 2009, M. Edward (Ed) Borasky wrote:

Where you *will* have some major OS risk is with testing-level software
or "bleeding edge" Linux distros like Fedora.


I just ran "uptime" on my home machine, and it said 144 days. Debian 
unstable, on no-name hardware. I guess the last time I rebooted was 
when I changed the graphics drivers. I can't remember the last time it 
actually crashed.


I guess the moral is, you find a combination of hardware and software that 
works well, and you may as well stick with it. The most likely things to 
destabilise the system are drivers for "interesting" pieces of hardware, 
like graphics cards and (unfortunately) some high-performance RAID cards.


Matthew

--
A good programmer is one who looks both ways before crossing a one-way street.
Considering the quality and quantity of one-way streets in Cambridge, it
should be no surprise that there are so many good programmers there.

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] multicolumn indexes still efficient if not fully stressed?

2009-01-12 Thread Jörg Kiegeland

Hello,

I created a multicolumn index on the columns c_1, ..., c_n.
If I use only a proper subset of these columns in a SQL query, is the
index still efficient?

Or is it better to create another multicolumn index defined on this subset?

Thanks for any comments!



Re: [PERFORM] multicolumn indexes still efficient if not fully stressed?

2009-01-12 Thread Mark Lewis
On Mon, 2009-01-12 at 18:49 +0100, Jörg Kiegeland wrote:
> Hello,
> 
> I created a multicolumn index on the columns c_1,..,c_n .
> If I do use only a true subset of these columns in a SQL query, is the 
> index still efficient?
> Or is it better to create another multicolumn index defined on this subset?
> 
> Thanks for any comments!

Why would you create a multicolumn index for all columns if that's not
what you actually query on?

The order of columns matters for multicolumn indexes.  Multicolumn
indexes work best for queries that use all of the columns in the index,
but can also be helpful if at least the leftmost columns in the index
are specified in the query.  So it depends on the order.

If the index is defined on (c_1, c_2, c_3, c_4) and your query includes:
"WHERE c_2=val AND c_3=val AND c_4=val", then the index is almost
certainly useless.

On the other hand, if you were to query "WHERE c_1=val" then if c_1 is
highly selective the index would still help.

See here:
http://www.postgresql.org/docs/8.3/interactive/indexes-multicolumn.html
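To make that concrete, a small sketch (generic table and column names, not
from the original post):

```sql
CREATE INDEX t_c1_c2_c3_c4_idx ON t (c_1, c_2, c_3, c_4);

-- Can use the index: the leftmost column is constrained.
SELECT * FROM t WHERE c_1 = 42;

-- Almost certainly cannot use it efficiently: c_1 is not constrained.
SELECT * FROM t WHERE c_2 = 1 AND c_3 = 2 AND c_4 = 3;
```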

-- Mark Lewis





Re: [PERFORM] multicolumn indexes still efficient if not fully stressed?

2009-01-12 Thread Andreas Kretschmer
Jörg Kiegeland wrote:

> Hello,
>
> I created a multicolumn index on the columns c_1,..,c_n .
> If I do use only a true subset of these columns in a SQL query, is the  
> index still efficient?
> Or is it better to create another multicolumn index defined on this subset?

Create a separate index for each column; since 8.1, PG can use a so-called
Bitmap Index Scan to combine them. Read more about that:

- http://www.postgresql-support.de/pgbook/node492.html
  (in German; I think you can read that)

- http://en.wikipedia.org/wiki/Bitmap_index
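A rough sketch of what that looks like in practice (invented table, not from
the post):

```sql
CREATE INDEX t_c1_idx ON t (c_1);
CREATE INDEX t_c2_idx ON t (c_2);

-- On 8.1+, EXPLAIN may show a BitmapAnd of two Bitmap Index Scans here:
EXPLAIN SELECT * FROM t WHERE c_1 = 1 AND c_2 = 2;
```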


Andreas
-- 
Really, I'm not out to destroy Microsoft. That will just be a completely
unintentional side effect.  (Linus Torvalds)
"If I was god, I would recompile penguin with --enable-fly."   (unknown)
Kaufbach, Saxony, Germany, Europe.  N 51.05082°, E 13.56889°



Re: [PERFORM] multicolumn indexes still efficient if not fully stressed?

2009-01-12 Thread Scott Marlowe
On Mon, Jan 12, 2009 at 12:23 PM, Andreas Kretschmer wrote:
> Jörg Kiegeland wrote:
>
>> Hello,
>>
>> I created a multicolumn index on the columns c_1,..,c_n .
>> If I do use only a true subset of these columns in a SQL query, is the
>> index still efficient?
>> Or is it better to create another multicolumn index defined on this subset?
>
> Create several indexes for each column, since 8.1 PG can use a so called
> Bitmap Index Scan. Read more about that:

I've found that when you frequently query on two or more columns, a
multi-column index is faster than bitmap scans, especially for larger
data sets.



Re: [PERFORM] multicolumn indexes still efficient if not fully stressed?

2009-01-12 Thread Alan Hodgson
On Monday 12 January 2009, "Scott Marlowe"  wrote:
> I've found that when you frequently query on two or more columns, a
> multi-column index is faster than bitmap scans, especially for larger
> data sets.

Very much faster, especially if you're only looking for a few dozen or 
hundred rows out of multi-million row tables.

-- 
Current Peeve: The mindset that the Internet is some sort of school for
novice sysadmins and that everyone -not- doing stupid dangerous things
should act like patient teachers with the ones who are. -- Bill Cole, NANAE 



[PERFORM] slow query

2009-01-12 Thread Scott Marlowe
OK, I've got a query that's running slow the first time, then fast.
But I can't see where the time is really being spent on the first run.
Query and plan attached to preserve formatting.

The index scan and nested loop that feed the next layer up nested loop
both show very short run times.  Yet the nested loop they feed to
takes 30 seconds to run.  If I run the query a second time, everything
looks the same but the second nested loop now runs in well under a
second.

I can't figure out where my time's going to.
=# explain analyze
SELECT u.*, to_char(u.registration_date, 'Month FMDD, ') AS f_registration_date
FROM users u
INNER JOIN (org_relationships_join j1
            INNER JOIN org_relationships_join j2 ON j1.relid = j2.relid)
        ON u.orgid = j2.orgid
WHERE j1.orgid = 142550 AND u.deleted = 0 AND u.type_id < 10
  AND ( lower(u.lname)    LIKE lower('boat%')
     OR lower(u.fname)    LIKE lower('boat%')
     OR lower(u.username) LIKE lower('boat%')
     OR lower(u.option1)  LIKE lower('boat%')
     OR lower(u.email)    LIKE lower('%boat%')
     OR lower(u.external_id) = lower('boat') )
ORDER BY lower(u.lname), lower(u.fname);

QUERY PLAN
-
 Sort  (cost=2625.26..2625.51 rows=99 width=271) (actual 
time=30123.181..30123.183 rows=4 loops=1)
   Sort Key: (lower((u.lname)::text)), (lower((u.fname)::text))
   Sort Method:  quicksort  Memory: 26kB
   ->  Nested Loop  (cost=0.00..2621.98 rows=99 width=271) (actual 
time=26919.298..30123.065 rows=4 loops=1)
 ->  Nested Loop  (cost=0.00..29.36 rows=20 width=4) (actual 
time=0.099..0.383 rows=35 loops=1)
   ->  Index Scan using org_relationships_orgid_idx on 
org_relationships_join j1  (cost=0.00..8.27 rows=1 width=4) (actual 
time=0.048..0.051 rows=1 loops=1)
 Index Cond: (orgid = 142550)
   ->  Index Scan using org_relationships_relid_idx on 
org_relationships_join j2  (cost=0.00..21.00 rows=7 width=8) (actual 
time=0.046..0.222 rows=35 loops=1)
 Index Cond: (j2.relid = j1.relid)
 ->  Index Scan using users_orgid_idx on users u  (cost=0.00..129.52 
rows=5 width=271) (actual time=843.825..860.638 rows=0 loops=35)
   Index Cond: (u.orgid = j2.orgid)
   Filter: ((u.type_id < 10) AND (u.deleted = 0) AND 
((lower((u.lname)::text) ~~ 'boat%'::text) OR (lower((u.fname)::text) ~~ 
'boat%'::text) OR (lower((u.username)::text) ~~ 'boat%'::text) OR 
(lower(u.option1) ~~ 'boat%'::text) OR (lower((u.email)::text) ~~ 
'%boat%'::text) OR (lower(u.external_id) = 'boat'::text)))
 Total runtime: 30123.405 ms


Second run:
 Sort  (cost=2625.30..2625.55 rows=99 width=271) (actual time=116.011..116.012 
rows=4 loops=1)
   Sort Key: (lower((u.lname)::text)), (lower((u.fname)::text))
   Sort Method:  quicksort  Memory: 26kB
   ->  Nested Loop  (cost=0.00..2622.02 rows=99 width=271) (actual 
time=79.531..115.958 rows=4 loops=1)
 ->  Nested Loop  (cost=0.00..29.36 rows=20 width=4) (actual 
time=0.036..0.126 rows=35 loops=1)
   ->  Index Scan using org_relationships_orgid_idx on 
org_relationships_join j1  (cost=0.00..8.27 rows=1 width=4) (actual 
time=0.016..0.018 rows=1 loops=1)
 Index Cond: (orgid = 142550)
   ->  Index Scan using org_relationships_relid_idx on 
org_relationships_join j2  (cost=0.00..21.00 rows=7 width=8) (actual 
time=0.014..0.068 rows=35 loops=1)
 Index Cond: (j2.relid = j1.relid)
 ->  Index Scan using users_orgid_idx on users u  (cost=0.00..129.52 
rows=5 width=271) (actual time=3.126..3.305 rows=0 loops=35)
   Index Cond: (u.orgid = j2.orgid)
   Filter: ((u.type_id < 10) AND (u.deleted = 0) AND 
((lower((u.lname)::text) ~~ 'boat%'::text) OR (lower((u.fname)::text) ~~ 
'boat%'::text) OR (lower((u.username)::text) ~~ 'boat%'::text) OR 
(lower(u.option1) ~~ 'boat%'::text) OR (lower((u.email)::text) ~~ 
'%boat%'::text) OR (lower(u.external_id) = 'boat'::text)))
 Total runtime: 116.214 ms




Re: [PERFORM] slow query

2009-01-12 Thread bricklen
On Mon, Jan 12, 2009 at 2:59 PM, Scott Marlowe  wrote:
> OK, I've got a query that's running slow the first time, then fast.
> But I can't see where the time is really being spent on the first run.
>  Query and plan attached to preserve formatting.
>
> The index scan and nested loop that feed the next layer up nested loop
> both show very short run times.  Yet the nested loop they feed to
> takes 30 seconds to run.  If I run the query a second time, everything
> looks the same but the second nested loop now runs in well under a
> second.
>
> I can't figure out where my time's going to.

If it is any help, there is a nice tool to format your explain plan at
http://explain.depesz.com



Re: [PERFORM] slow query

2009-01-12 Thread Gregory Williamson
Scott Marlowe wrote:

> 
> OK, I've got a query that's running slow the first time, then fast.
> But I can't see where the time is really being spent on the first run.
> Query and plan attached to preserve formatting.

Often this is from caching -- the first time the system has to go to disk to 
get the values; the subsequent times the data (and indexes, presumably) are all 
in RAM and so much faster.

Is this plausible?

Greg Williamson
Senior DBA
DigitalGlobe

Confidentiality Notice: This e-mail message, including any attachments, is for 
the sole use of the intended recipient(s) and may contain confidential and 
privileged information and must be protected in accordance with those 
provisions. Any unauthorized review, use, disclosure or distribution is 
prohibited. If you are not the intended recipient, please contact the sender by 
reply e-mail and destroy all copies of the original message.

(My corporate masters made me say this.)



Re: [PERFORM] slow query

2009-01-12 Thread Gregory Stark

"Scott Marlowe"  writes:

>  ->  Index Scan using users_orgid_idx on users u  (cost=0.00..129.52 
> rows=5 width=271) (actual time=843.825..860.638 rows=0 loops=35)
>Index Cond: (u.orgid = j2.orgid)
>Filter: ((u.type_id < 10) AND (u.deleted = 0) AND 
> ((lower((u.lname)::text) ~~ 'boat%'::text) OR (lower((u.fname)::text) ~~ 
> 'boat%'::text) OR (lower((u.username)::text) ~~ 'boat%'::text) OR 
> (lower(u.option1) ~~ 'boat%'::text) OR (lower((u.email)::text) ~~ 
> '%boat%'::text) OR (lower(u.external_id) = 'boat'::text)))

Not sure if this is what's going on but I find the high startup time for this
index scan suspicious. Either there are a lot of dead tuples (which would
explain the second run being fast if it marks them all as lp_dead) or there
are a lot of matching index pointers which fail those other constraints.
Assuming it's the latter perhaps some other index definition would let it zero
in on the right tuples more quickly instead of having to grovel through a lot
of irrelevant rows?
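
Purely as a speculative example of such an index definition, using the
constant filters from the posted plan, a partial index could exclude the
dead-weight rows up front:

```sql
-- Hypothetical: only index rows that can pass the constant filters,
-- so the scan on orgid touches far fewer irrelevant tuples.
CREATE INDEX users_orgid_live_idx ON users (orgid)
    WHERE deleted = 0 AND type_id < 10;
```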

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's Slony Replication support!



Re: [PERFORM] slow query

2009-01-12 Thread David Wilson
On Mon, Jan 12, 2009 at 5:59 PM, Scott Marlowe  wrote:
> I can't figure out where my time's going to.

Looks like it's going to:

->  Index Scan using users_orgid_idx on users u  (cost=0.00..129.52
rows=5 width=271) (actual time=843.825..860.638 rows=0 loops=35)

I'd guess the index/pages for users isn't in memory the first time around.

Next time is:

->  Index Scan using users_orgid_idx on users u  (cost=0.00..129.52
rows=5 width=271) (actual time=3.126..3.305 rows=0 loops=35)
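
The arithmetic backs that up: EXPLAIN ANALYZE reports actual time per loop,
so the first run's users scan accounts for essentially the whole runtime:

```sql
-- 860.638 ms per loop x 35 loops vs. the reported 30123 ms total,
-- and 3.305 ms x 35 loops vs. the 116 ms total on the second run
SELECT 860.638 * 35 AS slow_node_ms;   -- roughly 30122 ms
SELECT 3.305 * 35   AS cached_node_ms; -- roughly 116 ms
```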

-- 
- David T. Wilson
david.t.wil...@gmail.com



Re: [PERFORM] slow query

2009-01-12 Thread Scott Marlowe
On Mon, Jan 12, 2009 at 4:55 PM, David Wilson  wrote:
> On Mon, Jan 12, 2009 at 5:59 PM, Scott Marlowe wrote:
>> I can't figure out where my time's going to.
>
> Looks like it's going to:
>
> ->  Index Scan using users_orgid_idx on users u  (cost=0.00..129.52
> rows=5 width=271) (actual time=843.825..860.638 rows=0 loops=35)
>
> I'd guess the index/pages for users isn't in memory the first time around.

Exactly.  I keep forgetting to look at loops... sigh.  Thanks!



[PERFORM] Slow insert performance, 8.3 WAL related?

2009-01-12 Thread Bill Preston

Hi Group.
Recently upgraded from 8.1 to 8.3 on RHEL 5 64-bit.

I've noticed some performance problems that I am guessing are WAL 
related based on my browsing around and wondered if someone had some 
suggestions for tuning the WAL settings. It could also help if someone 
just laughed at me and told me it wasn't WAL.


I have narrowed the problem down to two pretty simple descriptions.

I had a data load that I was doing with 8.1. It involved about 250k SQL
statements that were inserts into a table with just one index. The index
has two fields.
With the upgrade to 8.3 that process started taking all night and half a
day. It inserted at the rate of 349 records a minute.
When I started working on the problem I decided to test by putting all
statements within a single transaction. Just a simple BEGIN at the
start and COMMIT at the end. Magically it only took 7 minutes to do the
whole set, or 40k per minute. That seemed very odd to me, but at least I
solved the problem.


The most recently noticed simple problem.
I had a table with about 20k records. We issued the statement DELETE
FROM table WHERE this=that.
The column was part of a combined index and about 8k records should have been
deleted.
This statement caused all other queries to grind to a halt. It was only
when I killed it that normal operation resumed. It was acting like a
lock, but that table was not being used by any other process.


So that describes what I am seeing, let me relay what we are doing with 
what I think to be the relevant settings.


For the log shipping, I am using scp to send the logs to a separate
server. And yes, they are getting sent.
I have it set now to send the log about every two minutes since I am
comfortable with that amount of data loss. Here are the WAL-related
settings from the config file that are not commented out (everything
below the WRITE AHEAD LOG section in the default config file):


synchronous_commit = off
checkpoint_segments = 3 # in logfile segments, min 1, 16MB each
checkpoint_timeout = 5min   # range 30s-1h
checkpoint_completion_target = 0.5  # checkpoint target duration, 0.0 - 1.0
checkpoint_warning = 30s            # 0 is off
archive_mode = on   # allows archiving to be done
archive_command = '/var/lib/pgsql/data/logship.sh %f %p' 
archive_timeout = 120   # force a logfile segment switch after this


Thanks for any help (or laughter)

Rusty





Re: [PERFORM] Slow insert performance, 8.3 WAL related?

2009-01-12 Thread Tom Lane
Bill Preston  writes:
> I've noticed some performance problems that I am guessing are WAL 
> related based on my browsing around and wondered if someone had some 
> suggestions for tuning the WAL settings. It could also help if someone 
> just laughed at me and told me it wasn't WAL.

Consider it done ;-).  I'm not sure what your problem is but it's
unlikely to be WAL, especially not if you're using the same WAL-related
settings in 8.1 and 8.3.

Which you might not be.  The large speedup from wrapping many small
inserts into one transaction is entirely expected and should have
occurred on 8.1 as well.  I am suspicious that you were running 8.1 with
fsync off and 8.3 has it on.  Do you still have your 8.1
postgresql.conf?  Comparing all the non-defaulted settings would be the
first thing to do.
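
A minimal way to do that comparison from psql (pg_settings exists in both
versions; this is a sketch, not tailored advice):

```sql
-- Quick check of the suspect setting:
SHOW fsync;

-- List everything not at its compiled-in default, for a side-by-side diff
-- between the 8.1 and 8.3 installs:
SELECT name, setting, source
FROM pg_settings
WHERE source <> 'default';
```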

If it's not that, I'm not sure.  One cross-version difference that comes
to mind is that 8.3 is a bit stickier about implicit casting, and so it
seems conceivable that something about these queries was considered
indexable in 8.1 and is not in 8.3.  But you've not provided enough
detail to do more than speculate.

regards, tom lane



Re: [PERFORM] Slow insert performance, 8.3 WAL related?

2009-01-12 Thread Alan Hodgson
On Monday 12 January 2009, Bill Preston wrote:
> I had a  data load that I was doing with 8.1. It involved about 250k sql
> statements that were inserts into a table with just one index. The index
> has two fields.
> With the upgrade to 8.3 that process started taking all night and 1/2 a
> day. It inserted at the rate of 349 records a minute.
> When I started working on the problem I decided to test by putting all
> statements within a single transaction. Just a simple BEGIN at the
> start and COMMIT at the end. Magically it only took 7 minutes to do the
> whole set, or 40k per minute. That seemed very odd to me, but at least I
> solved the problem.
>

That's well-known behaviour. If you don't do them in one big transaction, 
PostgreSQL has to fsync after every insert, which effectively limits your 
insert rate to the rotational speed of your WAL drive (roughly speaking). 
If you don't explicitly start and end transactions, PostgreSQL does it for 
you. For every statement.
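
Concretely, the batched form the original poster described amounts to
(hypothetical table):

```sql
BEGIN;
INSERT INTO load_target (a, b) VALUES (1, 'x');
INSERT INTO load_target (a, b) VALUES (2, 'y');
-- ... the rest of the ~250k inserts ...
COMMIT;  -- one WAL flush covers the whole batch instead of one per INSERT
```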

> The most recently noticed simple problem.
> I had  a table with about 20k records. We issued the statement DELETE
> FROM table where this=that.
> This was part of a combined index and about 8k records should have been
> deleted.
> This statement caused all other queries to grind to a halt. It was only
> when I killed it that normal operation resumed. It was acting like a
> lock, but that table was not being used by any other process.

Are there foreign keys on any other table(s) that point to this one? Are the 
relevant columns in those tables indexed?


-- 
Current Peeve: The mindset that the Internet is some sort of school for
novice sysadmins and that everyone -not- doing stupid dangerous things
should act like patient teachers with the ones who are. -- Bill Cole, NANAE 



Re: [PERFORM] dbt-2 tuning results with postgresql-8.3.5

2009-01-12 Thread Mark Wong
On Mon, Dec 22, 2008 at 7:27 AM, Kevin Grittner wrote:
> "Mark Wong" wrote:
>
>> The DL380 G5 is an 8 core Xeon E5405 with 32GB of
>> memory.  The MSA70 is a 25-disk 15,000 RPM SAS array, currently
>> configured as a 25-disk RAID-0 array.
>
>> number of connections (250):
>
>> Moving forward, what other parameters (or combinations of) do people
>> feel would be valuable to illustrate with this workload?
>
> To configure PostgreSQL for OLTP on that hardware, I would strongly
> recommend the use of a connection pool which queues requests above
> some limit on concurrent queries.  My guess is that you'll see best
> results with a limit somewhere around 40, based on my tests indicating
> that performance drops off above (cpucount * 2) + spindlecount.
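
For the hardware described above, that rule of thumb works out to:

```sql
-- (cpucount * 2) + spindlecount: 8 cores, 25 disks
SELECT (8 * 2) + 25 AS suggested_pool_limit;  -- 41, i.e. "somewhere around 40"
```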

It appears to peak around 220 database connections:

http://pugs.postgresql.org/node/514

Of course the system still isn't really tuned all that much...  I
wouldn't be surprised if the workload peaked at a different number of
connections as it is tuned more.

> I wouldn't consider tests of the other parameters as being very useful
> before tuning this.  This is more or less equivalent to the "engines"
> configuration in Sybase, for example.

Regards,
Mark



Re: [PERFORM] Slow insert performance, 8.3 WAL related?

2009-01-12 Thread Bill Preston
Wow, thanks for all the help, Tom and Alan. Sadly I was unaware of the
well-known behavior. Consider it more well known now.


Fsync is off in 8.3.
I am not too worried about what it was set to in 8.1 since we are not
going back.


For the first example (bad behavior when I am not using transactions):
Is there any way to tell that it is going on at a given point in time?
Is there a buffer that fills up, a stat somewhere that I can read?
A lot of our code isn't using transactions yet, so I would like a heads
up when this problem is happening, or if possible increase some parameter
so it happens less.


As to the second example with the delete. There are no foreign keys.
For the index: if the table has fields a, b, c and d,
we have a btree index (a, b, c, d)
and we are saying DELETE FROM table_messed_up WHERE a=x.

So the WHERE clause uses the first field of the index.

Now that you have given me more knowledge, let me ask a question that 
might lead to the answer.


Example 1 happens in isolation.
Example 2 happened on a live system with the parameters that I specified 
and a whole lot of sql statements without transactions being run at the 
same time. In fact there probably were a whole lot of inserts on this
very table before the delete statement was hit.


Is it possible that a problem like Example 1 caused the behavior that I
witnessed in Example 2? Was it waiting for the WAL to catch up or
something?


Thanks

Rusty


Alan Hodgson wrote:
On Monday 12 January 2009, Bill Preston wrote:

I had a  data load that I was doing with 8.1. It involved about 250k sql
statements that were inserts into a table with just one index. The index
has two fields.
With the upgrade to 8.3 that process started taking all night and 1/2 a
day. It inserted at the rate of 349 records a minute.
When I started working on the problem I decided to test by putting all
statements within a single transaction. Just a simple BEGIN at the
start and COMMIT at the end. Magically it only took 7 minutes to do the
whole set, or 40k per minute. That seemed very odd to me, but at least I
solved the problem.




That's well-known behaviour. If you don't do them in one big transaction, 
PostgreSQL has to fsync after every insert, which effectively limits your 
insert rate to the rotational speed of your WAL drive (roughly speaking). 
If you don't explicitly start and end transactions, PostgreSQL does it for 
you. For every statement.


  

The most recently noticed simple problem.
I had  a table with about 20k records. We issued the statement DELETE
FROM table where this=that.
This was part of a combined index and about 8k records should have been
deleted.
This statement caused all other queries to grind to a halt. It was only
when I killed it that normal operation resumed. It was acting like a
lock, but that table was not being used by any other process.



Are there foreign keys on any other table(s) that point to this one? Are the 
relevant columns in those tables indexed?



  


