[PERFORM] Wrong docs on wal_buffers?

2011-01-05 Thread Josh Berkus
All,

One of my coworkers just pointed this out:

"The amount of memory used in shared memory for WAL data. The default is
64 kilobytes (64kB). The setting need only be large enough to hold the
amount of WAL data generated by one typical transaction, since the data
is written out to disk at every transaction commit. This parameter can
only be set at server start."
http://www.postgresql.org/docs/9.0/static/runtime-config-wal.html

That's quite incorrect.  The wal_buffers are shared by all concurrent
transactions, so the setting needs to be sized appropriately for all
*simultaneous* uncommitted transactions; otherwise you'll get
unnecessary flushing.

Certainly the performance testing data posted on this list and -hackers
bears that out.  My suggestion instead:

"The amount of shared memory dedicated to buffering writes to the WAL.
The default is 64 kilobytes (64kB), which is low for a busy production
server.  Users who have high write concurrency, or transactions which
commit individual large data writes, will want to increase it to between
1MB and 16MB. This parameter can only be set at server start."
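
For concreteness, a sketch of what that might look like in
postgresql.conf; the 8MB is illustrative, not a recommendation from the
docs:

wal_buffers = 8MB   # up from the 64kB default; requires a server restart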

-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com



Re: [PERFORM] Wrong docs on wal_buffers?

2011-01-05 Thread Jeff Janes
On Wed, Jan 5, 2011 at 12:43 PM, Josh Berkus  wrote:
> All,
>
> One of my coworkers just pointed this out:
>
> "The amount of memory used in shared memory for WAL data. The default is
> 64 kilobytes (64kB). The setting need only be large enough to hold the
> amount of WAL data generated by one typical transaction, since the data
> is written out to disk at every transaction commit. This parameter can
> only be set at server start."
> http://www.postgresql.org/docs/9.0/static/runtime-config-wal.html
>
> That's quite incorrect.  The wal_buffers are shared by all concurrent
> transactions, so the setting needs to be sized appropriately for all
> *simultaneous* uncommitted transactions; otherwise you'll get
> unnecessary flushing.

I'd thought the same thing in the past.  But after thinking it through
further, I decided otherwise.

On a highly concurrent system, transaction commits are constantly and
unavoidably writing and flushing other transactions' WAL.

If the transactions are well spread out, each of N concurrent
homogeneous transactions only has 1/N of its total WAL in shared
buffers at any one time, so the total does come out to about 1/N * N =
1 typical transaction size.  Throw in stochastic departures from
uniform distribution, and it would be somewhat higher, but not N.
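
To put hypothetical numbers on that: with N = 10 concurrent
transactions each generating 640kB of WAL, each has only ~64kB still
buffered at any instant, so the total buffered is about 10 * 64kB =
640kB, one transaction's worth rather than ten.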

Only if all the transactions move through the system in lock-step would
you need N times the typical size for one transaction.  pgbench can
create this condition (see the sketch below), but I don't know how
likely it is for real-world workloads to do so.  Maybe it is common
there as well?
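
For instance, a run along these lines (numbers arbitrary) drives many
identical clients that tend to commit in near lock-step:

    pgbench -i -s 10 testdb          # initialize at scale factor 10
    pgbench -c 32 -j 4 -T 60 testdb  # 32 clients, 4 threads, 60 seconds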

But my bigger objection to the original wording is that it is very
hard to know how much WAL a typical transaction generates, especially
under full_page_writes.
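
One rough way to estimate it anyway, sketched assuming an otherwise
idle server and a stand-in table:

SELECT pg_current_xlog_insert_location();  -- note the value
BEGIN;
UPDATE accounts SET balance = balance + 1 WHERE id = 42;  -- your workload here
COMMIT;
SELECT pg_current_xlog_insert_location();  -- compare with the first value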


And the risks are rather asymmetric.  I don't know of any problem from
too large a buffer until it starts crowding out shared_buffers, while
under-sizing leads to the rather drastic performance consequences of
AdvanceXLInsertBuffer having to wait on the WALWriteLock while holding
the WALInsertLock.


>
> Certainly the performance testing data posted on this list and -hackers
> bears that out.  My suggestion instead:
>
> "The amount of shared memory dedicated to buffering writes to the WAL.
> The default is 64 kilobytes (64kB), which is low for a busy production
> server.  Users who have high write concurrency, or transactions which
> commit individual large data writes, will want to increase it to between
> 1MB and 16MB. This parameter can only be set at server start."

I like this wording.

But I wonder if initdb.c, when selecting the default shared_buffers,
shouldn't test with wal_buffers = shared_buffers/64 or
shared_buffers/128, with a lower limit of 8 blocks, and set that as
the default.
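
To make that arithmetic concrete (hypothetical values, not actual
initdb behavior):

    wal_buffers = shared_buffers / 64, with a floor of 8 blocks (8 x 8kB = 64kB)
    shared_buffers = 32MB  ->  wal_buffers = 512kB
    shared_buffers =  2MB  ->  wal_buffers = 32kB, raised to the 64kB floor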

Cheers,

Jeff



[PERFORM] plan question - query with order by and limit not choosing index depends on size of limit, table

2011-01-05 Thread Mike Broers
Hello performance, I need help explaining the performance of a particular
query:

select * from messages where ((messages.topic = E'/x') AND
(messages.processed = 'f'))  ORDER BY messages.created_at ASC limit 10;


Table Structure:

   Column   |            Type             |                             Modifiers
------------+-----------------------------+---------------------------------------------------------------------
 id         | integer                     | not null default nextval('landing_page.messages_id_seq'::regclass)
 processed  | boolean                     |
 topic      | character varying(255)      |
 body       | text                        |
 created_at | timestamp without time zone |
 updated_at | timestamp without time zone |
Indexes:
    "messages_pkey" PRIMARY KEY, btree (id)
    "idx_landing_page_messages_created_at" btree (created_at)
    "idx_messages_topic_processed" btree (topic, processed)


Table row count: ~1 million

When I run the query with limit 10 it skips
idx_messages_topic_processed.
When I run the query with no limit, or with a limit above 20, it uses
the desired index.
On a different system with a much smaller data set (~200,000 rows) I
have to use a limit of about 35 to get the desired index.

This is the good plan, with no limit or with the 'sweet spot' limit:

 Limit  (cost=2050.29..2050.38 rows=35 width=1266)
   ->  Sort  (cost=2050.29..2052.13 rows=737 width=1266)
         Sort Key: created_at
         ->  Bitmap Heap Scan on messages  (cost=25.86..2027.70 rows=737 width=1266)
               Recheck Cond: ((topic)::text = '/x'::text)
               Filter: (NOT processed)
               ->  Bitmap Index Scan on idx_messages_topic_processed  (cost=0.00..25.68 rows=737 width=0)
                     Index Cond: (((topic)::text = '/x'::text) AND (processed = false))

This is the bad plan, with limit 10:

 Limit  (cost=0.00..1844.07 rows=30 width=1266)
   ->  Index Scan using idx_landing_page_messages_created_at on messages  (cost=0.00..45302.70 rows=737 width=1266)
         Filter: ((NOT processed) AND ((topic)::text = '/x'::text))


Not sure if cost has anything to do with it, but this is set in
postgresql.conf.  I am hesitant to change it, since I inherited the
database from a previous DBA and don't want to adversely affect
whatever caused it to be set to a non-default value.

#seq_page_cost = 1.0 # measured on an arbitrary scale
random_page_cost = 3.0 # same scale as above



Why does the smaller limit cause it to skip the index?
Is there a way to help the planner choose the better plan?

Much appreciated,
Mike


Re: [PERFORM] Wrong docs on wal_buffers?

2011-01-05 Thread Pierre C



> And the risks are rather asymmetric.  I don't know of any problem from
> too large a buffer until it starts crowding out shared_buffers, while
> under-sizing leads to the rather drastic performance consequences of
> AdvanceXLInsertBuffer having to wait on the WALWriteLock while holding
> the WALInsertLock.


Suppose you have a large update which generates lots of WAL: some WAL
segment switching will take place, and therefore some fsync()s.  If
wal_buffers is small enough that it fills up during the time it takes
to fsync() the previous WAL segment, isn't there a risk that all WAL
writes stop, waiting for that fsync() to complete?




Re: [PERFORM] plan question - query with order by and limit not choosing index depends on size of limit, table

2011-01-05 Thread Kevin Grittner
Mike Broers  wrote:
 
> Hello performance, I need help explaining the performance of a
> particular query
 
You provided some of the information needed, but you should review
this page and post a bit more:
 
http://wiki.postgresql.org/wiki/SlowQueryQuestions
 
In particular, post the result of EXPLAIN ANALYZE, not just EXPLAIN.
Also, showing all overrides in your postgresql.conf file is
important, and some information about your hardware.  How big is the
active portion of your database (the frequently read portion)?
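
For example, using the query from the original post:

EXPLAIN ANALYZE
SELECT * FROM messages
 WHERE messages.topic = E'/x' AND messages.processed = 'f'
 ORDER BY messages.created_at ASC LIMIT 10;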
 
> Why does the smaller limit cause it to skip the index?
 
Because the optimizer thinks the query will return rows sooner that
way.
 
> Is there a way to help the planner choose the better plan?
 
You might get there by adjusting your memory settings and/or costing
settings, but we need to see more information to know that.
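
For instance, a session-local and fully reversible experiment (the 2.0
is just an illustrative value):

SET random_page_cost = 2.0;  -- your postgresql.conf has 3.0
EXPLAIN ANALYZE SELECT ...;  -- re-run the query and compare the plans
RESET random_page_cost;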
 
-Kevin
