[PERFORM] query using incorrect index

2012-08-02 Thread Russell Keane

Using PG 9.0, and given two queries (messageq_current is a view on
messageq_table):

select entity_id from messageq_current
where entity_id = 123456;

select entity_id from messageq_current
where incoming = true
and inactive = false
and staff_ty = 2
and staff_id = 2
order by entity_id desc
limit 1;

and two indexes (there are 15 indexes in total, but the rest are omitted here
for brevity):

messageq1:
CREATE INDEX messageq1
  ON messageq_table
  USING btree
  (entity_id);

And messageq4:

CREATE INDEX messageq4
  ON messageq_table
  USING btree
  (inactive, staff_ty, staff_id, incoming, tran_dt);

With the messageq1 index present, query 1 is very fast (0.094 ms) but query 2
is very slow (241.515 ms).
If I remove messageq1, query 2 uses messageq4 and is very fast (0.098 ms), but
query 1 must then use a different index and is slower (> 5 ms).
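One idea I considered for sidestepping the choice (just a sketch; the index name is made up, and I'm ignoring the other 13 indexes) is an index that matches query 2's equality columns and also supplies entity_id in sorted order, so a backward scan could satisfy the ORDER BY ... LIMIT 1 directly:

```sql
-- Hypothetical index: equality columns first, then the ORDER BY column,
-- so "ORDER BY entity_id DESC LIMIT 1" can be answered from the index
-- without scanning messageq1 backwards through non-matching rows.
CREATE INDEX messageq_staff_entity
  ON messageq_table
  USING btree
  (inactive, staff_ty, staff_id, incoming, entity_id);
```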

So, on to the query plans.
With messageq1:
Limit  (cost=0.00..2670.50 rows=1 width=4) (actual time=241.481..241.481 rows=0 loops=1)
  Output: messageq_table.entity_id
  Buffers: shared hit=32 read=18870 written=12
  ->  Index Scan Backward using messageq1 on prac_live_10112.messageq_table  (cost=0.00..66762.53 rows=25 width=4) (actual time=241.479..241.479 rows=0 loops=1)
        Output: messageq_table.entity_id
        Filter: (messageq_table.incoming AND (NOT messageq_table.inactive) AND (messageq_table.staff_ty = 2) AND (messageq_table.staff_id = 2) AND (aud_status_to_flag(messageq_table.aud_status) = 1))
        Buffers: shared hit=32 read=18870 written=12
Total runtime: 241.515 ms

Without messageq1:
Limit  (cost=12534.45..12534.45 rows=1 width=4) (actual time=0.055..0.055 rows=0 loops=1)
  Output: messageq_table.entity_id
  Buffers: shared read=3
  ->  Sort  (cost=12534.45..12534.51 rows=25 width=4) (actual time=0.054..0.054 rows=0 loops=1)
        Output: messageq_table.entity_id
        Sort Key: messageq_table.entity_id
        Sort Method:  quicksort  Memory: 17kB
        ->  Bitmap Heap Scan on prac_live_10112.messageq_table  (cost=174.09..12534.32 rows=25 width=4) (actual time=0.043..0.043 rows=0 loops=1)
              Output: messageq_table.entity_id
              Recheck Cond: ((messageq_table.staff_ty = 2) AND (messageq_table.staff_id = 2))
              Filter: (messageq_table.incoming AND (NOT messageq_table.inactive) AND (aud_status_to_flag(messageq_table.aud_status) = 1))
              Buffers: shared read=3
              ->  Bitmap Index Scan on messageq4  (cost=0.00..174.08 rows=4920 width=0) (actual time=0.040..0.040 rows=0 loops=1)
                    Index Cond: ((messageq_table.inactive = false) AND (messageq_table.staff_ty = 2) AND (messageq_table.staff_id = 2) AND (messageq_table.incoming = true))
                    Buffers: shared read=3
Total runtime: 0.098 ms

Clearly the statistics are off somehow, but I really don't know where to start.

Any help you can give me would be very much appreciated.

Regards,


Russell Keane

INPS



[PERFORM] [repost] Help me develop new commit_delay advice

2012-08-02 Thread Peter Geoghegan
This has been reposted to this list from the pgsql-hackers list, at
the request of Josh Berkus. Hopefully there will be more interest
here.

-- Forwarded message --
From: Peter Geoghegan 
Date: 29 July 2012 16:39
Subject: Help me develop new commit_delay advice
To: PG Hackers 


Many of you will be aware that the behaviour of commit_delay was
recently changed. Now, the delay only occurs within the group commit
leader backend, and not within each and every backend committing a
transaction:

http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=f11e8be3e812cdbbc139c1b4e49141378b118dee

For those of you who didn't follow this development, I wrote a blog post
describing the idea, which serves as a useful summary:

http://pgeoghegan.blogspot.com/2012/06/towards-14000-write-transactions-on-my.html

I made what may turn out to be a useful observation during the
development of the patch: for both the tpc-b.sql and insert.sql
pgbench-tools scripts, a commit_delay of half my wal_sync_method's
raw sync time looked optimal. I use Linux, so my wal_sync_method
happened to be fdatasync. I measured the sync time using
pg_test_fsync.

The devel docs still say of commit_delay and commit siblings: "Good
values for these parameters are not yet clear; experimentation is
encouraged". This has been the case since Postgres 7.1 (i.e. it has
never been clear what good values were - the folk wisdom was actually
that commit_delay should always be set to 0). I hope to be able to
formulate some folk wisdom about setting commit_delay from 9.3 on,
that may go on to be accepted as an official recommendation within the
docs.

I am rather curious as to what experimentation shows optimal values
for commit_delay to be for a representative cross-section of hardware.
In particular, I'd like to see if setting commit_delay to half of raw
sync time appears to be optimal for both insert.sql and tpc-b.sql
workloads across different types of hardware with different sync
times. Now, it may be sort of questionable to take those workloads as
general proxies for performance, not least since they will literally
give Postgres as many *completely uniform* transactions as it can
handle. However, it is hard to think of another, better general proxy
for performance that is likely to be accepted as such and that will
allow us to reason about setting commit_delay.

While I am not completely confident that we can formulate a widely
useful, simple piece of advice, I am encouraged by the fact that a
commit_delay of 4,000 worked very well for both tpc-b.sql and
insert.sql workloads on my laptop, beating out settings of 3,000 and
5,000 on each benchmark. I am also encouraged by the fact that in some
cases, including both the insert.sql and tpc-b.sql cases that I've
already described elsewhere, there is actually no downside to setting
commit_delay - transaction throughput naturally improves, but
transaction latency is actually improved a bit too (or at least the
average and worst-cases). This is presumably due to the amelioration
of resource contention (from greater commit batching) more than
compensating for the obvious downside of adding a delay.

It would be useful, for a start, if I had numbers for a battery-backed
write cache. I don't have access to one right now though, nor do I
have access to any more interesting hardware, which is one reason why
I'm asking for help with this.

I like to run "sync" prior to running pg_test_fsync, just in case.

[peter@peterlaptop pg_test_fsync]$ sync

I then interpret the following output:

[peter@peterlaptop pg_test_fsync]$ pg_test_fsync
2 seconds per test
O_DIRECT supported on this platform for open_datasync and open_sync.

Compare file sync methods using one 8kB write:
(in wal_sync_method preference order, except fdatasync
is Linux's default)
open_datasync 112.940 ops/sec
fdatasync 114.035 ops/sec
fsync 21.291 ops/sec
*** SNIP ***

So if I can perform 114.035 8KiB sync operations per second, that's an
average of about 1 per 8.77 milliseconds, or 8770 microseconds to put
it in the units that commit_delay speaks. It is my hope that we will
find that when this number is halved, we will arrive at a figure that
is worth recommending as a general useful setting for commit_delay for
the system. I guess I could gain some additional insight by simply
changing my wal_sync_method, but I'd find it more interesting to look
at organic setups with faster (not slower) sync times than my system's
fdatasync. For those who are able to help me here, I'd like to see
pgbench-tools workloads for both tpc-b.sql and insert.sql with
incrementing values of commit_delay (increments of, say, 1000
microseconds, perhaps with less granularity where it isn't needed),
from 0 up to about 1.5 times the raw sync time in microseconds.
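The arithmetic above can be sketched in a few lines (the function name is mine, and 114.035 ops/sec is just my laptop's fdatasync rate, not a recommendation):

```python
def suggested_commit_delay(sync_ops_per_sec: float) -> int:
    """Half of the raw sync time, expressed in microseconds
    (the unit commit_delay uses)."""
    sync_time_us = 1_000_000 / sync_ops_per_sec  # one sync, in microseconds
    return round(sync_time_us / 2)

# fdatasync at 114.035 ops/sec -> ~8770 us per sync -> try ~4385 us
print(suggested_commit_delay(114.035))
```

On my numbers this lands close to the commit_delay of 4,000 that tested well, which is what makes the half-of-sync-time heuristic look plausible.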

Thanks
-- 
Peter Geoghegan   http://www.2ndQuadrant.com/

Re: [PERFORM] query using incorrect index

2012-08-02 Thread Kevin Grittner
Russell Keane  wrote:
 
> Clearly the statistics are off somehow but I really don't know
> where to start.
> 
> Any help you can give me would be very much appreciated.
 
It would help to know more about your hardware and PostgreSQL
configuration.  The latter can probably best be communicated by
copy/paste of the results of the query on this page:
 
http://wiki.postgresql.org/wiki/Server_Configuration
 
Can you also post the EXPLAIN ANALYZE output for the slow query with
both indexes present but without the LIMIT clause?
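That is, something along these lines (BUFFERS and VERBOSE are available on 9.0):

```sql
EXPLAIN (ANALYZE, BUFFERS, VERBOSE)
select entity_id from messageq_current
where incoming = true
  and inactive = false
  and staff_ty = 2
  and staff_id = 2
order by entity_id desc;
```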
 
-Kevin

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance