[PERFORM] Dump performance problems following server crash

2006-12-02 Thread Kim

Hello Performance,

Yesterday, with help from the admin list, we took care of a problem we 
were having with creating tables/views/indexes following a server 
overheat & crash (an index on pg_attribute was corrupted, causing the 
create to hang and send memory usage through the roof until it failed 
out - a reindex fixed it like a charm). Following the repair of creates 
(we also took the system into single-user mode to run reindex system), 
and with no other known problems with the db itself, we immediately 
began a dump of the database.


Typical dump time: ~12 hours. We dump overnight, when there is still 
decently heavy activity; this dump, however, has the box to itself, and 
after 10 hours we are only about 20% done just with pulling schema for 
the indexes - pulling all schema typically completes in 4-6 hours. Load 
on the machine is minimal, as is memory usage by the dump process itself 
(864M, not large for this system). It is definitely moving, but just 
very slowly. At this point we are attempting to determine whether this 
is a machine-level problem (which we haven't seen any sign of yet) or 
still a potential problem in postgres. The dump is currently doing I/O 
at 10mbps, but in testing our sysadmin reports he has no problem getting 
stronger I/O numbers from other processes.


The current dump query running:
SELECT t.tableoid, t.oid, t.relname AS indexname,
       pg_catalog.pg_get_indexdef(i.indexrelid) AS indexdef,
       t.relnatts AS indnkeys, i.indkey, i.indisclustered,
       c.contype, c.conname,
       c.tableoid AS contableoid, c.oid AS conoid,
       (SELECT spcname FROM pg_catalog.pg_tablespace s
         WHERE s.oid = t.reltablespace) AS tablespace
FROM pg_catalog.pg_index i
JOIN pg_catalog.pg_class t ON (t.oid = i.indexrelid)
LEFT JOIN pg_catalog.pg_depend d
  ON (d.classid = t.tableoid AND d.objid = t.oid AND d.deptype = 'i')
LEFT JOIN pg_catalog.pg_constraint c
  ON (d.refclassid = c.tableoid AND d.refobjid = c.oid)
WHERE i.indrelid = '44240'::pg_catalog.oid
ORDER BY indexname


Amount of time it took me to run the query from console: ~5secs (I'm 
counting in my head, sophisticated, eh?)
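(For the archives: psql has a built-in timer that beats counting in 
one's head -

  \timing
  -- toggles timing display; re-running the SELECT above then reports
  -- the elapsed time
)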


We tend to respond to slow queries with vacs and analyzes, but 
considering these are system tables that have recently been reindexed, 
how likely is it that a vac would improve things? At this point we plan 
to halt the dump and run a vacuum full on the db, but if you have any 
ideas as to why the dump is sluggish in getting this information, I'd 
appreciate them.
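(Concretely, a sketch of the narrower first step we are weighing, 
limited to the catalogs the dump query above touches rather than a 
vacuum full of the whole db:

  VACUUM ANALYZE pg_catalog.pg_class;
  VACUUM ANALYZE pg_catalog.pg_index;
  VACUUM ANALYZE pg_catalog.pg_depend;
  VACUUM ANALYZE pg_catalog.pg_constraint;
)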


spec info -
Postgres 8.1.4
db size: 200+ GB
101,745 tables
314,821 indexes
1,569 views
maintenance_work_mem = 262144

Server:
OS: Solaris 10
Sunfire X4100 XL
2x AMD Opteron Model 275 dual core procs
8GB of ram
(this server overheated & crashed due to a cooling problem at the 
hosting service)


On top of a:
Sun Storedge 6130
14x 146GB Drives in a Raid 5
Brocade 200E switches
Emulex 4gb HBAs
(this server had no known problems)

Thanks in advance,
Kim Hatcher <http://www.myemma.com>


Re: [PERFORM] Dump performance problems following server crash

2006-12-02 Thread Kim
We dropped into single user mode and ran reindex system - it was my 
understanding this would reindex them all, including shared catalogs - 
but perhaps not?


Kim

Tom Lane wrote:


Kim <[EMAIL PROTECTED]> writes:

The current dump query running:
SELECT t.tableoid, t.oid, t.relname AS indexname,
       pg_catalog.pg_get_indexdef(i.indexrelid) AS indexdef,
       t.relnatts AS indnkeys, i.indkey, i.indisclustered,
       c.contype, c.conname,
       c.tableoid AS contableoid, c.oid AS conoid,
       (SELECT spcname FROM pg_catalog.pg_tablespace s
         WHERE s.oid = t.reltablespace) AS tablespace
FROM pg_catalog.pg_index i
JOIN pg_catalog.pg_class t ON (t.oid = i.indexrelid)
LEFT JOIN pg_catalog.pg_depend d
  ON (d.classid = t.tableoid AND d.objid = t.oid AND d.deptype = 'i')
LEFT JOIN pg_catalog.pg_constraint c
  ON (d.refclassid = c.tableoid AND d.refobjid = c.oid)
WHERE i.indrelid = '44240'::pg_catalog.oid
ORDER BY indexname

Amount of time it took me to run the query from console: ~5secs (I'm 
counting in my head, sophisticated, eh?)


Even 5 seconds is way too long.  You've apparently still got something
corrupted somewhere.  Did you reindex *all* the system catalogs?

regards, tom lane


--

*kim hatcher*
senior developer, emma®
e: [EMAIL PROTECTED]
p: 800 595 4401
w: www.myemma.com


[PERFORM] unusual performance for vac following 8.2 upgrade

2007-01-11 Thread Kim

Hello all!
Running a vac on an 8.2 client connecting to an 8.2 database (the 
following example was for a 14-row static table) - vacuums will sit (for 
lack of a better word) for anywhere from ten to twenty minutes before 
taking a lock out on the table and doing work there. Once the vacuum 
does complete, I noticed that no timestamp is registered in 
pg_stat_all_tables for that relation's last-vacuum'd timestamp (analyze, 
however, does seem to set its timestamp). I asked it to run a vacuum on 
an index (knowing it would fail out), and again, the vacuum sat for 
several minutes before finally erroring out saying that it couldn't 
vacuum an index. Out of curiosity I tried the vacuum from an 8.1 client 
connected to the 8.2 db: same delay.
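(For reference, this is roughly how we checked for the missing 
timestamp - the table name is a stand-in:

  SELECT relname, last_vacuum, last_analyze
    FROM pg_stat_all_tables
   WHERE relname = 'our_static_table';

last_vacuum stays NULL; last_analyze gets set.)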


In running a truss on the process while it is running, there is over 
five minutes where the process seems to be scanning pg_class (at least 
that's the only table listed in pg_locks for this process). Following 
this it drops into a cycle of issuing the same send() command with 
several seconds of lag between each one, and every so often it catches 
the same interrupt (SIGUSR1) and then goes back into the same cycle of 
send() calls. Also, whatever it is doing during this stage, it isn't 
checking for query-cancel interrupts, as the process won't recognize 
it's been asked to cancel until it breaks out of this cycle of send()s 
and SIGUSR1s (which can go on for another several minutes). I'm happy to 
send along the gore of the truss call if you think it would be helpful...
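(The pg_locks check was along these lines - a sketch against the 8.2 
catalogs, where pg_stat_activity still has procpid and current_query:

  SELECT l.locktype, l.relation::regclass, l.mode, l.granted
    FROM pg_locks l
    JOIN pg_stat_activity a ON a.procpid = l.pid
   WHERE a.current_query LIKE 'VACUUM%';
)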


Any ideas what the vac is prepping for that it could become bogged down 
in before finally taking the lock on the table?


Is the lack of a timestamp set for last_vacuum in pg_stat_all_tables an 
indication that there may be something incomplete about our install?


Since the upgrade, we've also seen unusual lag time in simple inserts 
into tables (atomic inserts have been seen running for several seconds), 
and also extreme delays in running \d on tables (I got tired of counting 
past 2 minutes - connecting with an 8.1 client gives an immediate 
response for this command). We plan to upgrade to 8.2.1 as soon as 
possible, and also to drop into single-user mode and run a reindex 
system, but any suggestions in the meantime as to a potential cause, or 
a way to further debug the vacs, would be greatly appreciated.


OS: Solaris 10
write transactions/hr: 1.5 million
size of pg_class: 535,226
number of relations: 108,694

Thanks to all,

Kim



Re: [PERFORM] unusual performance for vac following 8.2 upgrade

2007-01-11 Thread Kim

Hey Tom,
We were running on 8.1.1 previous to upgrading to 8.2, and yes, we 
definitely have a hefty pg_class. The inheritance model is heavily used 
in our schema (the results of the group-by you wanted to see are down 
below). However, no significant problems were seen with vacs while we 
were on 8.1. Execution time for the vac seemed more linked to large 
table size and how active the table was with updates, rather than being 
universally over 10 minutes regardless of the vac's object. We will be 
doing an audit of our 8.2 install to try to make sure that it looks like 
a complete install - any tests you can think of that may further narrow 
things down for us?




 relkind | count
---------+--------
 v       |   1740
 t       |  49986
 c       |      4
 S       |     57
 r       | 108689
 i       | 374723
(6 rows)



Tom Lane wrote:


Kim <[EMAIL PROTECTED]> writes:

size of pg_class: 535,226
number of relations: 108,694

Oh, I shoulda read all the way to the bottom of your email :-(.  What
version of PG were you running before?  I would think that pretty much
any version of pgstat_vacuum_tabstats would have had a performance issue
with pg_class that large.  Also, could we see

select relkind, count(*) from pg_class group by relkind;

regards, tom lane


Re: [PERFORM] unusual performance for vac following 8.2 upgrade

2007-01-11 Thread Kim
For 8.1, we did have stats_block_level and stats_row_level on, so that's 
not it either :-/ However, I did go on to an alternate database of ours 
on the same machine, using the same install, same postmaster - one that 
holds primarily static relations, and not many of those (16 relations 
total). The response of running a vac on a 1.3k static table was quick 
(6 seconds - but it still did not set the last_vacuum field). Not sure 
why we weren't seeing more probs with this on 8.1 for the full db, but 
from the looks of things I think your theory on the primary problem with 
our vacs is solid. I'm hoping we can fire up our old 8.1 dataset and run 
some tests on there to confirm/reject the idea that it was doing any 
better, but that will require quieter times on the machine than we've 
got right now :)
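(Checked simply with:

  SHOW stats_block_level;
  SHOW stats_row_level;

both reporting "on" under our 8.1 config.)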


We are going to try and upgrade to 8.2.1 as soon as we can, and if we 
continue to see some of the other problems I mentioned as side-notes, 
we'll build some information on those and pass it along...


Thanks so much!

Kim


Tom Lane wrote:


Kim <[EMAIL PROTECTED]> writes:

We were running on 8.1.1 previous to upgrading to 8.2, and yes, we 
definitely have a hefty pg_class. The inheritance model is heavily used 
in our schema (the results of the group-by you wanted to see are down 
below). However, no significant problems were seen with vacs while we 
were on 8.1.

Odd, because the 8.1 code looks about the same, and it is perfectly
obvious in hindsight that its runtime is about O(N^2) in the number of
relations :-(.  At least that'd be the case if the stats collector
output were fully populated.  Did you have either stats_block_level or
stats_row_level turned on in 8.1?  If not, maybe the reason for the
change is that in 8.2, that table *will* be pretty fully populated,
because now it's got a last-vacuum-time entry that gets made even if the
stats are otherwise turned off.  Perhaps making that non-disablable
wasn't such a hot idea :-(.

What I think we need to do about this is

(1) fix pgstat_vacuum_tabstats to have non-O(N^2) behavior; I'm thinking
of using a hash table for the OIDs instead of a linear list.  Should be
a pretty small change; I'll work on it today.

(2) Reconsider whether last-vacuum-time should be sent to the collector
unconditionally.

Comments from hackers?

regards, tom lane


Re: [PERFORM] unusual performance for vac following 8.2 upgrade

2007-01-11 Thread Kim
Our pgstat.stat file is 40M for 8.2; on 8.1 it was 33M. Our schema size 
hasn't grown *that* much in the two weeks since we upgraded.


I'm not sure if this sheds any more light on the situation, but in 
scanning down through the process output from truss, it looks like the 
first section of output was a large chunk of reads on pgstat.stat, 
followed by a larger chunk of reads on the global directory and 
directories under base - this whole section probably went on for a good 
6-7 minutes, though I would say the reads on pgstat.stat likely finished 
within a couple of minutes or so. Following this there was a phase where 
it did a lot of seeks and reads on files under pg_clog, and it was while 
doing this (or perhaps it had finished whatever it wanted with the 
clogs) that it dropped into the send()/SIGUSR1 loop that goes on for 
another several minutes.


Kim


Tom Lane wrote:


I wrote:

(2) Reconsider whether last-vacuum-time should be sent to the collector
unconditionally.

Actually, now that I look, the collector already contains this logic:

	/*
	 * Don't create either the database or table entry if it doesn't already
	 * exist.  This avoids bloating the stats with entries for stuff that is
	 * only touched by vacuum and not by live operations.
	 */

and ditto for analyze messages.  So my idea that the addition of
last-vac-time was causing an increase in the statistics file size
compared to 8.1 seems wrong.

How large is your $PGDATA/global/pgstat.stat file, anyway?

regards, tom lane


Re: [HACKERS] [PERFORM] unusual performance for vac following 8.2 upgrade

2007-01-17 Thread Kim

Hello again Tom,

We have our upgrade to 8.2.1 scheduled for this weekend, and we noticed 
your message regarding the vacuum patch being applied to 8.2 and 
back-patched. I expect I know the answer to this next question :) but I 
was wondering if the patch referenced below has also been bundled into 
the normal source download of 8.2.1, or if we would still need to apply 
it manually?


- Fix a performance problem in databases with large numbers of tables
 (or other types of pg_class entry): the function
 pgstat_vacuum_tabstat, invoked during VACUUM startup, had runtime
 proportional to the number of stats table entries times the number
 of pg_class rows; in other words O(N^2) if the stats collector's
 information is reasonably complete.  Replace list searching with a
 hash table to bring it back to O(N) behavior.  Per report from kim
 at myemma.com.  Back-patch as far as 8.1; 8.0 and before use
 different coding here.

Thanks,
Kim


Tom Lane wrote:


I wrote:

What I think we need to do about this is
(1) fix pgstat_vacuum_tabstats to have non-O(N^2) behavior; I'm thinking
of using a hash table for the OIDs instead of a linear list.  Should be
a pretty small change; I'll work on it today.

I've applied the attached patch to 8.2 to do the above.  Please give it
a try and see how much it helps for you.  Some limited testing here
confirms a noticeable improvement in VACUUM startup time at 10000
tables, and of course it should be 100X worse with 100000 tables.

I am still confused why you didn't see the problem in 8.1, though.
This code is just about exactly the same in 8.1.  Maybe you changed
your stats collector settings when moving to 8.2?

regards, tom lane


Index: pgstat.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/postmaster/pgstat.c,v
retrieving revision 1.140
diff -c -r1.140 pgstat.c
*** pgstat.c21 Nov 2006 20:59:52 -  1.140
--- pgstat.c11 Jan 2007 22:32:30 -
***
*** 159,164 
--- 159,165 
 static void pgstat_read_statsfile(HTAB **dbhash, Oid onlydb);
 static void backend_read_statsfile(void);
 static void pgstat_read_current_status(void);
+ static HTAB *pgstat_collect_oids(Oid catalogid);
 
 static void pgstat_setheader(PgStat_MsgHdr *hdr, StatMsgType mtype);

 static void pgstat_send(void *msg, int len);
***
*** 657,666 
 void
 pgstat_vacuum_tabstat(void)
 {
!   List   *oidlist;
!   Relationrel;
!   HeapScanDesc scan;
!   HeapTuple   tup;
PgStat_MsgTabpurge msg;
HASH_SEQ_STATUS hstat;
PgStat_StatDBEntry *dbentry;
--- 658,664 
 void
 pgstat_vacuum_tabstat(void)
 {
!   HTAB   *htab;
PgStat_MsgTabpurge msg;
HASH_SEQ_STATUS hstat;
PgStat_StatDBEntry *dbentry;
***
*** 679,693 
/*
 * Read pg_database and make a list of OIDs of all existing databases
 */
!   oidlist = NIL;
!   rel = heap_open(DatabaseRelationId, AccessShareLock);
!   scan = heap_beginscan(rel, SnapshotNow, 0, NULL);
!   while ((tup = heap_getnext(scan, ForwardScanDirection)) != NULL)
!   {
!   oidlist = lappend_oid(oidlist, HeapTupleGetOid(tup));
!   }
!   heap_endscan(scan);
!   heap_close(rel, AccessShareLock);
 
 	/*

 * Search the database hash table for dead databases and tell the
--- 677,683 
/*
 * Read pg_database and make a list of OIDs of all existing databases
 */
!   htab = pgstat_collect_oids(DatabaseRelationId);
 
 	/*

 * Search the database hash table for dead databases and tell the
***
*** 698,709 
{
Oid dbid = dbentry->databaseid;
 
! 		if (!list_member_oid(oidlist, dbid))

pgstat_drop_database(dbid);
}
 
 	/* Clean up */

!   list_free(oidlist);
 
 	/*

 * Lookup our own database entry; if not found, nothing more to do.
--- 688,701 
{
Oid dbid = dbentry->databaseid;
 
! 		CHECK_FOR_INTERRUPTS();
! 
! 		if (hash_search(htab, (void *) &dbid, HASH_FIND, NULL) == NULL)

pgstat_drop_database(dbid);
}
 
 	/* Clean up */

!   hash_destroy(htab);
 
 	/*

 * Lookup our own database entry; if not found, nothing more to do.
***
*** 717,731 
/*
 * Similarly to above, make a list of all known relations in this DB.
 */
!   oidlist = NIL;
!   rel = heap_open(RelationRelationId, AccessShareLock);
!   scan = heap_beginscan(rel, SnapshotNow, 0, NULL);
!   while ((tup = heap_getnext(scan, ForwardScanDirection)) != NULL)
!   {
!   oidlist = lappend_oid(oidlist, HeapTupleGetOid(tup));

[PERFORM] full outer performance problem

2005-05-10 Thread Kim Bisgaard




Hi,

I'm having problems with the query optimizer and FULL OUTER JOIN on
PostgreSQL 7.4. I cannot get it to use my indexes with full outer
joins. I might be naive, but I think that it should be possible?

I have two BIG tables (virtually identical) with 3 NOT NULL columns
Station_id, TimeObs, Temp_, with indexes on (Station_id, TimeObs)
and valid ANALYSE (set statistics=100). I want to join the two tables
with a FULL OUTER JOIN.

When I specify the query as:
select temp_max_60min,temp_dry_at_2m
from station s natural join
temp_dry_at_2m a full outer join temp_max_60min b using (station_id, timeobs)
where s.wmo_id=6065 
and timeobs='2004-1-1 0:0:0'
and '2004-1-1 0:0:0' between s.startdate and s.enddate;
I get the correct results, BUT LOUSY performance, and the following
explain:
 Nested Loop Left Join  (cost=5.84..163484.08 rows=1349 width=12) (actual time=66146.815..119005.381 rows=1 loops=1)
   Filter: (COALESCE("outer".timeobs, "inner".timeobs) = '2004-01-01 00:00:00'::timestamp without time zone)
   ->  Hash Join  (cost=5.84..155420.24 rows=1349 width=16) (actual time=8644.449..110836.038 rows=109826 loops=1)
 Hash Cond: ("outer".station_id = "inner".station_id)
 ->  Seq Scan on temp_dry_at_2m a  (cost=0.00..120615.94 rows=6956994 width=16) (actual time=0.024..104548.515 rows=6956994 loops=1)
 ->  Hash  (cost=5.84..5.84 rows=1 width=4) (actual time=0.114..0.114 rows=0 loops=1)
   ->  Index Scan using wmo_idx on station  (cost=0.00..5.84 rows=1 width=4) (actual time=0.105..0.108 rows=1 loops=1)
 Index Cond: ((wmo_id = 6065) AND ('2004-01-01 00:00:00'::timestamp without time zone >= startdate) AND ('2004-01-01 00:00:00'::timestamp without time zone <= enddate))
   ->  Index Scan using temp_max_60min_idx on temp_max_60min b  (cost=0.00..5.96 rows=1 width=20) (actual time=0.071..0.071 rows=0 loops=109826)
 Index Cond: (("outer".station_id = b.station_id) AND ("outer".timeobs = b.timeobs))
 Total runtime: 119005.499 ms
(11 rows)

If I change the query to (and thus negates the full outer join):
select temp_max_60min,temp_dry_at_2m
from station s natural join
temp_dry_at_2m a full outer join temp_max_60min b using (station_id, timeobs)
where s.wmo_id=6065 
and a.timeobs='2004-1-1 0:0:0' and b.timeobs='2004-1-1 0:0:0' 
and '2004-1-1 0:0:0' between s.startdate and s.enddate;


I get wrong results (In the case where one of the records is missing in
one of the tables), BUT GOOD performance, and this query plan:
 Nested Loop  (cost=0.00..17.83 rows=1 width=12) (actual time=79.221..79.236 rows=1 loops=1)
   ->  Nested Loop  (cost=0.00..11.82 rows=1 width=24) (actual time=65.517..65.526 rows=1 loops=1)
 ->  Index Scan using wmo_idx on station  (cost=0.00..5.83 rows=1 width=4) (actual time=0.022..0.026 rows=1 loops=1)
   Index Cond: ((wmo_id = 6065) AND ('2004-01-01 00:00:00'::timestamp without time zone >= startdate) AND ('2004-01-01 00:00:00'::timestamp without time zone <= enddate))
 ->  Index Scan using temp_max_60min_idx on temp_max_60min b  (cost=0.00..5.97 rows=1 width=20) (actual time=65.483..65.486 rows=1 loops=1)
   Index Cond: (("outer".station_id = b.station_id) AND (b.timeobs = '2004-01-01 00:00:00'::timestamp without time zone))
   ->  Index Scan using temp_dry_at_2m_idx on temp_dry_at_2m a  (cost=0.00..6.00 rows=1 width=16) (actual time=13.694..13.698 rows=1 loops=1)
 Index Cond: (("outer".station_id = a.station_id) AND (a.timeobs = '2004-01-01 00:00:00'::timestamp without time zone))
 Total runtime: 79.340 ms
(9 rows)


If further info like EXPLAIN VERBOSE is useful please say so and I will
provide it.

Thanks in advance!
Kim Bisgaard.





Re: [PERFORM] full outer performance problem

2005-05-15 Thread Kim Bisgaard
Hi,
Look for my comments further down...
John A Meinel wrote:
Kim Bisgaard wrote:
Hi,
I'm having problems with the query optimizer and FULL OUTER JOIN on
PostgreSQL 7.4. I cannot get it to use my indexes with full outer joins.
I might be naive, but I think that it should be possible?
I have two BIG tables (virtually identical) with 3 NOT NULL columns
Station_id, TimeObs, Temp_, with indexes on (Station_id, TimeObs)
and valid ANALYSE (set statistics=100). I want to join the two tables
with a FULL OUTER JOIN.
When I specify the query as:
select temp_max_60min,temp_dry_at_2m
from station s natural join
temp_dry_at_2m a full outer join temp_max_60min b using (station_id, 
timeobs)
where s.wmo_id=6065
and timeobs='2004-1-1 0:0:0'
and '2004-1-1 0:0:0' between s.startdate and s.enddate;

I get the correct results, BUT LOUSY performance, and the following 
explain:

 Nested Loop Left Join  (cost=5.84..163484.08 rows=1349 width=12) 
(actual time=66146.815..119005.381 rows=1 loops=1)
   Filter: (COALESCE("outer".timeobs, "inner".timeobs) = '2004-01-01 
00:00:00'::timestamp without time zone)
   ->  Hash Join  (cost=5.84..155420.24 rows=1349 width=16) (actual 
time=8644.449..110836.038 rows=109826 loops=1)

Well, the estimate here is quite a bit off. It thinks you will be
getting 1349 (which is probably why it picked a nested loop plan), but
then it is getting 109826 rows.
I'm guessing it is misunderstanding the selectivity of the timeobs 
column.
I think you are right...

 Hash Cond: ("outer".station_id = "inner".station_id)
 ->  Seq Scan on temp_dry_at_2m a  (cost=0.00..120615.94 
rows=6956994 width=16) (actual time=0.024..104548.515 rows=6956994 
loops=1)
 ->  Hash  (cost=5.84..5.84 rows=1 width=4) (actual 
time=0.114..0.114 rows=0 loops=1)
   ->  Index Scan using wmo_idx on station  
(cost=0.00..5.84 rows=1 width=4) (actual time=0.105..0.108 rows=1 
loops=1)
 Index Cond: ((wmo_id = 6065) AND ('2004-01-01 
00:00:00'::timestamp without time zone >= startdate) AND ('2004-01-01 
00:00:00'::timestamp without time zone <= enddate))
   ->  Index Scan using temp_max_60min_idx on temp_max_60min b  
(cost=0.00..5.96 rows=1 width=20) (actual time=0.071..0.071 rows=0 
loops=109826)
 Index Cond: (("outer".station_id = b.station_id) AND 
("outer".timeobs = b.timeobs))
 Total runtime: 119005.499 ms
(11 rows)

I think the bigger problem is that a full outer join says grab all rows,
even if they are null.
What about this query:
SELECT temp_max_60min,temp_dry_at_2m
  FROM (station s LEFT JOIN temp_dry_at_2m a USING (station_id, timeobs)
   LEFT JOIN temp_max_60min b USING (station_id, timeobs)
where s.wmo_id=6065
and timeobs='2004-1-1 0:0:0'
and '2004-1-1 0:0:0' between s.startdate and s.enddate;
This works very well, and gives the correct result - thanks!!
After that, you should probably have a multi-column index on
(station_id, timeobs), which lets postgres use just that index for the
lookup, rather than using an index and then a filter. (Looking at your
next query you might already have that index).
Yes I have.

If I change the query to (and thus negates the full outer join):

This is the same query, I think you messed up your copy and paste.
Nope. Changed "and timeobs='2004-1-1 0:0:0' " to "and 
a.timeobs='2004-1-1 0:0:0' and b.timeobs='2004-1-1 0:0:0' "


select temp_max_60min,temp_dry_at_2m
from station s natural join
temp_dry_at_2m a full outer join temp_max_60min b using (station_id, 
timeobs)
where s.wmo_id=6065
and a.timeobs='2004-1-1 0:0:0' and b.timeobs='2004-1-1 0:0:0'
and '2004-1-1 0:0:0' between s.startdate and s.enddate;

I get wrong results (In the case where one of the records is missing in
one of the tables), BUT GOOD performance, and this query plan:
 Nested Loop  (cost=0.00..17.83 rows=1 width=12) (actual 
time=79.221..79.236 rows=1 loops=1)
   ->  Nested Loop  (cost=0.00..11.82 rows=1 width=24) (actual 
time=65.517..65.526 rows=1 loops=1)
 ->  Index Scan using wmo_idx on station  (cost=0.00..5.83 
rows=1 width=4) (actual time=0.022..0.026 rows=1 loops=1)
   Index Cond: ((wmo_id = 6065) AND ('2004-01-01 
00:00:00'::timestamp without time zone >= startdate) AND ('2004-01-01 
00:00:00'::timestamp without time zone <= enddate))
 ->  Index Scan using temp_max_60min_idx on temp_max_60min b  
(cost=0.00..5.97 rows=1 width=20) (actual time=65.483..65.486 rows=1 
loops=1)
   Index Cond: (("outer".station_id = b.station_id) AND 
(b.timeobs = '2004-01-01 00:00:00'::timestamp without time zone))
   ->  Index Scan using temp_dry_at_2m_idx on temp_dry_at_2m a  
(cost=0.00..6.00 rows=1 width=16) (actual time=13.6

Re: [PERFORM] full outer performance problem

2005-05-15 Thread Kim Bisgaard




Sorry for not listing the exact layout of temp_:
obsdb=> \d temp_dry_at_2m
         Table "public.temp_dry_at_2m"
     Column     |            Type             | Modifiers
----------------+-----------------------------+-----------
 obshist_id     | integer                     | not null
 station_id     | integer                     | not null
 timeobs        | timestamp without time zone | not null
 temp_dry_at_2m | real                        | not null
Indexes:
    "temp_dry_at_2m_pkey" primary key, btree (obshist_id)
    "temp_dry_at_2m_idx" btree (station_id, timeobs)

The difference between the two queries is that if a (station_id, timeobs)
row is missing in one table, the first returns one record (null, 9.3)
while the second returns no records.

Regards,
Kim Bisgaard.

Tom Lane wrote:

Kim Bisgaard <[EMAIL PROTECTED]> writes:

I have two BIG tables (virtually identical) with 3 NOT NULL columns 
Station_id, TimeObs, Temp_, with indexes on (Station_id, TimeObs) 
and valid ANALYSE (set statistics=100). I want to join the two tables 
with a FULL OUTER JOIN.

I'm confused.  If the columns are NOT NULL, why isn't this a valid
transformation of your original query?

select temp_max_60min,temp_dry_at_2m
from station s natural join
temp_dry_at_2m a full outer join temp_max_60min b using (station_id, timeobs)
where s.wmo_id=6065
and a.timeobs='2004-1-1 0:0:0' and b.timeobs='2004-1-1 0:0:0'
and '2004-1-1 0:0:0' between s.startdate and s.enddate;

Seems like it's not eliminating any rows that would otherwise succeed.

			regards, tom lane



-- 
Kim Bisgaard

Computer Department  Phone: +45 3915 7562 (direct)
Danish Meteorological Institute  Fax: +45 3915 7460 (division)





[PERFORM] full outer performance problem

2005-06-08 Thread Kim Bisgaard

Hi,

I'm having problems with the query optimizer and FULL OUTER JOIN on 
PostgreSQL 7.4. I cannot get it to use my indexes with full outer joins. 
I might be naive, but I think that it should be possible?


I have two BIG tables (virtually identical) with 3 NOT NULL columns 
Station_id, TimeObs, Temp_, with unique indexes on (Station_id, 
TimeObs) and valid ANALYSE (set statistics=100). I want to join the two 
tables with a FULL OUTER JOIN.


When I specify the query as:

SELECT station_id, timeobs,temp_grass, temp_dry_at_2m
   FROM temp_dry_at_2m a
   FULL OUTER JOIN temp_grass b 
   USING (station_id, timeobs)

   WHERE station_id = 52981
 AND timeobs = '2004-1-1 0:0:0'

I get the correct results

 station_id |       timeobs       | temp_grass | temp_dry_at_2m
------------+---------------------+------------+----------------
      52944 | 2004-01-01 00:10:00 |            |           -1.1
(1 row)

BUT LOUSY performance, and the following EXPLAIN:


                                  QUERY PLAN
--------------------------------------------------------------------------------
Merge Full Join  (cost=1542369.83..1618958.58 rows=6956994 width=32) (actual 
time=187176.408..201436.264 rows=1 loops=1)
  Merge Cond: (("outer".station_id = "inner".station_id) AND ("outer".timeobs = 
"inner".timeobs))
  Filter: ((COALESCE("outer".station_id, "inner".station_id) = 52981) AND 
(COALESCE("outer".timeobs, "inner".timeobs) = '2004-01-01 00:00:00'::timestamp without time zone))
  ->  Sort  (cost=1207913.44..1225305.93 rows=6956994 width=16) (actual 
time=145748.253..153851.607 rows=6956994 loops=1)
Sort Key: a.station_id, a.timeobs
->  Seq Scan on temp_dry_at_2m a  (cost=0.00..117549.94 rows=6956994 
width=16) (actual time=0.049..54226.770 rows=6956994 loops=1)
  ->  Sort  (cost=334456.38..340472.11 rows=2406292 width=16) (actual 
time=31668.876..34491.123 rows=2406292 loops=1)
Sort Key: b.station_id, b.timeobs
->  Seq Scan on temp_grass b  (cost=0.00..40658.92 rows=2406292 
width=16) (actual time=0.052..5484.489 rows=2406292 loops=1)
Total runtime: 201795.989 ms
(10 rows)

If I change the query (note the "b."s)

explain analyse SELECT b.station_id, b.timeobs,temp_grass, temp_dry_at_2m
   FROM temp_dry_at_2m a
   FULL OUTER JOIN temp_grass b
   USING (station_id, timeobs)
   WHERE b.station_id = 52981
 AND b.timeobs = '2004-1-1 0:0:0'

I seem to destroy the FULL OUTER JOIN and get wrong results (nothing).
If I had happened to use "a." instead of "b.", I would have gotten 
correct results (by accident).

The "a." variant gives this EXPLAIN:

QUERY PLAN

Nested Loop Left Join  (cost=0.00..11.97 rows=1 width=20) (actual 
time=0.060..0.067 rows=1 loops=1)
  ->  Index Scan using temp_dry_at_2m_idx on temp_dry_at_2m a  (cost=0.00..5.99 
rows=1 width=16) (actual time=0.033..0.036 rows=1 loops=1)
Index Cond: ((station_id = 52981) AND (timeobs = '2004-01-01 
00:00:00'::timestamp without time zone))
  ->  Index Scan using temp_grass_idx on temp_grass b  (cost=0.00..5.96 rows=1 
width=16) (actual time=0.018..0.021 rows=1 loops=1)
Index Cond: (("outer".station_id = b.station_id) AND ("outer".timeobs = 
b.timeobs))
Total runtime: 0.140 ms
(6 rows)

Why will PostgreSQL not use the same plan for both of these queries - 
they are virtually identical??

I have tried to formulate the problem with left joins, but this demands 
that I know which table has all the values (and thus has to go first), 
and in practice no such table exists.


TIA,
Kim Bisgaard.




Re: [PERFORM] full outer performance problem

2005-06-08 Thread Kim Bisgaard

Hi Bruno,

Thanks for the moral support! I feel so too - but I am confident it will 
show up soon.


W.r.t. your rewrite of the query, I get this "ERROR:  could not devise a 
query plan for the given query" but no further details - I will try google


Regards,
Kim.

Bruno Wolff III wrote:


On Wed, Jun 08, 2005 at 11:37:40 +0200, Kim Bisgaard <[EMAIL PROTECTED]> wrote:

Hi,

I'm having problems with the query optimizer and FULL OUTER JOIN on 
PostgreSQL 7.4. I cannot get it to use my indexes with full outer joins. 
I might be naive, but I think that it should be possible?


I have two BIG tables (virtually identical) with 3 NOT NULL columns 
Station_id, TimeObs, Temp_, with unique indexes on (Station_id, 
TimeObs) and valid ANALYSE (set statistics=100). I want to join the two 
tables with a FULL OUTER JOIN.


When I specify the query as:

SELECT station_id, timeobs,temp_grass, temp_dry_at_2m
  FROM temp_dry_at_2m a
  FULL OUTER JOIN temp_grass b 
  USING (station_id, timeobs)

  WHERE station_id = 52981
AND timeobs = '2004-1-1 0:0:0'

I get the correct results

 station_id |       timeobs       | temp_grass | temp_dry_at_2m
------------+---------------------+------------+----------------
      52944 | 2004-01-01 00:10:00 |            |           -1.1
(1 row)

BUT LOUSY performance, and the following EXPLAIN:


                                  QUERY PLAN
--------------------------------------------------------------------------------
Merge Full Join  (cost=1542369.83..1618958.58 rows=6956994 width=32) 
(actual time=187176.408..201436.264 rows=1 loops=1)
 Merge Cond: (("outer".station_id = "inner".station_id) AND 
 ("outer".timeobs = "inner".timeobs))
 Filter: ((COALESCE("outer".station_id, "inner".station_id) = 52981) AND 
 (COALESCE("outer".timeobs, "inner".timeobs) = '2004-01-01 
 00:00:00'::timestamp without time zone))
 ->  Sort  (cost=1207913.44..1225305.93 rows=6956994 width=16) (actual 
 time=145748.253..153851.607 rows=6956994 loops=1)

   Sort Key: a.station_id, a.timeobs
   ->  Seq Scan on temp_dry_at_2m a  (cost=0.00..117549.94 
   rows=6956994 width=16) (actual time=0.049..54226.770 rows=6956994 
   loops=1)
 ->  Sort  (cost=334456.38..340472.11 rows=2406292 width=16) (actual 
 time=31668.876..34491.123 rows=2406292 loops=1)

   Sort Key: b.station_id, b.timeobs
   ->  Seq Scan on temp_grass b  (cost=0.00..40658.92 rows=2406292 
   width=16) (actual time=0.052..5484.489 rows=2406292 loops=1)

Total runtime: 201795.989 ms
(10 rows)

Someone else will need to comment on why Postgres can't use a more
efficient plan. What I think will work for you is to restrict
the station_id and timeobs on each side and then do a full join.
You can try something like the sample query below (which hasn't been tested):
SELECT station_id, timeobs, temp_grass, temp_dry_at_2m
 FROM
   (SELECT station_id, timeobs, temp_dry_at_2m
 FROM temp_dry_at_2m
 WHERE
   station_id = 52981
   AND
   timeobs = '2004-1-1 0:0:0') a
   FULL OUTER JOIN
   (SELECT station_id, timeobs, temp_grass
 FROM temp_grass
 WHERE
   station_id = 52981
   AND
   timeobs = '2004-1-1 0:0:0') b
   USING (station_id, timeobs)



Re: [PERFORM] full outer performance problem

2005-06-08 Thread Kim Bisgaard
Quoting Tom Lane <[EMAIL PROTECTED]>:

> Kim Bisgaard <[EMAIL PROTECTED]> writes:
> > SELECT station_id, timeobs,temp_grass, temp_dry_at_2m
> > FROM temp_dry_at_2m a
> > FULL OUTER JOIN temp_grass b
> > USING (station_id, timeobs)
> > WHERE station_id = 52981
> >   AND timeobs = '2004-1-1 0:0:0'
>
> > explain analyse SELECT b.station_id, b.timeobs,temp_grass, temp_dry_at_2m
> > FROM temp_dry_at_2m a
> > FULL OUTER JOIN temp_grass b
> > USING (station_id, timeobs)
> > WHERE b.station_id = 52981
> >   AND b.timeobs = '2004-1-1 0:0:0'
>
> > Why will PostgreSQL not use the same plan for both these queries - they
> > are virtually identical??
>
> Because they're semantically completely different.  The second query is
> effectively a RIGHT JOIN, because join rows in which b is all-null will
> be thrown away by the WHERE.  The optimizer sees this (note your second
> plan doesn't use a Full Join step anywhere) and is able to produce a
> much better plan.  Full outer join is difficult to optimize, in part
> because we have no choice but to use a merge join for it --- the other
> join types don't support full join.
>
>   regards, tom lane
>


Yes, I am aware that they are not "identical" - they also give different
results - but the data necessary to compute the results is the same (0-2
rows, 0-1 row from each table), and thus they ideally have the potential
for similar performance - to my head anyway, but I may not have grasped
the complete picture yet :-)

Regards,
Kim.



Re: [PERFORM] full outer performance problem

2005-06-08 Thread Kim Bisgaard
Quoting Tom Lane <[EMAIL PROTECTED]>:

> Kim Bisgaard <[EMAIL PROTECTED]> writes:
> > W.r.t. your rewrite of the query, I get this "ERROR:  could not devise a
> > query plan for the given query" but no further details - I will try google
>
> Which PG version are you using again?  That should be fixed in 7.4.3
> and later.
>
>   regards, tom lane
>

It's 7.4.1. I am in the process (may take a while yet) of installing
8.0.3 on the same hardware in order to have a parallel system. Time is a
finite measure :-)

I must admit I would rather have the first query perform than have this
workaround function ;-)

Regards,
Kim.



[PERFORM] Is there any effect on other processes' performance after vacuumdb finishes?

2011-05-26 Thread Junghwe Kim
Hi.

First, many thanks for your work on PostgreSQL.

I have noticed that after executing 'vacuumdb -z', some other processes
cannot read their own message queues for 2-3 minutes.

The vacuum is executed every hour, and the processes have no relation to
PostgreSQL.

Is this possible?


[PERFORM] Planner selects slow "Bitmap Heap Scan" when "Index Scan" is faster

2012-04-05 Thread Kim Hansen
Hi All

I have a query where the planner makes a wrong cost estimate, it looks
like it underestimates the cost of a "Bitmap Heap Scan" compared to an
"Index Scan".

This it the two plans, I have also pasted them below:
  Slow (189ms): http://explain.depesz.com/s/2Wq
  Fast (21ms): http://explain.depesz.com/s/ThQ

I have run "VACUUM FULL VERBOSE ANALYZE". I have configured
shared_buffers and effective_cache_size; that didn't solve my problem -
the estimates stayed the same, though both queries got faster.

What can I do to fix the cost estimate?

Regards,
Kim Hansen




yield=> SELECT version();
version
---
 PostgreSQL 9.1.3 on x86_64-unknown-linux-gnu, compiled by
gcc-4.4.real (Debian 4.4.5-8) 4.4.5, 64-bit
(1 row)

yield=> explain analyze select "filtered_demands"."pol" as "c0" from
"demands"."filtered_demands" as "filtered_demands" where
("filtered_demands"."pod" = 'VELAG') group by "filtered_demands"."pol"
order by "filtered_demands"."pol" ASC NULLS LAST;

 QUERY PLAN
---
 Sort  (cost=38564.80..38564.80 rows=2 width=6) (actual
time=188.987..189.003 rows=221 loops=1)
   Sort Key: pol
   Sort Method: quicksort  Memory: 35kB
   ->  HashAggregate  (cost=38564.77..38564.79 rows=2 width=6) (actual
time=188.796..188.835 rows=221 loops=1)
 ->  Bitmap Heap Scan on filtered_demands
(cost=566.23..38503.77 rows=24401 width=6) (actual time=6.501..182.634
rows=18588 loops=1)
   Recheck Cond: (pod = 'VELAG'::text)
   ->  Bitmap Index Scan on filtered_demands_pod_pol_idx
(cost=0.00..560.12 rows=24401 width=0) (actual time=4.917..4.917
rows=18588 loops=1)
 Index Cond: (pod = 'VELAG'::text)
 Total runtime: 189.065 ms
(9 rows)

yield=> set enable_bitmapscan = false;
SET
yield=> explain analyze select "filtered_demands"."pol" as "c0" from
"demands"."filtered_demands" as "filtered_demands" where
("filtered_demands"."pod" = 'VELAG') group by "filtered_demands"."pol"
order by "filtered_demands"."pol" ASC NULLS LAST;

 QUERY PLAN
--
 Group  (cost=0.00..76534.33 rows=2 width=6) (actual
time=0.028..20.823 rows=221 loops=1)
   ->  Index Scan using filtered_demands_pod_pol_idx on
filtered_demands  (cost=0.00..76473.33 rows=24401 width=6) (actual
time=0.027..17.174 rows=18588 loops=1)
 Index Cond: (pod = 'VELAG'::text)
 Total runtime: 20.877 ms
(4 rows)

yield=>

-- 
Kim Rydhof Thor Hansen
Vadgårdsvej 3, 2. tv.
2860 Søborg
Phone: +45 3091 2437



Re: [PERFORM] Planner selects slow "Bitmap Heap Scan" when "Index Scan" is faster

2012-04-06 Thread Kim Hansen
On Thu, Apr 5, 2012 at 17:34, Kevin Grittner wrote:
> Kim Hansen  wrote:
>
>> I have a query where the planner makes a wrong cost estimate, it
>> looks like it underestimates the cost of a "Bitmap Heap Scan"
>> compared to an "Index Scan".
>
>> What can I do to fix the cost estimate?
>
> Could you try running the query with cpu_tuple_cost = 0.05 and let
> us know how that goes?
>

It looks like it just increased the estimated cost of both queries by
about 1000.

Regards,
Kim


===

yield=> explain analyze select "filtered_demands"."pol" as "c0" from
"demands"."filtered_demands" as "filtered_demands" where
("filtered_demands"."pod" = 'VELAG') group by "filtered_demands"."pol"
order by "filtered_demands"."pol" ASC NULLS LAST;

 QUERY PLAN
---
 Sort  (cost=39540.92..39540.92 rows=2 width=6) (actual
time=186.833..186.858 rows=221 loops=1)
   Sort Key: pol
   Sort Method: quicksort  Memory: 35kB
   ->  HashAggregate  (cost=39540.81..39540.91 rows=2 width=6) (actual
time=186.643..186.678 rows=221 loops=1)
 ->  Bitmap Heap Scan on filtered_demands
(cost=566.23..39479.81 rows=24401 width=6) (actual time=6.154..180.654
rows=18588 loops=1)
   Recheck Cond: (pod = 'VELAG'::text)
   ->  Bitmap Index Scan on filtered_demands_pod_pol_idx
(cost=0.00..560.12 rows=24401 width=0) (actual time=4.699..4.699
rows=18588 loops=1)
 Index Cond: (pod = 'VELAG'::text)
 Total runtime: 186.912 ms
(9 rows)

yield=> set enable_bitmapscan = false;
SET
yield=> explain analyze select "filtered_demands"."pol" as "c0" from
"demands"."filtered_demands" as "filtered_demands" where
("filtered_demands"."pod" = 'VELAG') group by "filtered_demands"."pol"
order by "filtered_demands"."pol" ASC NULLS LAST;

 QUERY PLAN
------
 Group  (cost=0.00..77510.37 rows=2 width=6) (actual
time=0.029..20.361 rows=221 loops=1)
   ->  Index Scan using filtered_demands_pod_pol_idx on
filtered_demands  (cost=0.00..77449.37 rows=24401 width=6) (actual
time=0.027..16.859 rows=18588 loops=1)
 Index Cond: (pod = 'VELAG'::text)
 Total runtime: 20.410 ms
(4 rows)

yield=>


-- 
Kim Rydhof Thor Hansen
Vadgårdsvej 3, 2. tv.
2860 Søborg
Phone: +45 3091 2437



Re: [PERFORM] Planner selects slow "Bitmap Heap Scan" when "Index Scan" is faster

2012-04-07 Thread Kim Hansen
Hi all

On Fri, Apr 6, 2012 at 19:11, Jeff Janes  wrote:
> On Wed, Apr 4, 2012 at 6:47 AM, Kim Hansen  wrote:
>> Hi All
>>
>> I have a query where the planner makes a wrong cost estimate, it looks
>> like it underestimates the cost of a "Bitmap Heap Scan" compared to an
>> "Index Scan".
>>
>> This it the two plans, I have also pasted them below:
>>  Slow (189ms): http://explain.depesz.com/s/2Wq
>>  Fast (21ms): http://explain.depesz.com/s/ThQ
>
> Could you do explain (analyze, buffers)?

I have done that now; the log is pasted in below. It looks like every
buffer fetched is a hit. I would think that PostgreSQL should know that,
as almost nothing happens on the server and effective_cache_size is
configured to 8GB.

> Did you run these queries multiple times in both orders?  If you just
> ran them once each, in the order indicated, then the bitmap scan may
> have done the hard work of reading all the needed buffers into cache,
> and the index scan then got to enjoy that cache.

I have run the queries a few times in order to warm up the caches, the
queries stabilise on 20ms and 180ms.

Regards,
Kim



yield=> explain (analyze,buffers) select "filtered_demands"."pol" as
"c0" from "demands"."filtered_demands" as "filtered_demands" where
("filtered_demands"."pod" = 'VELAG') group by "filtered_demands"."pol"
order by "filtered_demands"."pol" ASC NULLS LAST;

 QUERY PLAN
---
 Sort  (cost=38564.80..38564.80 rows=2 width=6) (actual
time=185.497..185.520 rows=221 loops=1)
   Sort Key: pol
   Sort Method: quicksort  Memory: 35kB
   Buffers: shared hit=14969
   ->  HashAggregate  (cost=38564.77..38564.79 rows=2 width=6) (actual
time=185.303..185.343 rows=221 loops=1)
 Buffers: shared hit=14969
 ->  Bitmap Heap Scan on filtered_demands
(cost=566.23..38503.77 rows=24401 width=6) (actual time=6.119..179.056
rows=18588 loops=1)
   Recheck Cond: (pod = 'VELAG'::text)
   Buffers: shared hit=14969
   ->  Bitmap Index Scan on filtered_demands_pod_pol_idx
(cost=0.00..560.12 rows=24401 width=0) (actual time=4.661..4.661
rows=18588 loops=1)
 Index Cond: (pod = 'VELAG'::text)
 Buffers: shared hit=74
 Total runtime: 185.577 ms
(13 rows)

yield=> set enable_bitmapscan = false;
SET
yield=> explain (analyze,buffers) select "filtered_demands"."pol" as
"c0" from "demands"."filtered_demands" as "filtered_demands" where
("filtered_demands"."pod" = 'VELAG') group by "filtered_demands"."pol"
order by "filtered_demands"."pol" ASC NULLS LAST;

 QUERY PLAN
--
 Group  (cost=0.00..76534.33 rows=2 width=6) (actual
time=0.029..20.202 rows=221 loops=1)
   Buffers: shared hit=18386
   ->  Index Scan using filtered_demands_pod_pol_idx on
filtered_demands  (cost=0.00..76473.33 rows=24401 width=6) (actual
time=0.027..16.455 rows=18588 loops=1)
 Index Cond: (pod = 'VELAG'::text)
 Buffers: shared hit=18386
 Total runtime: 20.246 ms
(6 rows)



-- 
Kim Rydhof Thor Hansen
Vadgårdsvej 3, 2. tv.
2860 Søborg
Phone: +45 3091 2437



Re: [PERFORM] Planner selects slow "Bitmap Heap Scan" when "Index Scan" is faster

2012-04-13 Thread Kim Hansen
On Tue, Apr 10, 2012 at 04:59, Jeff Janes  wrote:
> On Fri, Apr 6, 2012 at 3:09 PM, Kim Hansen  wrote:
>
>> I have run the queries a few times in order to warm up the caches, the
>> queries stabilise on 20ms and 180ms.
>
> My first curiosity is not why the estimate is too good for Bitmap
> Index Scan, but rather why the actual execution is too poor.  As far
> as I can see the only explanation for the poor execution is that the
> bitmap scan has gone lossy, so that every tuple in every touched block
> needs to be rechecked against the where clause.  If that is the case,
> it suggests that your work_mem is quite small.
>
> In 9.2, explain analyze will report the number of tuples filtered out
> by rechecking, but that isn't reported in your version.
>
> It looks like the planner makes no attempt to predict when a bitmap
> scan will go lossy and then penalize it for the extra rechecks it will
> do.  Since it doesn't know it will be carrying out those extra checks,
> you can't just increase the tuple or operator costs factors.

You are right, when I increase the work_mem from 1MB to 2MB the time
decreases from 180ms to 30ms for the slow query. I have now configured
the server to 10MB work_mem.
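For anyone following along, the fix was just (shown per-session here; we
put it in postgresql.conf for the server):

  SET work_mem = '10MB';

and then re-running the EXPLAIN (ANALYZE, BUFFERS) from up-thread.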

> So that may explain why the bitmap is not getting penalized for its
> extra CPU time.  But that doesn't explain why the estimated cost is
> substantially lower than the index scan.  That is probably because the
> bitmap scan expects it is doing more sequential IO and less random IO.
>  You could cancel that advantage be setting random_page_cost to about
> the same as seq_page_cost (which since you indicated most data will be
> cached, would be an appropriate thing to do regardless of this
> specific issue).

I have set seq_page_cost and random_page_cost to 0.1 in order to
indicate that data is cached, the system now selects the faster index
scan.
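For the archives, the settings were (session-level sketch; ours went
into postgresql.conf):

  SET seq_page_cost = 0.1;
  SET random_page_cost = 0.1;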

Thanks for your help,
-- 
Kim Rydhof Thor Hansen
Vadgårdsvej 3, 2. tv.
2860 Søborg
Phone: +45 3091 2437



[PERFORM] LIMIT on partitioned-table!?

2011-02-15 Thread Kim A. Brandt
0:00'::timestamp 
without time zone) AND (ts < '2011-02-15 15:00:00'::timestamp without time zone))
  ->  Bitmap Index Scan on 
idx_unitstat_y2010m12_nodeid_gps_ts  (cost=0.00..127.55 rows=4684 width=0) (actual 
time=225.009..225.009 rows=2014 loops=1)
Index Cond: ((nodeid)::text = 'abcd'::text)
->  Bitmap Heap Scan on unitstat_y2010m10 unitstat  
(cost=101.74..9686.81 rows=4987 width=194) (actual time=488.130..35826.742 
rows=25279 loops=1)
  Recheck Cond: ((nodeid)::text = 'abcd'::text)
  Filter: ((ts > '2010-01-01 00:00:00'::timestamp 
without time zone) AND (ts < '2011-02-15 15:00:00'::timestamp without time zone))
  ->  Bitmap Index Scan on 
idx_unitstat_y2010m10_nodeid_gps_ts  (cost=0.00..100.49 rows=4988 width=0) (actual 
time=472.796..472.796 rows=25279 loops=1)
Index Cond: ((nodeid)::text = 'abcd'::text)
->  Bitmap Heap Scan on unitstat_y2010m09 unitstat  
(cost=489.56..49567.74 rows=27466 width=194) (actual time=185.198..12753.315 
rows=31099 loops=1)
  Recheck Cond: ((nodeid)::text = 'abcd'::text)
  Filter: ((ts > '2010-01-01 00:00:00'::timestamp 
without time zone) AND (ts < '2011-02-15 15:00:00'::timestamp without time zone))
  ->  Bitmap Index Scan on 
idx_unitstat_y2010m09_nodeid_gps_ts  (cost=0.00..482.69 rows=27472 width=0) 
(actual time=158.072..158.072 rows=31099 loops=1)
Index Cond: ((nodeid)::text = 'abcd'::text)
->  Index Scan using 
fki_unitstat_y2010m08_nodeid_ts_fkey on unitstat_y2010m08 unitstat  
(cost=0.00..9353.76 rows=4824 width=194) (actual time=31.351..10259.090 rows=17606 
loops=1)
  Index Cond: (((nodeid)::text = 'abcd'::text) AND (ts 
> '2010-01-01 00:00:00'::timestamp without time zone) AND (ts < '2011-02-15 
15:00:00'::timestamp without time zone))
->  Index Scan using 
fki_unitstat_y2010m07_nodeid_ts_fkey on unitstat_y2010m07 unitstat  
(cost=0.00..8686.72 rows=4492 width=194) (actual time=41.572..9636.335 rows=9511 
loops=1)
  Index Cond: (((nodeid)::text = 'abcd'::text) AND (ts 
> '2010-01-01 00:00:00'::timestamp without time zone) AND (ts < '2011-02-15 
15:00:00'::timestamp without time zone))
->  Bitmap Heap Scan on unitstat_y2010m06 unitstat  
(cost=311.50..32142.18 rows=17406 width=194) (actual time=113.857..12136.570 
rows=17041 loops=1)
  Recheck Cond: ((nodeid)::text = 'abcd'::text)
  Filter: ((ts > '2010-01-01 00:00:00'::timestamp 
without time zone) AND (ts < '2011-02-15 15:00:00'::timestamp without time zone))
  ->  Bitmap Index Scan on 
idx_unitstat_y2010m06_nodeid_gps_ts  (cost=0.00..307.15 rows=17410 width=0) 
(actual time=91.638..91.638 rows=17041 loops=1)
Index Cond: ((nodeid)::text = 'abcd'::text)
->  Index Scan using 
fki_unitstat_y2010m05_nodeid_ts_fkey on unitstat_y2010m05 unitstat  
(cost=0.00..11942.82 rows=6279 width=193) (actual time=62.264..19887.675 
rows=19246 loops=1)
  Index Cond: (((nodeid)::text = 'abcd'::text) AND (ts 
> '2010-01-01 00:00:00'::timestamp without time zone) AND (ts < '2011-02-15 
15:00:00'::timestamp without time zone))
->  Index Scan using 
fki_unitstat_y2010m04_nodeid_ts_fkey on unitstat_y2010m04 unitstat  
(cost=0.00..11840.93 rows=6194 width=193) (actual time=52.735..17302.361 
rows=21936 loops=1)
  Index Cond: (((nodeid)::text = 'abcd'::text) AND (ts 
> '2010-01-01 00:00:00'::timestamp without time zone) AND (ts < '2011-02-15 
15:00:00'::timestamp without time zone))
->  Index Scan using 
fki_unitstat_y2010m03_nodeid_ts_fkey on unitstat_y2010m03 unitstat  
(cost=0.00..11664.36 rows=6101 width=194) (actual time=66.613..17541.374 
rows=15471 loops=1)
  Index Cond: (((nodeid)::text = 'abcd'::text) AND (ts 
> '2010-01-01 00:00:00'::timestamp without time zone) AND (ts < '2011-02-15 
15:00:00'::timestamp without time zone))
Total runtime: 205855.569 ms


Regards,

Kim



Re: [PERFORM] LIMIT on partitioned-table!?

2011-02-15 Thread Kim A. Brandt

Thank you Shaun,

removing the ORDER BY worked. But I am afraid to ask this: how can I 
order by partition? It seems that the planner has picked a random(!?) 
order of partitions to select from. The returned records from the 
selected partition are correctly sorted by the index, though.

On 2011-02-15 15:49, Shaun Thomas wrote:

On 02/15/2011 08:23 AM, Kim A. Brandt wrote:


does `postgres (PostgreSQL) 8.4.5' use the LIMIT of a query when it
is run on a partitioned table, or am I doing something wrong? It looks
as if postgres queries all partitions and then LIMITs the records
afterwards!? This results in a long (>3 minutes) running query. What
can I do to optimise this?


Make sure you have constraint_exclusion set to 'on' in your config. Also, what 
are your checks for your partitions? You've got a pretty wide range in your 
'ts' checks, so if you're using them as your partition definition, you're not 
helping yourself.


The parameter `constraint_exclusion' is set to `partition'. Postgres is on 
FreeBSD.

My checks (if I understand you right) are as follows:

CREATE TABLE flexserver.unitstat_y2011m02
(
  ts timestamp without time zone NOT NULL,
  nodeid character varying(10) NOT NULL,
  gps_ts timestamp without time zone NOT NULL,
  ...
  CONSTRAINT unitstat_y2011m02_ts_check CHECK (ts >= '2011-02-01 
00:00:00'::timestamp without time zone AND ts < '2011-03-01 00:00:00'::timestamp 
without time zone)
)
INHERITS (flexserver.unitstat);

Each partition is constrained to one month.
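(A quick way to verify the exclusion works - a sketch using the schema 
above: with a predicate that stays inside one month's CHECK range, 
EXPLAIN should show only that partition, plus the empty parent, being 
scanned:

  EXPLAIN SELECT * FROM flexserver.unitstat
   WHERE nodeid = 'abcd'
     AND ts >= '2011-02-01' AND ts < '2011-02-15 15:00:00';
)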

About the wide range, I am aware of that. This probably has to change 
anyway!? So the current (and probably final) solution is to use a 
narrower search range. Thank you for the hint.


The main issue might just be that you've used an order clause. LIMIT 1000 or 
not, even if it can restrict the result set based on your CHECK criteria, it'll 
still need to select every matching row from every matched partition, order the 
results, and chop off the first 1000.


That was it. But how can one order by partition if one does a wide-range 
search over multiple partitions?

The new query and EXPLAIN ANALYSE output is:

SELECT *
FROM flexserver.unitstat
WHERE nodeid = 'abcd'
AND ts > '2010-01-01 00:00:00'
AND ts < '2011-02-15 15:00:00'
--ORDER BY nodeid, ts
LIMIT 1000;


Limit  (cost=0.00..1862.46 rows=1000 width=194) (actual time=2.569..18.948 
rows=1000 loops=1)
  ->  Result  (cost=0.00..225611.08 rows=121136 width=194) (actual 
time=2.566..15.412 rows=1000 loops=1)
->  Append  (cost=0.00..225611.08 rows=121136 width=194) (actual 
time=2.558..11.243 rows=1000 loops=1)
  ->  Seq Scan on unitstat  (cost=0.00..14.90 rows=1 width=258) 
(actual time=0.003..0.003 rows=0 loops=1)
Filter: ((ts > '2010-01-01 00:00:00'::timestamp without 
time zone) AND (ts < '2011-02-15 15:00:00'::timestamp without time zone) AND 
((nodeid)::text = 'abcd'::text))
  ->  Bitmap Heap Scan on unitstat_y2011m01 unitstat  
(cost=116.47..8097.17 rows=4189 width=194) (actual time=2.550..7.701 rows=1000 
loops=1)
Recheck Cond: ((nodeid)::text = 'abcd'::text)
Filter: ((ts > '2010-01-01 00:00:00'::timestamp without 
time zone) AND (ts < '2011-02-15 15:00:00'::timestamp without time zone))
->  Bitmap Index Scan on 
idx_unitstat_y2011m01_nodeid_gps_ts  (cost=0.00..115.42 rows=4190 width=0) (actual 
time=1.706..1.706 rows=5377 loops=1)
  Index Cond: ((nodeid)::text = 'abcd'::text)
  ->  Bitmap Heap Scan on unitstat_y2011m02 unitstat  
(cost=52.92..3744.97 rows=1934 width=194) (never executed)
Recheck Cond: ((nodeid)::text = 'abcd'::text)
Filter: ((ts > '2010-01-01 00:00:00'::timestamp without 
time zone) AND (ts < '2011-02-15 15:00:00'::timestamp without time zone))
->  Bitmap Index Scan on 
idx_unitstat_y2011m02_nodeid_gps_ts  (cost=0.00..52.44 rows=1935 width=0) (never 
executed)
  Index Cond: ((nodeid)::text = 'abcd'::text)
  ->  Index Scan using fki_unitstat_y2010m02_nodeid_ts_fkey on 
unitstat_y2010m02 unitstat  (cost=0.00..10179.11 rows=5257 width=193) (never 
executed)
Index Cond: (((nodeid)::text = 'abcd'::text) AND (ts > 
'2010-01-01 00:00:00'::timestamp without time zone) AND (ts < '2011-02-15 
15:00:00'::timestamp without time zone))
  ->  Index Scan using fki_unitstat_y2010m01_nodeid_ts_fkey on 
unitstat_y2010m01 unitstat  (cost=0.00..1032

Re: [PERFORM] LIMIT on partitioned-table!?

2011-02-15 Thread Kim A. Brandt

Thank you Marti,

I will go with the ``reduced number of matched rows'' approach and will 
naturally be waiting expectantly for postgres 9.1.


Kind regards,

Kim



On 2011-02-15 22:13, Marti Raudsepp wrote:

On Tue, Feb 15, 2011 at 21:33, Kim A. Brandt  wrote:

removing the ORDER BY worked. But I am afraid to ask this: how can I order
by partition? It seems that the planner has picked a random(!?) order of
partitions to select from. The returned records, from the selected
partition, are correctly sorted by the index though.


If a single query accesses more than one partition, PostgreSQL
currently cannot read the values in index-sorted order. Hence with
ORDER BY and LIMIT, PostgreSQL cannot return *any* results before it
has read all matching rows and then sorted them. Adding a LIMIT
doesn't help much. Your only bet is to reduce the number of matched
rows, or make sure that you only access a single partition.

Increasing work_mem may speed up the sort step if you're hitting the
disk (EXPLAIN ANALYZE VERBOSE will tell you whether that's the case).

This will change in PostgreSQL 9.1 which has a new Merge Append plan node.
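(Until then, one workaround is to push the ORDER BY/LIMIT into each
partition by hand, so each branch can use its index and only the small
union gets re-sorted - a sketch against the schema from this thread,
listing the partitions explicitly:

(SELECT * FROM flexserver.unitstat_y2011m02
  WHERE nodeid = 'abcd'
    AND ts > '2010-01-01 00:00:00' AND ts < '2011-02-15 15:00:00'
  ORDER BY nodeid, ts LIMIT 1000)
UNION ALL
(SELECT * FROM flexserver.unitstat_y2011m01
  WHERE nodeid = 'abcd'
    AND ts > '2010-01-01 00:00:00' AND ts < '2011-02-15 15:00:00'
  ORDER BY nodeid, ts LIMIT 1000)
-- ...one branch per partition in the range...
ORDER BY nodeid, ts
LIMIT 1000;

At most 1000 rows per branch reach the final sort, instead of every
matching row.)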

Regards,
Marti

