Re: [PERFORM] How to use indexes for GROUP BY

2011-01-24 Thread Dimi Paun
On Mon, 2011-01-24 at 12:33 -0700, Scott Marlowe wrote: > As another poster observed, you're running an ancient version of pgsql > from a performance perspective. Upgrading to 8.4 or 9.0 would make a > huge difference in overall performance, not just with one or two > queries. Thanks for the tri

[PERFORM] Bloat issue on 8.3; autovac ignores HOT page splits?

2011-01-24 Thread Josh Berkus
Folks, I'm doing a postmortem on an 8.3 database which recently had to be rebuilt. The database was over 200% bloated ... 176GB as opposed to dump/reload size of 55GB. What I find really interesting is *which* tables were bloated. Consider these two tables, for example, which consist of one row

Re: [PERFORM] Possible to improve query plan?

2011-01-24 Thread Jeremy Palmer
Thanks heaps for the advice. I will do some benchmarks to see how long it takes to cluster all of the database tables. Cheers, Jeremy -Original Message- From: Kevin Grittner [mailto:kevin.gritt...@wicourts.gov] Sent: Tuesday, 25 January 2011 1:02 p.m. To: Jeremy Palmer; Tom Lane Cc: Rob

Re: [PERFORM] Possible to improve query plan?

2011-01-24 Thread Kevin Grittner
Jeremy Palmer wrote: > My only question is how often will I need to re-cluster the > table, because it comes at quite a cost. I probably should have mentioned that the CLUSTER will run faster if the data is already mostly in the right sequence. You'll be doing a nearly sequential pass over th

Re: [PERFORM] Possible to improve query plan?

2011-01-24 Thread Kevin Grittner
Jeremy Palmer wrote: > Kevin I've now clustered the table. And the performance did > increase quite a bit. Yeah, that's enough to notice the difference. > My only question is how often will I need to re-cluster the table, > because it comes at quite a cost. The setup I'm running will mean >

Re: [PERFORM] Possible to improve query plan?

2011-01-24 Thread Jeremy Palmer
Might be a chance on 9.0 in a couple of weeks, when I do an upgrade on one our dev boxes. Kevin I've now clustered the table. And the performance did increase quite a bit. My only question is how often will I need to re-cluster the table, because it comes at quite a cost. The setup I'm running

Re: [PERFORM] How to use indexes for GROUP BY

2011-01-24 Thread hubert depesz lubaczewski
On Mon, Jan 24, 2011 at 01:29:01PM -0500, Dimi Paun wrote: > Hi folks, > > I have a table like so: > > create table tagRecord ( > uid varchar(60) primary key, > [bunch of other fields] > location varchar(32), > creationTStimestamp > ); > create index idx_tagda

Re: [PERFORM] How to use indexes for GROUP BY

2011-01-24 Thread Scott Marlowe
On Mon, Jan 24, 2011 at 11:29 AM, Dimi Paun wrote: Two very quick points: > tts_server_db=# explain analyze select location, max(creationTS) from > tagrecord group by location; >                                                       QUERY PLAN > -

Re: [PERFORM] How to use indexes for GROUP BY

2011-01-24 Thread Shaun Thomas
On 01/24/2011 12:29 PM, Dimi Paun wrote: I want to simply get the latest "creationTS" for each location, but that seems to result in a full table scan: tts_server_db=# explain analyze select location, max(creationTS) from tagrecord group by location; Try this, it *might* work: select DISTINC

[PERFORM] How to use indexes for GROUP BY

2011-01-24 Thread Dimi Paun
Hi folks, I have a table like so: create table tagRecord ( uid varchar(60) primary key, [bunch of other fields] location varchar(32), creationTStimestamp ); create index idx_tagdata_loc_creationTS on tagRecord(location, creationTS); The number of individual v

Re: [PERFORM] Possible to improve query plan?

2011-01-24 Thread Kevin Grittner
Tom Lane wrote: > Um, that looks like 90 usec per row, not msec. Oh, right. Well, having to do a random heap access for 1% of the rows would pretty much explain the run time, then. -Kevin -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your

Re: [PERFORM] Possible to improve query plan?

2011-01-24 Thread Tom Lane
"Kevin Grittner" writes: > Robert Haas wrote: >> But isn't 13.5 seconds awfully slow to scan 149557 rows? The sort >> is sorting 23960kB. Dividing that by 149557 rows gives ~169 >> bytes/per row > You're right. I would expect 9 ms as per tuple as a worst case if > it doesn't need to go to TO

Re: [PERFORM] Possible to improve query plan?

2011-01-24 Thread Tom Lane
Robert Haas writes: > But isn't 13.5 seconds awfully slow to scan 149557 rows? Depends on how many physical blocks they're scattered across, which is hard to tell from this printout. And on how many of the blocks are already in cache, and what sort of disk hardware he's got, etc. > Any chance t

Re: [PERFORM] Possible to improve query plan?

2011-01-24 Thread Kevin Grittner
Robert Haas wrote: > But isn't 13.5 seconds awfully slow to scan 149557 rows? The sort > is sorting 23960kB. Dividing that by 149557 rows gives ~169 > bytes/per row You're right. I would expect 9 ms as per tuple as a worst case if it doesn't need to go to TOAST data. Caching, multiple rows

Re: [PERFORM] Possible to improve query plan?

2011-01-24 Thread Robert Haas
On Mon, Jan 17, 2011 at 11:48 AM, Kevin Grittner wrote: > Jeremy Palmer  wrote: > >>   WHERE ( >>       (_revision_created <= 16 >>        AND _revision_expired > 16 >>        AND _revision_expired <= 40) >>    OR (_revision_created > 16 >>        AND _revision_created <= 40)) > >> -> Bitmap Heap

Re: [PERFORM] "NOT IN" substantially slower in 9.0.2 than 8.3.13 - NOT EXISTS runs fast in both 8.3.13 and 9.0.2

2011-01-24 Thread Achilleas Mantzios
Στις Friday 21 January 2011 22:22:24 ο/η Tom Lane έγραψε: > "Kevin Grittner" writes: > > Tom Lane wrote: > >> If it should be less than 1, then what? > > > 1 - (estimated tuples / estimated distinct values) ? > > Uh, no. The number we're after is the probability that an outer tuple > has at l