Re: [PERFORM] Searching for the cause of a bad plan

2007-09-28 Thread Csaba Nagy
> Just an idea, but with the 8.3 concurrent scan support would it be > possible to hang a more in depth analyze over exisiting sequential > scans. Then it would be a lower cost to have higher resolution in > the statistics because the I/O component would be hidden. The biggest problem with that is

Re: [PERFORM] Searching for the cause of a bad plan

2007-09-28 Thread Csaba Nagy
On Thu, 2007-09-27 at 11:07 -0700, Ron Mayer wrote: > Csaba Nagy wrote: > > > > Well, my problem was actually solved by rising the statistics target, > > Would it do more benefit than harm if postgres increased the > default_statistics_target? > > I see a fair number of people (myself included)

Re: [PERFORM] Searching for the cause of a bad plan

2007-09-27 Thread Ron Mayer
Csaba Nagy wrote: > > Well, my problem was actually solved by rising the statistics target, Would it do more benefit than harm if postgres increased the default_statistics_target? I see a fair number of people (myself included) asking questions who's resolution was to ALTER TABLE SET STATISTICS;

Re: [PERFORM] Searching for the cause of a bad plan

2007-09-27 Thread Simon Riggs
On Thu, 2007-09-27 at 10:40 -0400, Tom Lane wrote: > And yet there's another trap here: if the parameter you passed in > chanced to be one of the very common values, a plan that was optimized > for a small number of matches would perform terribly. I wonder could we move prepare_threshold onto the

Re: [PERFORM] Searching for the cause of a bad plan

2007-09-27 Thread Csaba Nagy
On Thu, 2007-09-27 at 10:40 -0400, Tom Lane wrote: > And yet there's another trap here: if the parameter you passed in > chanced to be one of the very common values, a plan that was optimized > for a small number of matches would perform terribly. > > We've speculated about trying to deal with the

Re: [PERFORM] Searching for the cause of a bad plan

2007-09-27 Thread Tom Lane
Csaba Nagy <[EMAIL PROTECTED]> writes: > On Wed, 2007-09-26 at 11:22 -0400, Tom Lane wrote: >> ... how >> many values of "a" are there really, and what's the true distribution of >> counts? > table_a has 23366 distinct values. Some statistics (using R): >> summary(table_a_histogram) >a

Re: [PERFORM] Searching for the cause of a bad plan

2007-09-27 Thread Csaba Nagy
On Wed, 2007-09-26 at 11:22 -0400, Tom Lane wrote: > ... how > many values of "a" are there really, and what's the true distribution of > counts? table_a has 23366 distinct values. Some statistics (using R): > summary(table_a_histogram) a count Min. : 7857

Re: [PERFORM] Searching for the cause of a bad plan

2007-09-26 Thread Tom Lane
Csaba Nagy <[EMAIL PROTECTED]> writes: > db=# analyze verbose temp_table_a; > INFO: analyzing "public.temp_table_a" > INFO: "temp_table_a": scanned 3000 of 655299 pages, containing 1887000 live > rows and 0 dead rows; 3000 rows in sample, 412183071 estimated total rows Hmm. So the bottom line

Re: [PERFORM] Searching for the cause of a bad plan

2007-09-26 Thread Csaba Nagy
> Csaba, please can you copy that data into fresh tables, re-ANALYZE and > then re-post the EXPLAINs, with stats data. Here you go, fresh experiment attached. Cheers, Csaba. db=# \d temp_table_a Table "public.temp_table_a" Column | Type | Modifiers ++--- a |

Re: [PERFORM] Searching for the cause of a bad plan

2007-09-24 Thread Simon Riggs
On Mon, 2007-09-24 at 16:04 +0200, Csaba Nagy wrote: > On Mon, 2007-09-24 at 14:27 +0100, Simon Riggs wrote: > > Csaba, please can you copy that data into fresh tables, re-ANALYZE and > > then re-post the EXPLAINs, with stats data. > > Well, I can of course. I actually tried to generate some rando

Re: [PERFORM] Searching for the cause of a bad plan

2007-09-24 Thread Csaba Nagy
On Mon, 2007-09-24 at 14:27 +0100, Simon Riggs wrote: > Csaba, please can you copy that data into fresh tables, re-ANALYZE and > then re-post the EXPLAINs, with stats data. Well, I can of course. I actually tried to generate some random data with similar record count and relations between the tabl

Re: [PERFORM] Searching for the cause of a bad plan

2007-09-24 Thread Simon Riggs
On Fri, 2007-09-21 at 19:30 -0400, Tom Lane wrote: > Simon Riggs <[EMAIL PROTECTED]> writes: > > That's not my perspective. If the LIMIT had been applied accurately to > > the cost then the hashjoin would never even have been close to the > > nested join in the first place. > > [ shrug... ] Your

Re: [PERFORM] Searching for the cause of a bad plan

2007-09-21 Thread Tom Lane
Simon Riggs <[EMAIL PROTECTED]> writes: > That's not my perspective. If the LIMIT had been applied accurately to > the cost then the hashjoin would never even have been close to the > nested join in the first place. [ shrug... ] Your perspective is mistaken. There is nothing wrong with the way t

Re: [PERFORM] Searching for the cause of a bad plan

2007-09-21 Thread Simon Riggs
On Fri, 2007-09-21 at 13:53 -0400, Tom Lane wrote: > Simon Riggs <[EMAIL PROTECTED]> writes: > > On Fri, 2007-09-21 at 12:08 -0400, Tom Lane wrote: > >> The reason you get a bad plan is that this rowcount estimate is so far > >> off: > > > That's true, but its not relevant, > > Yes it is --- the

Re: [PERFORM] Searching for the cause of a bad plan

2007-09-21 Thread Tom Lane
Simon Riggs <[EMAIL PROTECTED]> writes: > On Fri, 2007-09-21 at 12:08 -0400, Tom Lane wrote: >> The reason you get a bad plan is that this rowcount estimate is so far >> off: > That's true, but its not relevant, Yes it is --- the reason it wants to use a hashjoin instead of a nestloop is exactly

Re: [PERFORM] Searching for the cause of a bad plan

2007-09-21 Thread Simon Riggs
On Fri, 2007-09-21 at 12:08 -0400, Tom Lane wrote: > Csaba Nagy <[EMAIL PROTECTED]> writes: > > Looking at Plan 2, it looks like the "limit" step is estimating wrongly > > it's cost. > > The reason you get a bad plan is that this rowcount estimate is so far > off: That's true, but its not relevan

Re: [PERFORM] Searching for the cause of a bad plan

2007-09-21 Thread Simon Riggs
On Fri, 2007-09-21 at 16:26 +0200, Csaba Nagy wrote: > [snip] > > Ok, I was not able to follow your explanation, it's too deep for me into > what the planner does... I'm thinking that this case is too narrow to do too much with, when I think about how we might do what I proposed. OTOH this isn't

Re: [PERFORM] Searching for the cause of a bad plan

2007-09-21 Thread Tom Lane
Csaba Nagy <[EMAIL PROTECTED]> writes: > Looking at Plan 2, it looks like the "limit" step is estimating wrongly > it's cost. The reason you get a bad plan is that this rowcount estimate is so far off: >-> Index Scan using pk_table_a on table_a ta > (cost=0.00..324786.18 rows=3

Re: [PERFORM] Searching for the cause of a bad plan

2007-09-21 Thread Csaba Nagy
> OK, I can confirm that. I set the statistics target for column "a" on > table_a to 1000, analyzed, and got the plan below. The only downside is > that analyze became quite expensive on table_a, it took 15 minutes and > touched half of the pages... I will experiment with lower settings, > maybe it

Re: [PERFORM] Searching for the cause of a bad plan

2007-09-21 Thread Csaba Nagy
[snip] Ok, I was not able to follow your explanation, it's too deep for me into what the planner does... > Incidentally, the way out of this is to improve the stats by setting > stats target = 1000 on column a of ta. That will allow the optimizer to > have a better estimate of the tail of the dis

Re: [PERFORM] Searching for the cause of a bad plan

2007-09-21 Thread Simon Riggs
On Fri, 2007-09-21 at 14:12 +0200, Csaba Nagy wrote: > On Fri, 2007-09-21 at 12:34 +0100, Simon Riggs wrote: > > On Fri, 2007-09-21 at 13:29 +0200, Csaba Nagy wrote: > > > > > > Can you plans with/without LIMIT and with/without cursor, for both b1 > > > > and b2? > > > > > > The limit is unfortun

Re: [PERFORM] Searching for the cause of a bad plan

2007-09-21 Thread Csaba Nagy
On Fri, 2007-09-21 at 12:34 +0100, Simon Riggs wrote: > On Fri, 2007-09-21 at 13:29 +0200, Csaba Nagy wrote: > > > > Can you plans with/without LIMIT and with/without cursor, for both b1 > > > and b2? > > > > The limit is unfortunately absolutely needed part of the query > > Understood, but not

Re: [PERFORM] Searching for the cause of a bad plan

2007-09-21 Thread Simon Riggs
On Fri, 2007-09-21 at 13:29 +0200, Csaba Nagy wrote: > > Can you plans with/without LIMIT and with/without cursor, for both b1 > > and b2? > > The limit is unfortunately absolutely needed part of the query Understood, but not why I asked... -- Simon Riggs 2ndQuadrant http://www.2ndQuadran

Re: [PERFORM] Searching for the cause of a bad plan

2007-09-21 Thread Csaba Nagy
On Fri, 2007-09-21 at 11:59 +0100, Simon Riggs wrote: > Please re-run everything on clean tables without frigging the stats. We > need to be able to trust what is happening is normal. I did, the plan fiddling happened after getting the plans after a fresh analyze, and I did run the plan again with

Re: [PERFORM] Searching for the cause of a bad plan

2007-09-21 Thread Simon Riggs
On Fri, 2007-09-21 at 12:03 +0200, Csaba Nagy wrote: > prepare test_001(bigint) as > SELECT tb.* > FROM table_a ta > JOIN table_b2 tb ON ta.b=tb.b > WHERE ta.a = $1 > ORDER BY ta.a, ta.b > limit 10; Please re-run everything on clean tables without frigging the stats. We need to be able to trus