Re: [PERFORM] postmaster consuming /lots/ of memory with hash aggregate. why?

2010-11-11 Thread Pavel Stehule
2010/11/12 Jon Nelson : > On Thu, Nov 11, 2010 at 10:26 PM, Pavel Stehule > wrote: >> Hello >> >> look on EXPLAIN ANALYZE command. Probably your statistic are out, and >> then planner can be confused. EXPLAIN ANALYZE statement show it. > > As I noted earlier, I did set statistics to 1000 an re-ra

Re: [PERFORM] postmaster consuming /lots/ of memory with hash aggregate. why?

2010-11-11 Thread Jon Nelson
On Thu, Nov 11, 2010 at 10:26 PM, Pavel Stehule wrote: > Hello > > look on EXPLAIN ANALYZE command. Probably your statistic are out, and > then planner can be confused. EXPLAIN ANALYZE statement show it. As I noted earlier, I did set statistics to 1000 an re-ran vacuum analyze and the plan did no

Re: [PERFORM] postmaster consuming /lots/ of memory with hash aggregate. why?

2010-11-11 Thread Pavel Stehule
Hello look on EXPLAIN ANALYZE command. Probably your statistic are out, and then planner can be confused. EXPLAIN ANALYZE statement show it. Regards Pavel Stehule 2010/11/12 Jon Nelson : > On Fri, Nov 5, 2010 at 7:26 PM, Jon Nelson wrote: >> I have a postgres 8.4.5 instance on CentOS 5 (x86_64

Re: [PERFORM] postmaster consuming /lots/ of memory with hash aggregate. why?

2010-11-11 Thread Jon Nelson
On Fri, Nov 5, 2010 at 7:26 PM, Jon Nelson wrote: > I have a postgres 8.4.5 instance on CentOS 5 (x86_64) which appears to > go crazy with the amount of memory it consumes. > When I run the query below, in a matter of a few seconds memory > balloons to 5.3G (virtual), 4.6G (resident) and 1840 (sha

Re: [PERFORM] equivalent queries lead to different query plans for self-joins with group by?

2010-11-11 Thread Ben
appreciate the instant response. > Well, arguably it's not doing the right thing either way --- you'd sort > of like the inequalities to get pushed down into both of the join > inputs, not just one of them. PG doesn't make that deduction though; > it can make such inferences for equalities, but i

Re: [PERFORM] equivalent queries lead to different query plans for self-joins with group by?

2010-11-11 Thread Tom Lane
Ben writes: > us_quotedb=# explain select ts, id, val - aval from foo join (select ts, > avg(val) as aval from foo group by ts) as a using (ts) where ts > > '2010-11-11' and ts < '2010-11-13'; > >

[PERFORM] equivalent queries lead to different query plans for self-joins with group by?

2010-11-11 Thread Ben
dear pgers -- consider the following toy example (using pg 8.4.3) : create temporary table foo ( ts timestamp not null, id integer not null, val double precision not null, primary key (ts, id) ); i might want to return the vals, minus the averages at each timestamp. the obvious self-jo

Re: [PERFORM] anti-join chosen even when slower than old plan

2010-11-11 Thread Kenneth Marshall
On Thu, Nov 11, 2010 at 03:56:25PM -0500, gnuo...@rcn.com wrote: > On a thread some time ago, on a similar subject, I opined that I missed the > ability to assign tables to tablespaces and buffers to tablespaces, thus > having the ability to isolate needed tables (perhaps a One True Lookup Table,

Re: [PERFORM] anti-join chosen even when slower than old plan

2010-11-11 Thread gnuoytr
Original message >Date: Thu, 11 Nov 2010 15:29:40 -0500 >From: pgsql-performance-ow...@postgresql.org (on behalf of Robert Haas >) >Subject: Re: [PERFORM] anti-join chosen even when slower than old plan >To: Tom Lane >Cc: Kevin Grittner ,Mladen Gogala >,"pgsql-performance@postgresql.

Re: [PERFORM] anti-join chosen even when slower than old plan

2010-11-11 Thread Robert Haas
On Thu, Nov 11, 2010 at 2:35 PM, Tom Lane wrote: > Robert Haas writes: >> Yeah.  For Kevin's case, it seems like we want the caching percentage >> to vary not so much based on which table we're hitting at the moment >> but on how much of it we're actually reading. > > Well, we could certainly tak

Re: [PERFORM] anti-join chosen even when slower than old plan

2010-11-11 Thread Jon Nelson
On Thu, Nov 11, 2010 at 1:41 PM, Tom Lane wrote: > "Kevin Grittner" writes: >> Besides the "fully-scanned object size relative to relation size >> costing adjustment" idea, the only one which seemed to be likely to >> be useful for this sort of issue was the "costing factors by user >> ID" idea -

Re: [PERFORM] anti-join chosen even when slower than old plan

2010-11-11 Thread Andres Freund
On Thursday 11 November 2010 19:58:49 Tom Lane wrote: > I wrote: > > I do think that something based around a settable-per-table caching > > percentage might be a reasonable way to proceed. > > BTW ... on reflection it seems that this would *not* solve the use-case > Kevin described at the start o

Re: [PERFORM] CREATE INDEX as bottleneck

2010-11-11 Thread Marc Mamin
No, CONCURRENTLY is to improve table availability during index creation, but it degrades the performances. best regards, Marc Mamin -Original Message- From: Alex Hunsaker [mailto:bada...@gmail.com] Sent: Donnerstag, 11. November 2010 19:55 To: Marc Mamin Cc: pgsql-performance@postgres

Re: [PERFORM] anti-join chosen even when slower than old plan

2010-11-11 Thread Kevin Grittner
Tom Lane wrote: > Erm ... you can in fact do "ALTER USER SET random_page_cost" > today. Ouch. I'm embarrassed to have missed that. I'll do that instead of adding those settings to the scripts, then. Thanks for pointing that out. -Kevin -- Sent via pgsql-performance mailing list (pgsql-

Re: [PERFORM] anti-join chosen even when slower than old plan

2010-11-11 Thread Kevin Grittner
Tom Lane wrote: > I've tried to avoid having the planner need to know the total size > of the database cluster, but it's kind of hard to avoid that if > you want to model this honestly. Agreed. Perhaps the cost could start escalating when the pages to access hit (effective_cache_size * relati

Re: [PERFORM] anti-join chosen even when slower than old plan

2010-11-11 Thread Tom Lane
"Kevin Grittner" writes: > Besides the "fully-scanned object size relative to relation size > costing adjustment" idea, the only one which seemed to be likely to > be useful for this sort of issue was the "costing factors by user > ID" idea -- the interactive queries hitting the well-cached portio

Re: [PERFORM] anti-join chosen even when slower than old plan

2010-11-11 Thread Tom Lane
Robert Haas writes: > Yeah. For Kevin's case, it seems like we want the caching percentage > to vary not so much based on which table we're hitting at the moment > but on how much of it we're actually reading. Well, we could certainly take the expected number of pages to read and compare that to

Re: [PERFORM] anti-join chosen even when slower than old plan

2010-11-11 Thread Kevin Grittner
I wrote: > Besides the "fully-scanned object size relative to relation size > costing adjustment" idea, s/relation size/effective cache size/ -Kevin -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/

Re: [PERFORM] anti-join chosen even when slower than old plan

2010-11-11 Thread Tom Lane
Robert Haas writes: > On Thu, Nov 11, 2010 at 1:23 PM, Tom Lane wrote: >> I do think that something based around a settable-per-table caching >> percentage might be a reasonable way to proceed.  But the devil is in >> the details, and we don't have those yet. > I think one of the larger devils i

Re: [PERFORM] anti-join chosen even when slower than old plan

2010-11-11 Thread Kevin Grittner
Tom Lane wrote: > BTW ... on reflection it seems that this would *not* solve the > use-case Kevin described at the start of this thread. What he's > got AIUI is some large tables whose recent entries are well- > cached, and a lot of queries that tend to hit that well-cached > portion, plus a fe

Re: [PERFORM] anti-join chosen even when slower than old plan

2010-11-11 Thread Robert Haas
On Thu, Nov 11, 2010 at 1:58 PM, Tom Lane wrote: > I wrote: >> I do think that something based around a settable-per-table caching >> percentage might be a reasonable way to proceed. > > BTW ... on reflection it seems that this would *not* solve the use-case > Kevin described at the start of this

Re: [PERFORM] anti-join chosen even when slower than old plan

2010-11-11 Thread Robert Haas
On Thu, Nov 11, 2010 at 1:23 PM, Tom Lane wrote: > Robert Haas writes: >> Let's back up a moment and talk about what the overall goal is, here. >> Ideally, we would like PostgreSQL to have excellent performance at all >> times, under all circumstances, with minimal tuning.  Therefore, we do >> NO

Re: [PERFORM] anti-join chosen even when slower than old plan

2010-11-11 Thread Tom Lane
I wrote: > I do think that something based around a settable-per-table caching > percentage might be a reasonable way to proceed. BTW ... on reflection it seems that this would *not* solve the use-case Kevin described at the start of this thread. What he's got AIUI is some large tables whose rece

Re: [PERFORM] CREATE INDEX as bottleneck

2010-11-11 Thread Alex Hunsaker
On Thu, Nov 11, 2010 at 06:41, Marc Mamin wrote: > There are a few places in our data flow where we have to wait for index > creation before being able to distribute the process on multiple threads > again. Would CREATE INDEX CONCURRENTLY help here? -- Sent via pgsql-performance mailing list (p

Re: [PERFORM] anti-join chosen even when slower than old plan

2010-11-11 Thread Tom Lane
Robert Haas writes: > Let's back up a moment and talk about what the overall goal is, here. > Ideally, we would like PostgreSQL to have excellent performance at all > times, under all circumstances, with minimal tuning. Therefore, we do > NOT want to add variables that will, by design, need const

Re: [PERFORM] anti-join chosen even when slower than old plan

2010-11-11 Thread Mladen Gogala
Tom Lane wrote: More knobs and buttons is the Oracle way, True. Very true. and the end result of that process is that you have something as hard to use as Oracle. Also, you end up with something which is extremely reliable and adjustable to variety of conditions. That's generally not

Re: [PERFORM] anti-join chosen even when slower than old plan

2010-11-11 Thread Robert Haas
On Thu, Nov 11, 2010 at 10:00 AM, Kevin Grittner wrote: > Mladen Gogala wrote: > >> create a definitive bias toward one type of the execution plan. > > We're talking about trying to support the exact opposite.  This all > started because a database which was tuned for good response time > for rel

Re: [PERFORM] anti-join chosen even when slower than old plan

2010-11-11 Thread Craig James
On 11/11/10 9:13 AM, Mladen Gogala wrote: Kevin Grittner wrote: Mladen Gogala wrote: create a definitive bias toward one type of the execution plan. We're talking about trying to support the exact opposite. I understand this, that is precisely the reason for my intervention into the discu

Re: [PERFORM] anti-join chosen even when slower than old plan

2010-11-11 Thread Tom Lane
Mladen Gogala writes: > Again, having an optimizer which will choose the plan completely > accurately is, at least in my opinion, less important than having a > possibility of manual control, the aforementioned "knobs and buttons" > and produce the same plan for the same statement. More knobs

Re: [PERFORM] anti-join chosen even when slower than old plan

2010-11-11 Thread Mladen Gogala
Kevin Grittner wrote: Mladen Gogala wrote: create a definitive bias toward one type of the execution plan. We're talking about trying to support the exact opposite. I understand this, that is precisely the reason for my intervention into the discussion of experts, which I am not.

Re: [PERFORM] anti-join chosen even when slower than old plan

2010-11-11 Thread Bob Lunney
--- On Thu, 11/11/10, Mladen Gogala wrote: > From: Mladen Gogala > Subject: Re: [PERFORM] anti-join chosen even when slower than old plan > To: "Kenneth Marshall" > Cc: "Robert Haas" , "Tom Lane" , > "Kevin Grittner" , > "pgsql-performance@postgresql.org" > Date: Thursday, November 11, 2010

Re: [PERFORM] anti-join chosen even when slower than old plan

2010-11-11 Thread Kevin Grittner
Mladen Gogala wrote: > create a definitive bias toward one type of the execution plan. We're talking about trying to support the exact opposite. This all started because a database which was tuned for good response time for relatively small queries against a "hot" portion of some tables chose

Re: [PERFORM] anti-join chosen even when slower than old plan

2010-11-11 Thread Kenneth Marshall
On Thu, Nov 11, 2010 at 09:15:58AM -0500, Mladen Gogala wrote: > Kenneth Marshall wrote: >> I agree with the goal of avoiding the need for a GUC. This needs to >> be as automatic as possible. One idea I had had was computing a value >> for the amount of cache data in the system by keeping a sum or

Re: [PERFORM] anti-join chosen even when slower than old plan

2010-11-11 Thread Mladen Gogala
Kenneth Marshall wrote: I agree with the goal of avoiding the need for a GUC. This needs to be as automatic as possible. One idea I had had was computing a value for the amount of cache data in the system by keeping a sum or a weighted sum of the table usage in the system. Smaller tables and inde

Re: [PERFORM] CREATE INDEX as bottleneck

2010-11-11 Thread Kenneth Marshall
On Thu, Nov 11, 2010 at 02:41:12PM +0100, Marc Mamin wrote: > Hello, > > in the last years, we have successfully manage to cope with our data > growth > using partitioning and splitting large aggregation tasks on multiple > threads. > The partitioning is done logically by our applicationn server,

Re: [PERFORM] anti-join chosen even when slower than old plan

2010-11-11 Thread Kenneth Marshall
On Wed, Nov 10, 2010 at 10:47:21PM -0500, Robert Haas wrote: > On Wed, Nov 10, 2010 at 6:07 PM, Tom Lane wrote: > > "Kevin Grittner" writes: > >> Robert Haas wrote: > >>> Unfortunately, to know how much data we're going to grovel > >>> through, we need to know the plan; and to decide on the righ

[PERFORM] CREATE INDEX as bottleneck

2010-11-11 Thread Marc Mamin
Hello, in the last years, we have successfully manage to cope with our data growth using partitioning and splitting large aggregation tasks on multiple threads. The partitioning is done logically by our applicationn server, thus avoiding trigger overhead. There are a few places in our data flow

Re: [PERFORM] Why dose the planner select one bad scan plan.

2010-11-11 Thread tv
> But I doubt your answer. I think the essence of the problem is when the > planner selects 'Bitmap Index Scan' and how the planner computes the cost > of 'Bitmap Index Scan'. The essence of the problem obviously is a bad estimate of the cost. The planner considers the two plans, computes the cost

Re: [PERFORM] anti-join chosen even when slower than old plan

2010-11-11 Thread Віталій Тимчишин
2010/11/11 Robert Haas > > But thinking over what you've written here, I'm reminded of something > Peter said years ago, also about the optimizer. He was discussed the > ratio of the estimated cost to the actual cost and made an off-hand > remark that efforts had been made over the years to make