Robert Haas wrote:
> On Thu, Jan 20, 2011 at 4:17 AM, Cédric Villemain
> wrote:
> >>> I think his point is that we already have a proven formula
> >>> (Mackert-Lohmann) and shouldn't be inventing a new one out of thin air.
> >>> The problem is to figure out what numbers to apply the M-L formula to
2011/1/19 Bruce Momjian :
> Robert Haas wrote:
>> On Thu, Nov 11, 2010 at 2:35 PM, Tom Lane wrote:
>> > Robert Haas writes:
> >> Yeah. For Kevin's case, it seems like we want the caching percentage
>> >> to vary not so much based on which table we're hitting at the moment
>> >> but on how much
2011/1/20 Robert Haas :
> On Thu, Jan 20, 2011 at 4:17 AM, Cédric Villemain
> wrote:
I think his point is that we already have a proven formula
(Mackert-Lohmann) and shouldn't be inventing a new one out of thin air.
The problem is to figure out what numbers to apply the M-L formula
On Thu, Jan 20, 2011 at 4:17 AM, Cédric Villemain
wrote:
>>> I think his point is that we already have a proven formula
>>> (Mackert-Lohmann) and shouldn't be inventing a new one out of thin air.
>>> The problem is to figure out what numbers to apply the M-L formula to.
>>>
>>> I've been thinking
2011/1/19 Bruce Momjian :
> Tom Lane wrote:
>> Robert Haas writes:
>> > On Fri, Nov 12, 2010 at 4:15 AM, Cédric Villemain
>> > wrote:
>> >>> I wondering if we could do something with a formula like 3 *
>> >>> amount_of_data_to_read / (3 * amount_of_data_to_read +
>> >>> effective_cache_size) = pe
Tom Lane wrote:
> Robert Haas writes:
> > On Fri, Nov 12, 2010 at 4:15 AM, Cédric Villemain
> > wrote:
> >>> I wondering if we could do something with a formula like 3 *
> >>> amount_of_data_to_read / (3 * amount_of_data_to_read +
> >>> effective_cache_size) = percentage NOT cached. That is, if
Robert Haas wrote:
> On Thu, Nov 11, 2010 at 2:35 PM, Tom Lane wrote:
> > Robert Haas writes:
> >> Yeah. For Kevin's case, it seems like we want the caching percentage
> >> to vary not so much based on which table we're hitting at the moment
> >> but on how much of it we're actually reading.
> >
Tom Lane wrote:
> Mladen Gogala writes:
> > Again, having an optimizer which will choose the plan completely
> > accurately is, at least in my opinion, less important than having a
> > possibility of manual control, the aforementioned "knobs and buttons"
> > and produce the same plan for the sa
On Sat, Nov 13, 2010 at 4:32 AM, Marc Mamin wrote:
> Hello,
>
> Just a short thought:
>
> Is it imaginable to compare the prognoses of the plans with the actual
> results
> and somehow log the worst cases ?
>
> a) to help the DBA locate bad statistics and queries
> b) as additional information sour
On Sat, Nov 13, 2010 at 1:32 AM, Marc Mamin wrote:
> Hello,
>
> Just a short thought:
>
> Is it imaginable to compare the prognoses of the plans with the actual
> results
> and somehow log the worst cases ?
>
> a) to help the DBA locate bad statistics and queries
> b) as additional information sour
Hello,
Just a short thought:
Is it imaginable to compare the prognoses of the plans with the actual
results
and somehow log the worst cases ?
a) to help the DBA locate bad statistics and queries
b) as additional information source for the planner
This could possibly affect parameters of your f
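Something close to point (a) can already be approximated by post-processing EXPLAIN (ANALYZE, FORMAT JSON) output, or by letting the contrib module auto_explain log slow plans and scanning them for large estimation errors. Below is a minimal sketch of just the comparison step, assuming the per-node estimated and actual row counts have already been extracted from the plan; the node names and numbers are invented for illustration.

/*
 * Sketch of idea (a): compare the planner's row estimates against what
 * EXPLAIN ANALYZE actually measured and flag the worst offenders.  The
 * plan nodes and numbers below are invented; in practice they would be
 * pulled out of EXPLAIN (ANALYZE, FORMAT JSON) or auto_explain output.
 */
#include <stdio.h>

typedef struct
{
    const char *node;       /* plan node description */
    double      est_rows;   /* planner's estimate */
    double      act_rows;   /* measured by EXPLAIN ANALYZE */
} NodeStats;

int
main(void)
{
    NodeStats   plan[] = {
        {"Seq Scan on big_log_table", 540000.0, 560000.0},
        {"Index Scan on repository_pk", 120.0, 98000.0},
        {"Merge Anti Join", 530000.0, 1200.0},
    };
    double      threshold = 10.0;   /* report nodes off by 10x or more */

    for (int i = 0; i < 3; i++)
    {
        double      ratio = (plan[i].act_rows + 1.0) / (plan[i].est_rows + 1.0);

        if (ratio > threshold || ratio < 1.0 / threshold)
            printf("bad estimate: %-28s est=%.0f actual=%.0f\n",
                   plan[i].node, plan[i].est_rows, plan[i].act_rows);
    }
    return 0;
}

Anything flagged this way points either at stale statistics or at a planner weakness of the kind this thread is about.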
2010/11/12 Tom Lane :
> Robert Haas writes:
>> On Fri, Nov 12, 2010 at 4:15 AM, Cédric Villemain
>> wrote:
I wondering if we could do something with a formula like 3 *
amount_of_data_to_read / (3 * amount_of_data_to_read +
effective_cache_size) = percentage NOT cached. That is, if
On Fri, Nov 12, 2010 at 11:43 AM, Tom Lane wrote:
> I think his point is that we already have a proven formula
> (Mackert-Lohmann) and shouldn't be inventing a new one out of thin air.
> The problem is to figure out what numbers to apply the M-L formula to.
I'm not sure that's really measuring th
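For reference, the Mackert-Lohmann formula under discussion is the one PostgreSQL already applies in index_pages_fetched() (costsize.c) to predict how many page fetches an index scan will need given a finite cache. A standalone sketch of the approximation, with T = pages in the relation, b = cache pages available to it, and tuples_fetched = N * selectivity; this is an illustration, not the planner's exact code, which also pro-rates effective_cache_size to choose b.

/*
 * Sketch of the Mackert-Lohman approximation used by index_pages_fetched()
 * in costsize.c.  T = pages in the relation, b = cache pages available to
 * it, tuples_fetched = N * selectivity.  Illustration only.
 */
#include <stdio.h>

static double
ml_pages_fetched(double tuples_fetched, double T, double b)
{
    double      pages_fetched;

    if (T <= b)
    {
        /* relation fits in cache: no page is read more than once */
        pages_fetched =
            (2.0 * T * tuples_fetched) / (2.0 * T + tuples_fetched);
        if (pages_fetched > T)
            pages_fetched = T;
    }
    else
    {
        /* lim = number of tuple fetches at which the cache fills up */
        double      lim = (2.0 * T * b) / (2.0 * T - b);

        if (tuples_fetched <= lim)
            pages_fetched =
                (2.0 * T * tuples_fetched) / (2.0 * T + tuples_fetched);
        else
            /* beyond that, evicted pages may have to be re-read */
            pages_fetched = b + (tuples_fetched - lim) * (T - b) / T;
    }
    return pages_fetched;
}

int
main(void)
{
    /* hypothetical: 100,000-page table, 50,000 cache pages, 1M tuple fetches */
    printf("%.0f page fetches expected\n",
           ml_pages_fetched(1000000.0, 100000.0, 50000.0));
    return 0;
}

Note that the result can exceed T once the scan thrashes the cache, because re-reads are counted; the open question in the thread is what numbers, in particular which cache size b, to feed it.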
Robert Haas writes:
> On Fri, Nov 12, 2010 at 4:15 AM, Cédric Villemain
> wrote:
>>> I wondering if we could do something with a formula like 3 *
>>> amount_of_data_to_read / (3 * amount_of_data_to_read +
>>> effective_cache_size) = percentage NOT cached. That is, if we're
>>> reading an amount
On Fri, Nov 12, 2010 at 4:15 AM, Cédric Villemain
wrote:
>> I wondering if we could do something with a formula like 3 *
>> amount_of_data_to_read / (3 * amount_of_data_to_read +
>> effective_cache_size) = percentage NOT cached. That is, if we're
>> reading an amount of data equal to effective_ca
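Plugging numbers into the heuristic quoted above shows the shape of the curve. A small sketch follows; the effective_cache_size value is hypothetical and treated as a plain page count rather than a GUC lookup.

/*
 * Illustration of the heuristic quoted above:
 *
 *     fraction_not_cached = 3*D / (3*D + effective_cache_size)
 *
 * where D is the amount of data the scan is expected to read, in the same
 * units as effective_cache_size (pages here).  A sketch of the proposal
 * only, not anything the planner does today.
 */
#include <stdio.h>

static double
fraction_not_cached(double data_to_read, double effective_cache_size)
{
    return (3.0 * data_to_read) /
           (3.0 * data_to_read + effective_cache_size);
}

int
main(void)
{
    double      ecs = 100000.0;     /* hypothetical effective_cache_size, pages */
    double      reads[] = {1000.0, 33333.0, 100000.0, 1000000.0};

    for (int i = 0; i < 4; i++)
        printf("read %7.0f pages -> assume %2.0f%% not cached\n",
               reads[i], 100.0 * fraction_not_cached(reads[i], ecs));

    /*
     * Reading effective_cache_size/3 pages comes out at 50% uncached,
     * reading the full effective_cache_size at 75%, and much larger
     * scans approach 100%.
     */
    return 0;
}

The constant 3 only sets where the 50% point falls; presumably that would be the part to tune if something along these lines were ever adopted.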
2010/11/12 Vitalii Tymchyshyn :
> 12.11.10 12:56, Cédric Villemain wrote:
>>
>> I supposed it was an answer to my mail but not sure... please keep
>> CC'ed people, it is easier to follow threads (at least for me)
>>
>
> OK
>>
>> 2010/11/12 Vitalii Tymchyshyn:
>>
>>>
>>> I'd say there are two
12.11.10 12:56, Cédric Villemain wrote:
> I supposed it was an answer to my mail but not sure... please keep
> CC'ed people, it is easier to follow threads (at least for me)
OK
> 2010/11/12 Vitalii Tymchyshyn:
>> I'd say there are two Qs here:
>> 1) Modify costs based on information on how
I supposed it was an answer to my mail but not sure... please keep
CC'ed people, it is easier to follow threads (at least for me)
2010/11/12 Vitalii Tymchyshyn :
> I'd say there are two Qs here:
>
> 1) Modify costs based on information on how much of the table is in cache.
> It would be great if
I'd say there are two Qs here:
1) Modify costs based on information on how much of the table is in
cache. It would be great if this can be done, but I'd prefer to have it
as admin knobs (because of plan stability). May be both admin and
automatic ways can be followed with some parallel (disab
2010/11/11 Robert Haas :
> On Thu, Nov 11, 2010 at 2:35 PM, Tom Lane wrote:
>> Robert Haas writes:
>>> Yeah. For Kevin's case, it seems like we want the caching percentage
>>> to vary not so much based on which table we're hitting at the moment
>>> but on how much of it we're actually reading.
>
2010/11/11 Tom Lane :
> Robert Haas writes:
>> Yeah. For Kevin's case, it seems like we want the caching percentage
>> to vary not so much based on which table we're hitting at the moment
>> but on how much of it we're actually reading.
>
> Well, we could certainly take the expected number of pag
On Thu, Nov 11, 2010 at 03:56:25PM -0500, gnuo...@rcn.com wrote:
> On a thread some time ago, on a similar subject, I opined that I missed the
> ability to assign tables to tablespaces and buffers to tablespaces, thus
> having the ability to isolate needed tables (perhaps a One True Lookup Table,
Original message
>Date: Thu, 11 Nov 2010 15:29:40 -0500
>From: pgsql-performance-ow...@postgresql.org (on behalf of Robert Haas
>)
>Subject: Re: [PERFORM] anti-join chosen even when slower than old plan
>To: Tom Lane
>Cc: Kevin Grittner, Mladen Gogala
On Thu, Nov 11, 2010 at 2:35 PM, Tom Lane wrote:
> Robert Haas writes:
>> Yeah. For Kevin's case, it seems like we want the caching percentage
>> to vary not so much based on which table we're hitting at the moment
>> but on how much of it we're actually reading.
>
> Well, we could certainly tak
On Thu, Nov 11, 2010 at 1:41 PM, Tom Lane wrote:
> "Kevin Grittner" writes:
>> Besides the "fully-scanned object size relative to relation size
>> costing adjustment" idea, the only one which seemed to be likely to
>> be useful for this sort of issue was the "costing factors by user
>> ID" idea -
On Thursday 11 November 2010 19:58:49 Tom Lane wrote:
> I wrote:
> > I do think that something based around a settable-per-table caching
> > percentage might be a reasonable way to proceed.
>
> BTW ... on reflection it seems that this would *not* solve the use-case
> Kevin described at the start o
Tom Lane wrote:
> Erm ... you can in fact do "ALTER USER SET random_page_cost"
> today.
Ouch. I'm embarrassed to have missed that. I'll do that instead of
adding those settings to the scripts, then.
Thanks for pointing that out.
-Kevin
Tom Lane wrote:
> I've tried to avoid having the planner need to know the total size
> of the database cluster, but it's kind of hard to avoid that if
> you want to model this honestly.
Agreed. Perhaps the cost could start escalating when the pages to
access hit (effective_cache_size * relati
"Kevin Grittner" writes:
> Besides the "fully-scanned object size relative to relation size
> costing adjustment" idea, the only one which seemed to be likely to
> be useful for this sort of issue was the "costing factors by user
> ID" idea -- the interactive queries hitting the well-cached portio
Robert Haas writes:
> Yeah. For Kevin's case, it seems like we want the caching percentage
> to vary not so much based on which table we're hitting at the moment
> but on how much of it we're actually reading.
Well, we could certainly take the expected number of pages to read and
compare that to
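The rest of that comparison is cut off above, so purely as an illustration of the simplest thing "compare the expected pages to the cache size" could mean (and for contrast with the smoother 3*x/(3*x + effective_cache_size) curve proposed elsewhere in the thread), a straight clamp would look like the sketch below. This is not PostgreSQL code and not necessarily what Tom had in mind.

/*
 * Naive illustration only: assume the scan is cached in proportion to how
 * much of it could fit in effective_cache_size.
 */
#include <stdio.h>

static double
assumed_cached_fraction(double pages_to_read, double effective_cache_size)
{
    if (pages_to_read <= effective_cache_size)
        return 1.0;             /* the whole scan could fit in cache */
    return effective_cache_size / pages_to_read;
}

int
main(void)
{
    double      ecs = 100000.0; /* hypothetical, in pages */

    printf("%.2f\n", assumed_cached_fraction(20000.0, ecs));   /* 1.00 */
    printf("%.2f\n", assumed_cached_fraction(400000.0, ecs));  /* 0.25 */
    return 0;
}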
I wrote:
> Besides the "fully-scanned object size relative to relation size
> costing adjustment" idea,
s/relation size/effective cache size/
-Kevin
Robert Haas writes:
> On Thu, Nov 11, 2010 at 1:23 PM, Tom Lane wrote:
>> I do think that something based around a settable-per-table caching
>> percentage might be a reasonable way to proceed. But the devil is in
>> the details, and we don't have those yet.
> I think one of the larger devils i
Tom Lane wrote:
> BTW ... on reflection it seems that this would *not* solve the
> use-case Kevin described at the start of this thread. What he's
> got AIUI is some large tables whose recent entries are well-
> cached, and a lot of queries that tend to hit that well-cached
> portion, plus a fe
On Thu, Nov 11, 2010 at 1:58 PM, Tom Lane wrote:
> I wrote:
>> I do think that something based around a settable-per-table caching
>> percentage might be a reasonable way to proceed.
>
> BTW ... on reflection it seems that this would *not* solve the use-case
> Kevin described at the start of this
On Thu, Nov 11, 2010 at 1:23 PM, Tom Lane wrote:
> Robert Haas writes:
>> Let's back up a moment and talk about what the overall goal is, here.
>> Ideally, we would like PostgreSQL to have excellent performance at all
>> times, under all circumstances, with minimal tuning. Therefore, we do
>> NO
I wrote:
> I do think that something based around a settable-per-table caching
> percentage might be a reasonable way to proceed.
BTW ... on reflection it seems that this would *not* solve the use-case
Kevin described at the start of this thread. What he's got AIUI is some
large tables whose rece
Robert Haas writes:
> Let's back up a moment and talk about what the overall goal is, here.
> Ideally, we would like PostgreSQL to have excellent performance at all
> times, under all circumstances, with minimal tuning. Therefore, we do
> NOT want to add variables that will, by design, need const
Tom Lane wrote:
> More knobs and buttons is the Oracle way,
True. Very true.
> and the end result of that
> process is that you have something as hard to use as Oracle.
Also, you end up with something which is extremely reliable and
adjustable to a variety of conditions.
> That's generally not
On Thu, Nov 11, 2010 at 10:00 AM, Kevin Grittner
wrote:
> Mladen Gogala wrote:
>
>> create a definitive bias toward one type of the execution plan.
>
> We're talking about trying to support the exact opposite. This all
> started because a database which was tuned for good response time
> for rel
On 11/11/10 9:13 AM, Mladen Gogala wrote:
> Kevin Grittner wrote:
>> Mladen Gogala wrote:
>>> create a definitive bias toward one type of the execution plan.
>> We're talking about trying to support the exact opposite.
> I understand this, that is precisely the reason for my intervention into the
> discu
Mladen Gogala writes:
> Again, having an optimizer which will choose the plan completely
> accurately is, at least in my opinion, less important than having a
> possibility of manual control, the aforementioned "knobs and buttons"
> and produce the same plan for the same statement.
More knobs
Kevin Grittner wrote:
> Mladen Gogala wrote:
>> create a definitive bias toward one type of the execution plan.
> We're talking about trying to support the exact opposite.
I understand this, that is precisely the reason for my intervention into
the discussion of experts, which I am not.
--- On Thu, 11/11/10, Mladen Gogala wrote:
> From: Mladen Gogala
> Subject: Re: [PERFORM] anti-join chosen even when slower than old plan
> To: "Kenneth Marshall"
> Cc: "Robert Haas" , "Tom Lane" ,
> "Kevin Grittner" ,
> "p
Mladen Gogala wrote:
> create a definitive bias toward one type of the execution plan.
We're talking about trying to support the exact opposite. This all
started because a database which was tuned for good response time
for relatively small queries against a "hot" portion of some tables
chose
On Thu, Nov 11, 2010 at 09:15:58AM -0500, Mladen Gogala wrote:
> Kenneth Marshall wrote:
>> I agree with the goal of avoiding the need for a GUC. This needs to
>> be as automatic as possible. One idea I had had was computing a value
>> for the amount of cache data in the system by keeping a sum or
Kenneth Marshall wrote:
> I agree with the goal of avoiding the need for a GUC. This needs to
> be as automatic as possible. One idea I had had was computing a value
> for the amount of cache data in the system by keeping a sum or a
> weighted sum of the table usage in the system. Smaller tables and
> inde
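A toy version of the bookkeeping Kenneth describes, a decayed per-table usage counter ("weighted sum of table usage") that could later be mapped to an assumed cached fraction, might look like the sketch below. Every name and constant in it is invented for illustration; nothing like this exists in the server.

/*
 * Toy sketch: exponentially decayed per-table access counter, later turned
 * into an assumed cached fraction.  All names and constants are invented.
 */
#include <stdio.h>

#define DECAY 0.9               /* weight given to history on each tick */

typedef struct
{
    const char *relname;
    double      pages;          /* size of the table, in pages */
    double      usage;          /* decayed count of recently accessed pages */
} TableUsage;

static void
note_access(TableUsage *t, double pages_touched)
{
    t->usage = t->usage * DECAY + pages_touched;
}

static double
assumed_cached_fraction(const TableUsage *t, double cache_pages,
                        double total_usage)
{
    /* give the table a cache share proportional to its share of recent usage */
    double      share = (total_usage > 0.0) ? t->usage / total_usage : 0.0;
    double      cached = share * cache_pages;

    return (cached >= t->pages) ? 1.0 : cached / t->pages;
}

int
main(void)
{
    TableUsage  hot = {"hot_recent_table", 10000.0, 0.0};
    TableUsage  cold = {"big_archive_table", 500000.0, 0.0};

    for (int tick = 0; tick < 50; tick++)
    {
        note_access(&hot, 8000.0);      /* hammered by interactive queries */
        note_access(&cold, 1000.0);     /* touched only occasionally */
    }

    double      total = hot.usage + cold.usage;

    printf("%s: assume %.0f%% cached\n", hot.relname,
           100.0 * assumed_cached_fraction(&hot, 50000.0, total));
    printf("%s: assume %.0f%% cached\n", cold.relname,
           100.0 * assumed_cached_fraction(&cold, 50000.0, total));
    return 0;
}

The decay factor does the "weighted" part: recently busy tables dominate the sum, so the small hot tables Kevin describes would end up treated as cached while the large archive tables would not.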
On Wed, Nov 10, 2010 at 10:47:21PM -0500, Robert Haas wrote:
> On Wed, Nov 10, 2010 at 6:07 PM, Tom Lane wrote:
> > "Kevin Grittner" writes:
> >> Robert Haas wrote:
> >>> Unfortunately, to know how much data we're going to grovel
> >>> through, we need to know the plan; and to decide on the righ
2010/11/11 Robert Haas
>
> But thinking over what you've written here, I'm reminded of something
> Peter said years ago, also about the optimizer. He was discussing the
> ratio of the estimated cost to the actual cost and made an off-hand
> remark that efforts had been made over the years to make
On 11/10/2010 5:43 PM, Kevin Grittner wrote:
> The only half-sane answer I've thought of is to apply a different
> cost to full-table or full-index scans based on the ratio with
> effective cache size.
The "effective_cache_size" is, in my humble opinion, a wrong method. It
would be much easier to h
On Wed, Nov 10, 2010 at 6:07 PM, Tom Lane wrote:
> "Kevin Grittner" writes:
>> Robert Haas wrote:
>>> Unfortunately, to know how much data we're going to grovel
>>> through, we need to know the plan; and to decide on the right
>>> plan, we need to know how much data we're going to grovel through
"Kevin Grittner" writes:
> Robert Haas wrote:
>> Unfortunately, to know how much data we're going to grovel
>> through, we need to know the plan; and to decide on the right
>> plan, we need to know how much data we're going to grovel through.
> And that's where they've been ending.
> The only
Robert Haas wrote:
> Wow. That's fascinating, and if you don't mind, I might mention
> this potential problem in a future talk at some point.
I don't mind at all.
> For example, in your case, it would be sufficient to estimate the
> amount of data that a given query is going to grovel throu
On Wed, Nov 10, 2010 at 10:15 AM, Kevin Grittner
wrote:
> But wait -- it turns out that this pain was self-inflicted. Based
> on heavy testing of the interactive queries which users run against
> this database we tuned the database for "fully-cached" settings,
> with both random_page_cost and _se
Grzegorz Jaśkiewicz wrote:
> you're joining on more than one key. That always hurts performance.
That's very clearly *not* the problem, as there is a plan which runs
in acceptable time but the optimizer is not choosing without being
coerced.
(1) Virtually every query we run joins on multi-col
"Kevin Grittner" writes:
> Tom Lane wrote:
>> In principle, the old-style plan ought to be equivalent to a
>> nestloop antijoin with a seqscan of DbTranLogRecord on the outside
>> and an indexscan of DbTranRepository on the inside. Can you force
>> it to choose such a plan by setting enable_merg
Tom Lane wrote:
> "Kevin Grittner" writes:
>> The semi-join and anti-join have helped us quite a bit, but we
>> have seen a situation where anti-join is chosen even though it is
>> slower than the "old fashioned" plan. I know there have been
>> other reports of this, but I just wanted to go on r
you're joining on more than one key. That always hurts performance.
"Kevin Grittner" writes:
> The semi-join and anti-join have helped us quite a bit, but we have
> seen a situation where anti-join is chosen even though it is slower
> than the "old fashioned" plan. I know there have been other reports
> of this, but I just wanted to go on record with my details.
Tom Lane wrote:
> However, you'd have to be spending a lot of time chasing through
> long HOT chains before that would happen enough to make this a
> hotspot...
That makes it all the more mysterious, then. These tables are
insert-only except for a weekly delete of one week of the oldest
data.
"Kevin Grittner" writes:
> "Kevin Grittner" wrote:
>> samples  %        symbol name
>> 2320174  33.7617  index_getnext
> I couldn't resist seeing where the time went within this function.
> Over 13.7% of the opannotate run time was on this bit of code:
> /*
> * The xmin should match the p
"Kevin Grittner" wrote:
> samples  %        symbol name
> 2320174  33.7617  index_getnext
I couldn't resist seeing where the time went within this function.
Over 13.7% of the opannotate run time was on this bit of code:
/*
* The xmin should match the previous xmax value, else chain is
The semi-join and anti-join have helped us quite a bit, but we have
seen a situation where anti-join is chosen even though it is slower
than the "old fashioned" plan. I know there have been other reports
of this, but I just wanted to go on record with my details.
The query:
delete from "DbTranLo