Tom Lane wrote:
Josh Berkus <[EMAIL PROTECTED]> writes:
Oh, and wal_buffers, the default for which we should just change if it
weren't for SHMMAX.
Uh, why? On a workload of mostly small transactions, what value is
there in lots of wal_buffers?
None. But there's also little to no harm in hav
Tom Lane,
er, your explanation is reasonable.
But at least the comment
if (newtuple != tuple) /* modified by Trigger(s) */
{
.
is likely to misdirect us.
It took me a few hours to figure it out. :-(
=== 2008-06-
"billy" <[EMAIL PROTECTED]> writes:
> I think we can add some judgment conditions in function
> plpgsql_exec_trigger() to avoid this problem.
I don't especially see the point of adding extra complexity here.
AFAICS you are talking about avoiding one or two palloc/pfree
cycles, which is surely
Hi pghackers:
version 8.3.0 in function plpgsql_exec_trigger.
in a trigger, if NEW is returned as the result and we do nothing to NEW.
for example, we have a table like this:
create table test (a int);
insert into test values(1);
and a trigger like:
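The trigger body itself is cut off in the archive; a minimal plpgsql trigger of the shape being described — one that returns NEW without modifying it — might look like this (a hypothetical reconstruction for illustration, not the original poster's code):

```sql
CREATE OR REPLACE FUNCTION test_trigger_func() RETURNS trigger AS $$
BEGIN
    -- NEW is returned unmodified; plpgsql_exec_trigger must still
    -- decide whether the trigger changed the tuple.
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER test_trigger
    BEFORE UPDATE ON test
    FOR EACH ROW EXECUTE PROCEDURE test_trigger_func();
```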
Tom Lane wrote:
This gets back to the discussions at PGCon about needing to have a more
explicit representation of partitioning. Right now, for a
many-partition table we spend huge amounts of time deriving the expected
behavior from first principles, each time we make a plan. And even then
we
Josh Berkus <[EMAIL PROTECTED]> wrote:
> I still see having 2 different settings:
>
> Synchronous: XID visibility is pushed to the master. Maintains synchronous
> failover, and users are expected to run *1* master to *1* slave for most
> installations.
>
> Asynchronous: replication stops on
On Tue, Jun 10, 2008 at 5:25 PM, Tom Lane <[EMAIL PROTECTED]> wrote:
> "Dawid Kuroczko" <[EMAIL PROTECTED]> writes:
>> As we already have four types of ALTER COLUMN .. SET STORAGE
>> { PLAIN | EXTERNAL | EXTENDED | MAIN } I would like to add
>> "COMPRESSED" which would force column compression (if
Gregory Stark <[EMAIL PROTECTED]> writes:
> It's possible that the second option I described -- teaching Append when to
> use something other than sum() -- would only work in the cases where
> constraint exclusion could be fixed though. In which case having fractional
> row counts might actually be
"Tom Lane" <[EMAIL PROTECTED]> writes:
> Gregory Stark <[EMAIL PROTECTED]> writes:
>> The screw case I've seen is when you have a large partitioned table where
>> constraint_exclusion fails to exclude the irrelevant partitions. You're going
>> to get 0 rows from all but the one partition which con
Gregory Stark <[EMAIL PROTECTED]> writes:
> The screw case I've seen is when you have a large partitioned table where
> constraint_exclusion fails to exclude the irrelevant partitions. You're going
> to get 0 rows from all but the one partition which contains the 1 row you're
> looking for. But sin
"Tom Lane" <[EMAIL PROTECTED]> writes:
> (In fact, I don't think the plan would change, in this case. The reason
> for the clamp to 1 row is to avoid foolish results for join situations.)
The screw case I've seen is when you have a large partitioned table where
constraint_exclusion fails to excl
"Nathan Boley" <[EMAIL PROTECTED]> writes:
>> (1) On what grounds do you assert the above?
> For a table with 100 non-mcv rows, the planner estimates a result
> set of cardinality 100/20 = 5, not 1.
The real problem in that situation is that you need another twenty slots
in the MCV li
>> Why? What if ( somehow ) we knew that there was a 90% chance that
>> query would return an empty result set on a big table with 20 non-mcv
>> distinct values. Currently the planner would always choose a seq scan,
>> where an index scan might be better.
>
> (1) On what grounds do you assert the
In theory, any indexed value in index (for GiST, after compression) should fit
into page at least.
So are you saying we should dedicate one page multiplied by
max_connections in shared memory? It's possible to do it that way, but
Yes, we could. Storing index keys in shared memory allows minimiz
Peter Eisentraut <[EMAIL PROTECTED]> writes:
> On Monday, 9 June 2008, Tom Lane wrote:
>> So while tagging the upcoming releases, I got annoyed once again about
>> what a tedious, error-prone bit of donkeywork it is.
> Could you explain what the problem is? Your script sounds like an ad hoc
>
On Monday, 9 June 2008, Tom Lane wrote:
> So while tagging the upcoming releases, I got annoyed once again about
> what a tedious, error-prone bit of donkeywork it is.
Could you explain what the problem is? Your script sounds like an ad hoc
workaround for some problem, but I haven't seen the p
"Nathan Boley" <[EMAIL PROTECTED]> writes:
>>> If we query on values that aren't in the table, the planner will
>>> always overestimate the expected number of returned rows because it (
>>> implicitly ) assumes that every query will return at least 1 record.
>>
>> That's intentional and should not
>> If we query on values that aren't in the table, the planner will
>> always overestimate the expected number of returned rows because it (
>> implicitly ) assumes that every query will return at least 1 record.
>
> That's intentional and should not be changed.
Why? What if ( somehow ) we knew t
"Nathan Boley" <[EMAIL PROTECTED]> writes:
> If we query on values that aren't in the table, the planner will
> always overestimate the expected number of returned rows because it (
> implicitly ) assumes that every query will return at least 1 record.
That's intentional and should not be changed.
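The estimate Nathan describes follows from how equality selectivity is computed for a value not in the MCV list, plus the clamp to one row. A simplified Python sketch of that arithmetic (the real logic lives in the planner's selectivity code; the function names here are illustrative, not PostgreSQL's):

```python
def eq_selectivity_non_mcv(n_distinct, mcv_freqs):
    """Selectivity assigned to '= const' when const is not in the MCV
    list: the non-MCV probability mass spread evenly over the
    remaining distinct values."""
    non_mcv_mass = 1.0 - sum(mcv_freqs)
    remaining = max(n_distinct - len(mcv_freqs), 1)
    return non_mcv_mass / remaining

def estimated_rows(reltuples, n_distinct, mcv_freqs):
    sel = eq_selectivity_non_mcv(n_distinct, mcv_freqs)
    # The planner clamps row estimates to at least 1, so a query on a
    # value absent from the table is never estimated at 0 rows.
    return max(1.0, reltuples * sel)

# Nathan's example: 100 non-MCV rows, 20 distinct values, no MCV
# entries tracked -> estimate is 100 * (1/20) = 5 rows, not 1.
print(estimated_rows(100, 20, []))  # 5.0
```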
>> >> > One more problem with low ndistinct values is that the condition might
>> >> > very well
>> >> > hit no rows at all. But Idea 1 will largely overestimate the number of
>> >> > hits.
>>
>> That's a good point, but I don't see a clear solution. Maybe we could
>
> I think that MCVs are the so
Gregory Stark wrote:
"Joshua D. Drake" <[EMAIL PROTECTED]> writes:
...default_statistics_target?"..."Uhh 10."
Ah, but we only ever hear about the cases where it's wrong of course. In other
words even if we raised it to some optimal value we would still have precisely
the same experience of see
On Tue, 2008-06-10 at 08:51 -0700, Nathan Boley wrote:
> >> > One more problem with low ndistinct values is that the condition might
> >> > very well
> >> > hit no rows at all. But Idea 1 will largely overestimate the number of
> >> > hits.
>
> That's a good point, but I don't see a clear solutio
Josh Berkus <[EMAIL PROTECTED]> writes:
> Oh, and wal_buffers, the default for which we should just change if it
> weren't for SHMMAX.
Uh, why? On a workload of mostly small transactions, what value is
there in lots of wal_buffers?
regards, tom lane
--
Sent via pgsql-ha
"Marc G. Fournier" <[EMAIL PROTECTED]> writes:
> Tom Lane wrote:
>> What I was thinking was just to have the script
>> print out something like
>>
>> Tagged tree as 8.3.4
>> Don't forget to run autoconf 2.59 before committing
> I like that one ...
I've checked in a script to do this --- executin
On Tuesday 10 June 2008 09:37, Josh Berkus wrote:
> Robert,
>
> > shared_buffers
> > effective_cache_size
> > default_statistics_target
> > work_mem
> > maintenance_work_mem
> > listen_addresses
> > max_connections
> > the fsm parameters
> > checkpoint_segments
> > random_page_cost
>
> My list is very s
On Tue, 2008-06-10 at 16:59 +0400, Teodor Sigaev wrote:
> > This can be solved by my proposal, but I just don't know how it would
> > apply to something like GIN, for instance. It could replace the unique
> Hmm, your proposal isn't applicable to GIN, because GIN stores a lot of keys
> for
> only
Robert,
> shared_buffers
> effective_cache_size
> default_statistics_target
> work_mem
> maintenance_work_mem
> listen_addresses
> max_connections
> the fsm parameters
> checkpoint_segments
> random_page_cost
My list is very similar, execept that I drop random_page_cost and add
synchronous_commit, au
Ron,
> I wonder if the fastest way to generate the configurator
> would be to simply ask everyone to post their tuned
> postgresql.conf files along with a brief description of
> the use case for that file. The we could group the
> use-cases into various classes; and average the values
> of the su
"Dave Page" <[EMAIL PROTECTED]> writes:
> Or just output offsets in every case :-p
Which is what the default ISO datestyle does ... I believe pg_dump
is careful to force ISO style for exactly this reason.
regards, tom lane
Alvaro Herrera <[EMAIL PROTECTED]> writes:
> ... A more useful restriction would be to only output
> those that are in the set of input-acceptable abbreviations, but perhaps
> this is not easy to implement.
I think that is actually what Dave is suggesting, but I don't really
agree with it. To me
All,
> > For the slave to not interfere with the master at all, we would need to
> > delay application of WAL files on each slave until visibility on that
> > slave allows the WAL to be applied, but in that case we would have
> > long-running transactions delay data visibility of all slave session
On Tue, Jun 10, 2008 at 4:51 PM, Alvaro Herrera
<[EMAIL PROTECTED]> wrote:
> Dave Page wrote:
>> Right, but shouldn't we always output something we know we can read
>> back in (unambiguously), assuming a server with no user defined
>> abbreviations?
>
> That makes no sense because it amounts to say
Dave Page wrote:
> On Tue, Jun 10, 2008 at 4:37 PM, Tom Lane <[EMAIL PROTECTED]> wrote:
> > "Dave Page" <[EMAIL PROTECTED]> writes:
> >> It seems like a bug that we happily output PKST as a timezone (in a
> >> 'timestamp with time zone'), but won't accept it back in.
> >
> > [ shrug... ] The set o
>> > One more problem with low ndistinct values is that the condition might
>> > very well
>> > hit no rows at all. But Idea 1 will largely overestimate the number of
>> > hits.
That's a good point, but I don't see a clear solution. Maybe we could
look at past queries
and keep track of how often
"Dave Page" <[EMAIL PROTECTED]> writes:
> On Tue, Jun 10, 2008 at 4:37 PM, Tom Lane <[EMAIL PROTECTED]> wrote:
>> [ shrug... ] The set of timezone abbrevs recognized on input is
>> user-configurable, so that situation will always be possible.
> Right, but shouldn't we always output something we k
On Tue, Jun 10, 2008 at 4:37 PM, Tom Lane <[EMAIL PROTECTED]> wrote:
> "Dave Page" <[EMAIL PROTECTED]> writes:
>> It seems like a bug that we happily output PKST as a timezone (in a
>> 'timestamp with time zone'), but won't accept it back in.
>
> [ shrug... ] The set of timezone abbrevs recognized
"Dave Page" <[EMAIL PROTECTED]> writes:
> It seems like a bug that we happily output PKST as a timezone (in a
> 'timestamp with time zone'), but won't accept it back in.
[ shrug... ] The set of timezone abbrevs recognized on input is
user-configurable, so that situation will always be possible.
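Tom's point elsewhere in this thread is that the default ISO DateStyle avoids the problem by printing a numeric offset, which always parses back in. An illustrative session (output styles sketched, not captured from a live server):

```sql
SET timezone = 'Asia/Karachi';
SET datestyle = 'ISO';       -- the default
SELECT now();                -- prints a numeric offset, e.g. ...+06,
                             -- which is unambiguous on input
SET datestyle = 'Postgres';
SELECT now();                -- may print an abbreviation like PKST,
                             -- which only parses if it is in the
                             -- configured input abbreviation list
```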
Greg,
> The analogous case in our situation is not having 300 million distinct
> values, since we're not gathering info on specific values, only the
> buckets. We need, for example, 600 samples *for each bucket*. Each bucket
> is chosen to have the same number of samples in it. So that means that
On 6/10/08, Tom Lane <[EMAIL PROTECTED]> wrote:
> "Merlin Moncure" <[EMAIL PROTECTED]> writes:
> > IMO, support for binary is critical. Because of the interplay of the
> > array and composite out formats, the number of backslashes grows
> > exponentially (!) with nesting levels. This makes text f
Dawid Kuroczko wrote:
> Currently the TOASTing code does its magic when whole tuple is
> larger than TOAST_TUPLE_TARGET which happens to be around 2KB.
>
> There are times though when one is willing to trade using (fast) CPU to
> reduce amount of (slow) I/O. A data warehousing types of worklo
"Dawid Kuroczko" <[EMAIL PROTECTED]> writes:
> As we already have four types of ALTER COLUMN .. SET STORAGE
> { PLAIN | EXTERNAL | EXTENDED | MAIN } I would like to add
> "COMPRESSED" which would force column compression (if column is
> smaller than some minimum, I guess somewhere between 16 and 3
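For reference, the existing storage modes the proposal would extend (COMPRESSED is the option proposed in this thread, not existing syntax):

```sql
-- Existing storage modes:
ALTER TABLE items ALTER COLUMN payload SET STORAGE PLAIN;     -- inline, never compressed or toasted
ALTER TABLE items ALTER COLUMN payload SET STORAGE EXTERNAL;  -- out-of-line allowed, no compression
ALTER TABLE items ALTER COLUMN payload SET STORAGE EXTENDED;  -- compression, then out-of-line (the default)
ALTER TABLE items ALTER COLUMN payload SET STORAGE MAIN;      -- compression, out-of-line only as last resort

-- Proposed addition (hypothetical syntax from this thread):
-- ALTER TABLE items ALTER COLUMN payload SET STORAGE COMPRESSED;
```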
"Merlin Moncure" <[EMAIL PROTECTED]> writes:
> On 6/8/08, Andrew Dunstan <[EMAIL PROTECTED]> wrote:
>> Tom Lane wrote:
>>> Are you intending that these operations support both text and binary
>>> results?
>>
>> I'm a bit open on that.
> IMO, support for binary is critical. Because of the interpl
On 6/8/08, Andrew Dunstan <[EMAIL PROTECTED]> wrote:
> Tom Lane wrote:
> > Andrew Dunstan <[EMAIL PROTECTED]> writes:
> > > One complicating factor I see is that there is no protocol level support
> for anything other than simple objects - each data value is simply a stream
> of bytes of a known le
hello, everyone:
version 8.3.0 in function plpgsql_exec_trigger.
in a trigger, if NEW is returned as the result and we do nothing to NEW.
for example, we have a table like this:
create table test (a int);
insert into test values(1);
and a trigge
One of our guys in Pakistan noticed a problem with Slony that seems to
have manifested itself since the last zic update. Slony uses
timeofday() as the default value for a timestamp column:
-- Executing query:
SET timezone='Asia/Karachi';
SELECT timeofday()::timestamp with time zone;
ERROR: inval
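The failure comes from round-tripping timeofday()'s text output, which ends in an abbreviation like "PKST", back through the timestamptz input parser. Functions that return timestamptz directly sidestep the abbreviation parsing entirely (a sketch; both functions exist in 8.3):

```sql
SET timezone = 'Asia/Karachi';

-- Fails when 'PKST' is not in the input abbreviation list:
-- SELECT timeofday()::timestamp with time zone;

-- These return timestamptz directly, so no text parsing is involved:
SELECT now();               -- transaction start time
SELECT clock_timestamp();   -- wall-clock time, advances within a transaction
```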
Dawid Kuroczko wrote:
Hello!
Currently the TOASTing code does its magic when whole tuple is
larger than TOAST_TUPLE_TARGET which happens to be around 2KB.
There are times though when one is willing to trade using (fast) CPU to
reduce amount of (slow) I/O. A data warehousing types of worklo
This can be solved by my proposal, but I just don't know how it would
apply to something like GIN, for instance. It could replace the unique
Hmm, your proposal isn't applicable to GIN, because GIN stores a lot of keys for
only one value to be indexed.
being inserted by other concurrent transac
I would like to consider adding constraints to GiST indexes. I think it
is possible to add constraints that are more sophisticated than just
UNIQUE. My use case is a non-overlapping constraint, but I think it's
possible for this to be more general.
Sounds good
The idea is to make an array in sh
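The non-overlap use case described here is essentially what later PostgreSQL releases exposed as exclusion constraints (EXCLUDE, 9.0+, shown below with 9.2's range types) — offered as a point of comparison, not what this thread implemented:

```sql
-- Non-overlapping reservations enforced via a GiST-backed
-- exclusion constraint (requires the btree_gist extension
-- for the integer equality operator):
CREATE TABLE reservation (
    room   int,
    during tsrange,
    EXCLUDE USING gist (room WITH =, during WITH &&)
);
```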
Hello!
Currently the TOASTing code does its magic when whole tuple is
larger than TOAST_TUPLE_TARGET which happens to be around 2KB.
There are times though when one is willing to trade using (fast) CPU to
reduce amount of (slow) I/O. A data warehousing types of workload
most notably. Rarely use
> > I think for low ndistinct values we will want to know the exact
> > value + counts and not a bin. So I think we will want
> additional stats rows
> > that represent "value 'a1' stats".
>
> Isn't that what our most frequent values list does?
Maybe? Do we have the relevant stats for each?
But
"Zeugswetter Andreas OSB sIT" <[EMAIL PROTECTED]> writes:
> I think for low ndistinct values we will want to know the exact
> value + counts and not a bin. So I think we will want additional stats rows
> that represent "value 'a1' stats".
Isn't that what our most frequent values list does?
"Josh Berkus" <[EMAIL PROTECTED]> writes:
> Greg,
>
>> Speak to the statisticians. Our sample size is calculated using the same
>> theory behind polls which sample 600 people to learn what 250 million
>> people are going to do on election day. You do NOT need (significantly)
>> larger samples for
Gregory Stark wrote:
Instead of zeroing bytes and depending on compression why not just pass an
extra parameter to the archive command with the offset to the logical end of
data.
Because the archiver process doesn't have that information.
--
Heikki Linnakangas
EnterpriseDB http://www.ent
"Greg Smith" <[EMAIL PROTECTED]> writes:
> On Mon, 9 Jun 2008, Tom Lane wrote:
>
>> It should also be pointed out that the whole thing becomes uninteresting
>> if we get real-time log shipping implemented. So I see absolutely no
>> point in spending time integrating pg_clearxlogtail now.
>
> Ther
Tom Lane wrote:
So you are saying it is de-toasted 32880 times, in this case? If not,
where are the repeated de-toastings happening?
Inside the index support functions. I'm thinking we could fix this by
forcibly detoasting values passed as index scan keys, but it's not quite
clear where's the
> Obviously we run into problems when
> a) we have a poor estimate for ndistinct - but then we have
> worse problems
> b) our length measure doesn't correspond well with ndistinct
> in an interval
One more problem with low ndistinct values is that the condition might very well
hit no rows at all.
Tom Lane wrote:
Well, yeah, because the first thing it does is pg_detoast_datum.
Just as a cross-check, try changing it to copy the value without
forcibly detoasting --- I'll bet it's still slow then.
Yeah, that appears to be exactly the case. After some grepping through
various header files