Re: [HACKERS] Improving N-Distinct estimation by ANALYZE

2006-01-04 Thread Josh Berkus
Folks, Nope, it's definitely proportional. As a simple example, a sample of 500 rows in a table of 1000 rows should yeild stats estimates with 90%+ accuracy. But a sample of 500 rows in a 600,000,000 row table is so small as to be nearly useless; it's quite possible to get all the same val

Re: [HACKERS] Improving N-Distinct estimation by ANALYZE

2006-01-04 Thread Josh Berkus
Greg, Only if your sample is random and independent. The existing mechanism tries fairly hard to ensure that every record has an equal chance of being selected. If you read the entire block and not appropriate samples then you'll introduce systematic sampling errors. For example, if you read an

Re: [HACKERS] Improving N-Distinct estimation by ANALYZE

2006-01-04 Thread Josh Berkus
Trent, Sorry to interupt. The discussion is interesting, but I need some help to follow along. Thought-out commentary is welcome. Is "replace the algorithm" the same as saying "contextually use some estimate of D that is not Chaudhuri? Yes. I favor a block-based approach like Brutlag, l

Re: [HACKERS] catalog corruption bug

2006-01-04 Thread Jeremy Drake
On Wed, 21 Dec 2005, Tom Lane wrote: > Jeremy Drake <[EMAIL PROTECTED]> writes: > > We have encountered a very nasty but apparently rare bug which appears to > > result in catalog corruption. > > How much of this can you reproduce on 8.1.1? We've fixed a few issues > already. We did not see this

Re: [HACKERS] Improving N-Distinct estimation by ANALYZE

2006-01-04 Thread Greg Stark
Simon Riggs <[EMAIL PROTECTED]> writes: > The approach I suggested uses the existing technique for selecting > random blocks, then either an exhaustive check on all of the rows in a > block or the existing random row approach, depending upon available > memory. We need to check all of the rows in

Re: [HACKERS] Heads up: upcoming back-branch re-releases

2006-01-04 Thread Tom Lane
Kris Jurka <[EMAIL PROTECTED]> writes: > Not a last minute issue or a big deal, but I see no reason for this patch > not to be applied to back branches. > http://archives.postgresql.org/pgsql-patches/2005-12/msg00128.php Well, it hasn't been applied to the *front* branch yet ... but I'll take a l

Re: [HACKERS] Improving "missing FROM-clause entry" message

2006-01-04 Thread Tom Lane
Bruce Momjian writes: > Do we have enough time to test the patch before the minor releases? Sure, it's not like it raises any portability issues. As long as it gives a better error message than before in some common cases, it'll be a step forward, even if we think of further improvements later.

Re: [HACKERS] Improving "missing FROM-clause entry" message

2006-01-04 Thread Bruce Momjian
Tom Lane wrote: > Robert Treat <[EMAIL PROTECTED]> writes: > > No objections here but since I don't use a foreign lang I figure my vote > > doesn't really matter. I was wondering though if it would be resonable to > > try > > and get some language updates into the patch release? > > With the cur

Re: [HACKERS] Heads up: upcoming back-branch re-releases

2006-01-04 Thread Andrew Dunstan
Tom Lane said: > Just FYI, the core committee has agreed we need re-releases to fix the > locale environment issue and other recent bug fixes. Current thought > is to wrap tarballs tomorrow (Thursday) evening, North American eastern > time, with public announcement scheduled for Sunday evening or

Re: [HACKERS] Heads up: upcoming back-branch re-releases

2006-01-04 Thread Kris Jurka
On Wed, 4 Jan 2006, Tom Lane wrote: Any last-minute issues out there? Not a last minute issue or a big deal, but I see no reason for this patch not to be applied to back branches. http://archives.postgresql.org/pgsql-patches/2005-12/msg00128.php It fixes this problem: http://archives.p

Re: [HACKERS] QNX mention in tuplesort.c

2006-01-04 Thread Tom Lane
Bruce Momjian writes: > There is a mention of QNX in tuplesort.c. Can that code segment be > removed now that QNX is removed? I'd leave it there. We don't have any certainty that the behavior being worked around exists only on QNX4, and it's not like it's costing a lot.

Re: [HACKERS] Improving "missing FROM-clause entry" message

2006-01-04 Thread Tom Lane
Robert Treat <[EMAIL PROTECTED]> writes: > No objections here but since I don't use a foreign lang I figure my vote > doesn't really matter. I was wondering though if it would be resonable to try > and get some language updates into the patch release? With the current re-release plans it'd take a

Re: [HACKERS] Improving "missing FROM-clause entry" message

2006-01-04 Thread Robert Treat
On Wednesday 04 January 2006 20:37, Tom Lane wrote: > A reasonable objection to either Plan A or Plan C is that it will add > error strings that are not currently in the translation message files; > which wouldn't matter for a HEAD-only patch, but I'd really like to > back-patch this into 8.1. Pla

[HACKERS] QNX mention in tuplesort.c

2006-01-04 Thread Bruce Momjian
There is a mention of QNX in tuplesort.c. Can that code segment be removed now that QNX is removed? -- Bruce Momjian| http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ

Re: [HACKERS] Improving "missing FROM-clause entry" message

2006-01-04 Thread Tom Lane
I wrote: >> I'm thinking about whether we can't improve the message for "missing >> FROM-clause entry" to somehow account for situations where the table >> does exist in the query but it's referenced from an improper place, >> ... > On further investigation, this is arguably a regression in 8.1. >

Re: [HACKERS] Improving N-Distinct estimation by ANALYZE

2006-01-04 Thread Trent Shipley
Sorry to interupt. The discussion is interesting, but I need some help to follow along. On Wednesday 2006-01-04 17:07, Josh Berkus wrote: > Simon, > > > - Are there any performance issues that can be directly attributed to > > mis-estimation of N-Distinct ("D") by the ANALYZE command? > > Yes.

[HACKERS] Heads up: upcoming back-branch re-releases

2006-01-04 Thread Tom Lane
Just FYI, the core committee has agreed we need re-releases to fix the locale environment issue and other recent bug fixes. Current thought is to wrap tarballs tomorrow (Thursday) evening, North American eastern time, with public announcement scheduled for Sunday evening or Monday. The longer-than

Re: [HACKERS] Improving N-Distinct estimation by ANALYZE

2006-01-04 Thread Simon Riggs
On Wed, 2006-01-04 at 19:22 -0500, Greg Stark wrote: > I think you're right that a reasonable sample size for this kind of > estimate > is going to be proportional to the table size, not the constant sized > sample > that regular statistics need. Agreed [I said exactly that in April]; the counter

Re: [HACKERS] Improving N-Distinct estimation by ANALYZE

2006-01-04 Thread Simon Riggs
On Wed, 2006-01-04 at 17:57 -0600, Jim C. Nasby wrote: > On Wed, Jan 04, 2006 at 07:10:29PM +, Simon Riggs wrote: > > 3. We should also apply multi-column heuristics to the estimation of D, > > once we have estimated all columns. For column groups (pairs, triples > > etc) that form part of a PK

Re: [HACKERS] Improving N-Distinct estimation by ANALYZE

2006-01-04 Thread Greg Stark
Josh Berkus writes: > Tom, > > > In general, estimating n-distinct from a sample is just plain a hard > > problem, and it's probably foolish to suppose we'll ever be able to > > do it robustly. What we need is to minimize the impact when we get > > it wrong. > > Well, I think it's pretty wel

Re: [HACKERS] Improving N-Distinct estimation by ANALYZE

2006-01-04 Thread Simon Riggs
On Wed, 2006-01-04 at 14:49 -0500, Tom Lane wrote: > Simon Riggs <[EMAIL PROTECTED]> writes: > > [ ... a large amount of analysis based on exactly one test case ... ] [Hmmm, those are your opinions, not my words. Funny guy ;-) ] The one test case just happens to be a very common 1:M relationship,

Re: [HACKERS] Improving N-Distinct estimation by ANALYZE

2006-01-04 Thread Josh Berkus
Simon, > - Are there any performance issues that can be directly attributed to > mis-estimation of N-Distinct ("D") by the ANALYZE command? Yes. There's at least one query (maybe two) from TPC-H which bombs because of bad N-distinct estimation, even with stats_target =1000. Based on my exper

Re: [HACKERS] Improving N-Distinct estimation by ANALYZE

2006-01-04 Thread Jim C. Nasby
On Wed, Jan 04, 2006 at 07:10:29PM +, Simon Riggs wrote: > 3. We should also apply multi-column heuristics to the estimation of D, > once we have estimated all columns. For column groups (pairs, triples > etc) that form part of a PK, we know that it must be true that D1 * > D2 ... Dk >= N. In m

[HACKERS] back-patching locale environment fix

2006-01-04 Thread Tom Lane
I'm looking at back-patching these fixes into the release branches: http://archives.postgresql.org/pgsql-committers/2005-12/msg00479.php http://archives.postgresql.org/pgsql-committers/2006-01/msg00017.php This is mostly pretty straightforward, except that the patch depends on having unsetenv(), w

Re: [HACKERS] Improving N-Distinct estimation by ANALYZE

2006-01-04 Thread Josh Berkus
Tom, > In general, estimating n-distinct from a sample is just plain a hard > problem, and it's probably foolish to suppose we'll ever be able to > do it robustly. What we need is to minimize the impact when we get > it wrong. Well, I think it's pretty well proven that to be accurate at all yo

Re: [HACKERS] Vacuum Blocking A Deleteion - Why?

2006-01-04 Thread Tom Lane
Brad Nicholson <[EMAIL PROTECTED]> writes: > We have been having a problem with a long running vacuum on a table > blocking deletes. We have a log table on which the only activity is an > on going series of deletions. The problem is experienced on a slony > replicated node, but I don't think

[HACKERS] Vacuum Blocking A Deleteion - Why?

2006-01-04 Thread Brad Nicholson
We have been having a problem with a long running vacuum on a table blocking deletes. We have a log table on which the only activity is an on going series of deletions. The problem is experienced on a slony replicated node, but I don't think that slony is at fault here. My question - why is

Re: [HACKERS] psql & readline & win32

2006-01-04 Thread Andrew Dunstan
Magnus Hagander said: >> > Me, I'm not fully happy with psql on win32. I want my tab >> completion! >> > (which the gui tools don't do either, from what I can tell. >> At least >> > pgadmin doesn't. Yet.) >> >> Mine has tab completion adapted from psql :). There are also >> commands for specific co

Re: [HACKERS] psql & readline & win32

2006-01-04 Thread Magnus Hagander
> > Me, I'm not fully happy with psql on win32. I want my tab > completion! > > (which the gui tools don't do either, from what I can tell. > At least > > pgadmin doesn't. Yet.) > > Mine has tab completion adapted from psql :). There are also > commands for specific completion types, e.g. comp

Re: [HACKERS] Improving N-Distinct estimation by ANALYZE

2006-01-04 Thread Tom Lane
Simon Riggs <[EMAIL PROTECTED]> writes: > [ ... a large amount of analysis based on exactly one test case ... ] I think you are putting too much emphasis on fixing one case and not enough on considering what may happen in other cases ... In general, estimating n-distinct from a sample is just pla

Re: [HACKERS] Inconsistent syntax in GRANT

2006-01-04 Thread Josh Berkus
Euler, > It should but it's not implemented yet. There is no difficulty in doing > it. But I want to propose the following idea: if some object depends on > another object and its type is 'DEPENDENCY_INTERNAL' we could > grant/revoke privileges automagically to it. Or maybe create another > type o

[HACKERS] Improving N-Distinct estimation by ANALYZE

2006-01-04 Thread Simon Riggs
Improving N-Distinct estimation === v1.1 OBJECTIVES Answer these points... - Are there any performance issues that can be directly attributed to mis-estimation of N-Distinct ("D") by the ANALYZE command? - If so, can we do better than we currently achieve? How? - W

Re: [HACKERS] postmaster/postgres options assimilation plan

2006-01-04 Thread Tom Lane
Peter Eisentraut <[EMAIL PROTECTED]> writes: > Here's the plan for assimilating the command-line options of the postmaster > and postgres options. > ... > * postmaster options added to postgres: -h -i -k -l -n > These options will not have any useful effects, but their behavior is > consistent i

[HACKERS] postmaster/postgres options assimilation plan

2006-01-04 Thread Peter Eisentraut
Here's the plan for assimilating the command-line options of the postmaster and postgres options. I reported earlier on a couple of conflict areas; here is the full plan: * Remove: postmaster -a -b -m -M These options have done nothing forever. * postmaster options added to postgres: -h -i -k

Re: [HACKERS] psql & readline & win32

2006-01-04 Thread Tino Wildenhain
John DeSoi schrieb: On Jan 2, 2006, at 4:00 AM, Magnus Hagander wrote: Me, I'm not fully happy with psql on win32. I want my tab completion! (which the gui tools don't do either, from what I can tell. At least pgadmin doesn't. Yet.) Mine has tab completion adapted from psql :). There are al

Re: [HACKERS] psql & readline & win32

2006-01-04 Thread John DeSoi
On Jan 2, 2006, at 4:00 AM, Magnus Hagander wrote: Me, I'm not fully happy with psql on win32. I want my tab completion! (which the gui tools don't do either, from what I can tell. At least pgadmin doesn't. Yet.) Mine has tab completion adapted from psql :). There are also commands for spec

Re: [HACKERS] Incremental Backup Script

2006-01-04 Thread Gregor Zeitlinger
-Original Message- From: Zach Bagnall [mailto:[EMAIL PROTECTED] Sent: Wednesday, January 04, 2006 4:42 AM To: pgsql-hackers@postgresql.org Subject: Re: [HACKERS] Incremental Backup Script >Gregor: can you explain how to identify the current file? I had >implemented a backup and restore