Re: [HACKERS] potential bug in trigger with boolean params

2011-05-11 Thread tv
> Hi, > I was trying to create a trigger with parameters. I've found a potential > bug > when the param is boolean. > > Here is code replicating the bug: > > CREATE TABLE x(x TEXT); > > CREATE OR REPLACE FUNCTION trigger_x() RETURNS TRIGGER AS $$ > BEGIN > RETURN NEW; > END; $$ LANGUAGE PLP

Re: [HACKERS] estimating # of distinct values

2011-01-18 Thread tv
> On Jan 17, 2011, at 6:36 PM, Tomas Vondra wrote: >> 1) Forks are 'per relation' but the distinct estimators are 'per >> column' (or 'per group of columns') so I'm not sure whether the file >> should contain all the estimators for the table, or if there should >> be one fork for each estimat

Re: [HACKERS] estimating # of distinct values

2011-01-10 Thread tv
> On Fri, 2011-01-07 at 12:32 +0100, t...@fuzzy.cz wrote: >> the problem is you will eventually need to drop the results and rebuild >> it, as the algorithms do not handle deletes (ok, Florian mentioned an >> algorithm L_0 described in one of the papers, but I'm not sure we can >> use >> it). > > Y

Re: [HACKERS] estimating # of distinct values

2011-01-07 Thread tv
> On Thu, 2010-12-30 at 21:02 -0500, Tom Lane wrote: >> How is an incremental ANALYZE going to work at all? > > How about a kind of continuous analyze ? > > Instead of analyzing just once and then drop the intermediate results, > keep them on disk for all tables and then piggyback the background >

Re: [HACKERS] estimating # of distinct values

2010-12-28 Thread tv
> wrote: > >> So even with 10% of the table, there's a 10% probability to get an >> estimate that's 7x overestimated or underestimated. With lower >> probability the interval is much wider. > > Hmmm... Currently I generally feel I'm doing OK when the estimated > rows for a step are in the right o

Re: [HACKERS] estimating # of distinct values

2010-12-28 Thread tv
> >> The simple truth is >> >> 1) sampling-based estimators are a dead-end > > The Charikar and Chaudhuri paper does not, in fact, say that it is > impossible to improve sampling-based estimators as you claim it does. In > fact, the authors offer several ways to improve sampling-based > estimators.

Re: [HACKERS] proposal : cross-column stats

2010-12-24 Thread tv
> 2010/12/24 Florian Pflug: > >> On Dec23, 2010, at 20:39 , Tomas Vondra wrote: >> >>> I guess we could use the highest possible value (equal to the number >>> of tuples) - according to wiki you need about 10 bits per element >>> with 1% error, i.e. about 10MB of memory for each million of >

Re: [HACKERS] proposal : cross-column stats

2010-12-21 Thread tv
> On Dec21, 2010, at 15:51 , t...@fuzzy.cz wrote: This is the reason why they choose to always combine the values (with varying weights). >>> >>> There are no varying weights involved there. What they do is to express >>> P(A=x,B=y) once as >>> >>> ... >>> >>> P(A=x,B=y) ~= P(B=y|A=x)*P(

Re: [HACKERS] proposal : cross-column stats

2010-12-21 Thread tv
> On Dec21, 2010, at 11:37 , t...@fuzzy.cz wrote: >> I doubt there is a way to make this decision with just dist(A), dist(B) and >> dist(A,B) values. Well, we could go with a rule >> >> if [dist(A) == dist(A,B)] then [A => B] >> >> but that's very fragile. Think about estimates (we're not going to work

Re: [HACKERS] proposal : cross-column stats

2010-12-21 Thread tv
> On Mon, Dec 20, 2010 at 9:29 PM, Florian Pflug wrote: >> You might use that to decide if either A->B or B->A looks function-like >> enough to use the uniform bayesian approach. Or you might even go >> further, >> and decide *which* bayesian formula to use - the paper you cited always >> averages

Re: [HACKERS] proposal : cross-column stats

2010-12-21 Thread tv
> On Dec18, 2010, at 17:59 , Tomas Vondra wrote: >> It seems to me you're missing one very important thing - this was not >> meant as a new default way to do estimates. It was meant as an option >> when the user (DBA, developer, ...) realizes the current solution gives >> really bad estimates (due

Re: [HACKERS] keeping a timestamp of the last stats reset (for a db, table and function)

2010-12-19 Thread tv
> Tomas Vondra writes: >> I've done several small changes to the patch, namely > >> - added docs for the functions (in SGML) >> - added the same thing for background writer > >> So I think now it's 'complete' and I'll add it to the commit fest in a >> few minutes. > > Please split this into separa

Re: [HACKERS] proposal : cross-column stats

2010-12-17 Thread tv
> On Dec17, 2010, at 23:12 , Tomas Vondra wrote: >> Well, not really - I haven't done any experiments with it. For two >> columns selectivity equation is >> >> (dist(A) * sel(A) + dist(B) * sel(B)) / (2 * dist(A,B)) >> >> where A and B are columns, dist(X) is number of distinct values in >> co
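
The two-column selectivity equation quoted in the snippet above can be tried out with a few lines of Python. This is only a sketch of the formula as quoted: the function name and the sample numbers are made up for illustration, and reading sel(X) as the selectivity of the condition on column X alone is an assumption, since the snippet is cut off before it defines that term.

```python
def two_column_selectivity(dist_a, sel_a, dist_b, sel_b, dist_ab):
    """Evaluate (dist(A)*sel(A) + dist(B)*sel(B)) / (2*dist(A,B)).

    dist_a, dist_b -- number of distinct values in columns A and B
    dist_ab        -- number of distinct (A, B) combinations
    sel_a, sel_b   -- per-column selectivities (assumed meaning of sel(X))
    """
    if dist_ab <= 0:
        raise ValueError("dist(A,B) must be positive")
    return (dist_a * sel_a + dist_b * sel_b) / (2 * dist_ab)


# Hypothetical numbers: A has 100 distinct values, B has 50, and B is
# fully determined by A, so there are only 100 distinct (A, B) pairs.
estimate = two_column_selectivity(
    dist_a=100, sel_a=1 / 100,
    dist_b=50, sel_b=1 / 50,
    dist_ab=100,
)
print(estimate)
```

With these made-up inputs the estimate equals sel(A), whereas multiplying the two selectivities under an independence assumption would give a value 50 times smaller.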

Re: [HACKERS] proposal : cross-column stats

2010-12-13 Thread tv
> On 2010-12-13 03:28, Robert Haas wrote: >> Well, I'm not real familiar with contingency tables, but it seems like >> you could end up needing to store a huge amount of data to get any >> benefit out of it, in some cases. For example, in the United States, >> there are over 40,000 postal codes, a

Re: [HACKERS] proposal : cross-column stats

2010-12-12 Thread tv
> On Sun, Dec 12, 2010 at 9:16 PM, Tomas Vondra wrote: >> Dne 13.12.2010 03:00, Robert Haas napsal(a): >>> Well, the question is what data you are actually storing.  It's >>> appealing to store a measure of the extent to which a constraint on >>> column X constrains column Y, because you'd only ne

Re: [HACKERS] keeping a timestamp of the last stats reset (for a db, table and function)

2010-12-11 Thread tv
> Hello > > you have to respect pg coding style: > > a) not too long lines > b) not C++ line comments OK, thanks for the notice. I've fixed those two problems. regards Tomas diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql index 346eaaf..0ee59b1 100644 ---

[HACKERS] keeping a timestamp of the last stats reset (for a db, table and function)

2010-12-11 Thread tv
Hi everyone, I just wrote my first patch, and I need to know whether I missed something or not. I haven't used C for a really long time, so sickbags on standby, and if you notice something really stupid don't hesitate to call me an asshole (according to Simon Phipps that proves we are a healthy op

Re: [HACKERS] [GENERAL] Postgres 9.1 - Release Theme

2010-04-01 Thread tv
> Following a great deal of discussion, I'm pleased to announce that the > PostgreSQL Core team has decided that the major theme for the 9.1 > release, due in 2011, will be 'NoSQL'. > Please, provide me your address so I can forward you the "health care" bills I had to pay due to the heart attack