date:20081016

Re: [HACKERS] Cross-column statistics revisited

2008-10-16 Thread Martijn van Oosterhout

On Fri, Oct 17, 2008 at 12:20:58AM +0200, Greg Stark wrote: > Correlation is the wrong tool. In fact zip codes and city have nearly > zero correlation. Zip codes near 0 are no more likely to be in > cities starting with A than Z. I think we need to define our terms better. In terms of lin

Re: [HACKERS] Cross-column statistics revisited

2008-10-16 Thread Martijn van Oosterhout

On Thu, Oct 16, 2008 at 09:17:03PM -0600, Joshua Tolley wrote: > Because I'm trying to picture geometrically how this might work for > the two-column case, and hoping to extend that to more dimensions, and > am finding that picturing a quantile-based system like the one we have > now in multiple di

Re: [HACKERS] Cross-column statistics revisited

2008-10-16 Thread Joshua Tolley

On Thu, Oct 16, 2008 at 8:38 PM, Tom Lane <[EMAIL PROTECTED]> wrote: > "Joshua Tolley" <[EMAIL PROTECTED]> writes: >> For what it's worth, neither version of correlation was what I had in >> mind. Statistical correlation between two variables is a single >> number, is fairly easy to calculate, and

Re: [HACKERS] Cross-column statistics revisited

2008-10-16 Thread Tom Lane

"Joshua Tolley" <[EMAIL PROTECTED]> writes: > For what it's worth, neither version of correlation was what I had in > mind. Statistical correlation between two variables is a single > number, is fairly easy to calculate, and probably wouldn't help query > plans much at all. I'm more interested in a

Re: [HACKERS] Cross-column statistics revisited

2008-10-16 Thread Joshua Tolley

On Thu, Oct 16, 2008 at 6:32 PM, Tom Lane <[EMAIL PROTECTED]> wrote: > It appears to me that a lot of people in this thread are confusing > correlation in the sense of statistical correlation between two > variables with correlation in the sense of how well physically-ordered > a column is. For wh

Re: [HACKERS] Cross-column statistics revisited

2008-10-16 Thread Tom Lane

"Joshua Tolley" <[EMAIL PROTECTED]> writes: > Most of the comments on this thread have centered around the questions > of "what we'd store" and "how we'd use it", which might be better > phrased as, "The database assumes columns are independent, but we know > that's not always true. Does this cause

Re: [HACKERS] Cross-column statistics revisited

2008-10-16 Thread Greg Stark

This is yet another issue entirely. This is about estimating how much io will be random io if we do an index order scan. Correlation is a passable tool for this but we might be able to do better. But it has nothing to do with the cross-column stats problem. greg On 17 Oct 2008, at 01:29 AM

Re: [HACKERS] Cross-column statistics revisited

2008-10-16 Thread Ron Mayer

Josh Berkus wrote: Yes, or to phrase that another way: What kinds of queries are being poorly optimized now and why? Well, we have two different correlation problems. One is the problem of dependent correlation, such as the 1.0 correlation of ZIP and CITY fields as a common problem. This co

Re: [HACKERS] Cross-column statistics revisited

2008-10-16 Thread Greg Stark

Correlation is the wrong tool. In fact zip codes and city have nearly zero correlation. Zip codes near 0 are no more likely to be in cities starting with A than Z. Even if you use an appropriate tool I'm not clear what to do with the information. Consider the case of WHERE city='boston
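A small sketch of the point above, that one column can fully determine another while their linear correlation is close to zero. The data is invented for illustration: `zip_codes` and `city_rank` (a stand-in for the alphabetical position of the city name) are hypothetical, as are both helper functions.

```python
import math

def pearson(xs, ys):
    """Pearson (linear) correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

def is_functionally_dependent(xs, ys):
    """True if every value in xs always maps to the same value in ys."""
    seen = {}
    for x, y in zip(xs, ys):
        if seen.setdefault(x, y) != y:
            return False
    return True

# Hypothetical data: each zip code belongs to exactly one city,
# but low zip codes do not go with alphabetically early cities.
zip_codes = [10, 20, 30, 40, 50]
city_rank = [4, 1, 0, 1, 4]

r = pearson(zip_codes, city_rank)
dependent = is_functionally_dependent(zip_codes, city_rank)
```

With this data the coefficient comes out as zero even though knowing the zip code pins down the city exactly, which is why a single correlation number says little about how two WHERE clauses interact.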

Re: [HACKERS] Cross-column statistics revisited

2008-10-16 Thread Josh Berkus

> Yes, or to phrase that another way: What kinds of queries are being > poorly optimized now and why? Well, we have two different correlation problems. One is the problem of dependent correlation, such as the 1.0 correlation of ZIP and CITY fields as a common problem. This could in fact be fi

Re: [HACKERS] Cross-column statistics revisited

2008-10-16 Thread Joshua Tolley

On Thu, Oct 16, 2008 at 2:54 PM, Josh Berkus <[EMAIL PROTECTED]> wrote: > Tom, > >> (I'm not certain of how to do that efficiently, even if we had the >> right stats :-() > > I was actually talking to someone about this at pgWest. Apparently there's > a fair amount of academic algorithms devoted t

Re: [HACKERS] Cross-column statistics revisited

2008-10-16 Thread Josh Berkus

Tom, > (I'm not certain of how to do that efficiently, even if we had the > right stats :-() I was actually talking to someone about this at pgWest. Apparently there's a fair amount of academic algorithms devoted to this topic. Josh, do you remember who was talking about this? -- --Josh Jo

Re: [HACKERS] Cross-column statistics revisited

2008-10-16 Thread Ron Mayer

Robert Haas wrote: I think the real question is: what other kinds of correlation might people be interested in representing? Yes, or to phrase that another way: What kinds of queries are being poorly optimized now and why? The one that affects our largest tables are ones where we have an addr

Re: [HACKERS] Deriving Recovery Snapshots

2008-10-16 Thread Simon Riggs

On Thu, 2008-10-16 at 18:52 +0300, Heikki Linnakangas wrote: > Simon Riggs wrote: > > Each backend that existed on the master is represented by a PROC > > structure in the ProcArray. These are known as "recovery procs" and are > > similar to the dummy procs used for prepared transactions. All reco

Re: [HACKERS] Memory leak on hashed agg rescan

2008-10-16 Thread Neil Conway

On Thu, Oct 16, 2008 at 5:26 AM, Tom Lane <[EMAIL PROTECTED]> wrote: > It would probably be cleaner to take that logic out of build_hash_table > altogether, and put it in a separate function to be called by > ExecInitAgg. Yeah, I considered that -- makes sense. Attached is the patch I applied to H

Re: [HACKERS] 8.3 .4 + Vista + MingW + initdb = ACCESS_DENIED

2008-10-16 Thread Andrew Dunstan

Andrew Chernow wrote: Rainer Bauer wrote: "Matthew T. O'Connor" wrote: Tom Lane wrote: ROTFL ... so to translate: "If your program crashes, please release locks before crashing." Obviously that wasn't the intent of the above, but I guess it is the net effect. Either way, I don't think it'

Re: [HACKERS] 8.3 .4 + Vista + MingW + initdb = ACCESS_DENIED

2008-10-16 Thread Andrew Chernow

Rainer Bauer wrote: "Matthew T. O'Connor" wrote: Tom Lane wrote: ROTFL ... so to translate: "If your program crashes, please release locks before crashing." Obviously that wasn't the intent of the above, but I guess it is the net effect. Either way, I don't think it's a huge problem, it just

Re: [HACKERS] minimal update

2008-10-16 Thread Andrew Dunstan

Tom Lane wrote: Andrew Dunstan <[EMAIL PROTECTED]> writes: OK. Where would be a good place to put the code? Maybe a new file src/backend/utils/adt/trigger_utils.c ? I thought the plan was to make it a contrib module. Well, previous discussion did mentio

Re: [HACKERS] Cross-column statistics revisited

2008-10-16 Thread Martijn van Oosterhout

On Thu, Oct 16, 2008 at 01:34:59PM -0400, Robert Haas wrote: > I suspect that a lot of the correlations people care about are > extreme. For example, it's fairly common for me to have a table where > column B is only used at all for certain values of column A. Like, > atm_machine_id is usually or

Re: [HACKERS] minimal update

2008-10-16 Thread Tom Lane

Andrew Dunstan <[EMAIL PROTECTED]> writes: > OK. Where would be a good place to put the code? Maybe a new file > src/backend/utils/adt/trigger_utils.c ? I thought the plan was to make it a contrib module. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-h

Re: [HACKERS] minimal update

2008-10-16 Thread Andrew Dunstan

Bruce Momjian wrote: Andrew Dunstan wrote: Bruce, did you ever look at completing this? No, it is still in my email box unaddressed. Feel free to work on it; I doubt I can do it for 8.4. OK. Where would be a good place to put the code? Maybe a new file src/backend/utils/adt

Re: [HACKERS] Cross-column statistics revisited

2008-10-16 Thread Robert Haas

> I think the real question is: what other kinds of correlation might > people be interested in representing? Yes, or to phrase that another way: What kinds of queries are being poorly optimized now and why? I suspect that a lot of the correlations people care about are extreme. For example, it'

Re: [HACKERS] Cross-column statistics revisited

2008-10-16 Thread Greg Stark

[sorry for top posting - damn phone] It's pretty straightforward to do a chi-squared test on all the pairs. But that tells you that the product is more likely to be wrong. It doesn't tell you whether it's going to be too high or too low... greg On 16 Oct 2008, at 07:20 PM, Tom Lane <[EMAIL P
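For readers unfamiliar with the test mentioned above, here is a minimal sketch of a Pearson chi-squared statistic computed over the value pairs of two columns. It is not taken from the thread or from PostgreSQL; the function name `chi_squared` and the sample data are assumptions for illustration.

```python
from collections import Counter

def chi_squared(pairs):
    """Pearson chi-squared statistic for a list of (a, b) value pairs."""
    n = len(pairs)
    joint = Counter(pairs)               # observed count of each (a, b)
    left = Counter(a for a, _ in pairs)  # marginal counts of column a
    right = Counter(b for _, b in pairs) # marginal counts of column b
    stat = 0.0
    for a, count_a in left.items():
        for b, count_b in right.items():
            # Count expected if the two columns were independent.
            expected = count_a * count_b / n
            observed = joint.get((a, b), 0)
            stat += (observed - expected) ** 2 / expected
    return stat

# Hypothetical samples: one where the columns are independent,
# one where column a fully determines column b.
independent = [(a, b) for a in "xy" for b in "pq"] * 5
dependent = [("x", "p")] * 10 + [("y", "q")] * 10

stat_independent = chi_squared(independent)
stat_dependent = chi_squared(dependent)
```

The statistic is a sum of squared deviations, so it is never negative. A large value signals that the independence assumption is off for that pair of columns, but the sign of each deviation is lost in the sum, which matches the objection that the test does not say whether the product estimate is too high or too low.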

Re: [HACKERS] Cross-column statistics revisited

2008-10-16 Thread Tom Lane

Martijn van Oosterhout <[EMAIL PROTECTED]> writes: > I think you need to go a step back: how are you going to use this data? The fundamental issue as the planner sees it is not having to assume independence of WHERE clauses. For instance, given WHERE a < 5 AND b > 10 our current approac
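A rough sketch of the independence assumption discussed in this thread: the selectivity of two ANDed clauses is estimated as the product of their individual selectivities. This is an illustration only, not the planner's code; the function names and the table contents (where b is always twice a) are hypothetical.

```python
def independent_estimate(rows, pred_a, pred_b):
    """Estimate the selectivity of (pred_a AND pred_b) assuming independence."""
    n = len(rows)
    sel_a = sum(1 for r in rows if pred_a(r)) / n
    sel_b = sum(1 for r in rows if pred_b(r)) / n
    return sel_a * sel_b

def actual_selectivity(rows, pred_a, pred_b):
    """Fraction of rows that really satisfy both predicates."""
    n = len(rows)
    return sum(1 for r in rows if pred_a(r) and pred_b(r)) / n

# Hypothetical table of (a, b) rows in which b = 2 * a.
rows = [(i, 2 * i) for i in range(20)]

a_lt_5 = lambda r: r[0] < 5    # WHERE a < 5
b_gt_10 = lambda r: r[1] > 10  # AND b > 10

estimate = independent_estimate(rows, a_lt_5, b_gt_10)
actual = actual_selectivity(rows, a_lt_5, b_gt_10)
```

On this data the product rule estimates 0.25 * 0.7 = 0.175 of the table (3.5 of 20 rows), while no row actually satisfies both clauses. Data skewed the other way would make the same rule underestimate instead.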

Re: [HACKERS] Cross-column statistics revisited

2008-10-16 Thread Martijn van Oosterhout

On Wed, Oct 15, 2008 at 04:53:10AM -0600, Joshua Tolley wrote: > I've been interested in what it would take to start tracking > cross-column statistics. A review of the mailing lists as linked from > the TODO item on the subject [1] suggests the following concerns: > > 1) What information exactly

Re: [HACKERS] Deriving Recovery Snapshots

2008-10-16 Thread Heikki Linnakangas

Simon Riggs wrote: Each backend that existed on the master is represented by a PROC structure in the ProcArray. These are known as "recovery procs" and are similar to the dummy procs used for prepared transactions. All recovery procs are "owned by" the Startup process. So there is no process for

Re: [HACKERS] 8.3 .4 + Vista + MingW + initdb = ACCESS_DENIED

2008-10-16 Thread Rainer Bauer

"Matthew T. O'Connor" wrote: >Tom Lane wrote: >> >> ROTFL ... so to translate: "If your program crashes, please release >> locks before crashing." > >Obviously that wasn't the intent of the above, but I guess it is the net >effect. Either way, I don't think it's a huge problem, it just means >

Re: [HACKERS] Deriving Recovery Snapshots

2008-10-16 Thread Simon Riggs

On Thu, 2008-10-16 at 15:20 +0100, Simon Riggs wrote: > I've integrated my five patches together into one now: > * recovery_infrastruc.v9.patch > * atomic_subxids.v7.patch > * hs_connect > * hs_checks > * hs_snapshot > > Seems positive that it all integrated so quickly and tests OK. > More later

Re: [HACKERS] Deriving Recovery Snapshots

2008-10-16 Thread Simon Riggs

On Thu, 2008-10-16 at 13:55 +0100, Simon Riggs wrote: > Other related patches are > * recovery_infrastruc.v9.patch > * atomic_subxids.v7.patch > They don't all apply cleanly together, but the changes are unrelated, so > those patches can still be reviewed without wasting energy. > > Next phase i

Re: [HACKERS] Annoying error messages in _dosmaperr

2008-10-16 Thread Tom Lane

ITAGAKI Takahiro <[EMAIL PROTECTED]> writes: > I grep-ed sources with #ifndef FRONTEND and #ifdef FRONTEND, > but there are no other "DEBUG or stderr" codes. All other codes > are "WARNING/LOG or stderr", so I keep all of them as-is. Looks good, applied. regards, tom lane

Re: [HACKERS] Memory leak on hashed agg rescan

2008-10-16 Thread Tom Lane

"Neil Conway" <[EMAIL PROTECTED]> writes: > I noticed a minor leak in the per-query context when ExecReScanAgg() > is called for a hashed aggregate. During rescan, build_hash_table() is > called to create a new empty hash table in the aggcontext. However, > build_hash_table() also constructs the "h

Re: [HACKERS] Updates of SE-PostgreSQL 8.4devel patches

2008-10-16 Thread KaiGai Kohei

KaiGai Kohei wrote: Bruce Momjian wrote: KaiGai Kohei wrote: Bruce Momjian wrote: I think we could use row-level access control to prevent people from seeing databases they should not see in pg_database. The row-level database ACL which I submitted yesterday does not allow assigning ACLs to t

32 matches
