On Fri, Jan 11, 2008 at 11:49:50AM +0000, Simon Riggs wrote:
> On Fri, 2008-01-11 at 10:25 +0100, Gavin Sherry wrote:
> > >
> > > Of course. It's an identical situation for both. Regrettably, none of
> > > your comments about dynamic partitioning and planning were accurate as a
> > > result.
> >
> > That's not true. We will still have planning drive the partition
> > selection when the predicate is immutable, thus having more accurate
> > plans.
>
> Not really.
>
> The planner already evaluates stable functions at plan time to estimate
> selectivity against statistics. It can do the same here.
>
> The boundary values can't be completely trusted at plan time because
> they are dynamic, but they're at least as accurate as ANALYZE statistics
> (and probably derived at identical times), so can be used as estimates.
> So I don't see any reason to imagine the plans will be badly adrift.
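(To make the immutable-versus-stable distinction concrete, here is a
minimal sketch using the existing inheritance-plus-CHECK-constraint
style of partitioning; the table and column names are purely
illustrative.)

  -- Parent table plus one child partition constrained to a date range.
  CREATE TABLE measurements (logdate date, reading int);
  CREATE TABLE measurements_2008_01 (
      CHECK (logdate >= DATE '2008-01-01' AND logdate < DATE '2008-02-01')
  ) INHERITS (measurements);

  SET constraint_exclusion = on;

  -- Immutable predicate: the planner can prove at plan time that only
  -- measurements_2008_01 can hold matching rows, so the other children
  -- are excluded from the plan.
  EXPLAIN SELECT * FROM measurements WHERE logdate = DATE '2008-01-15';

  -- Stable predicate: current_date is not known until execution, so the
  -- planner cannot exclude children outright; it can only use statistics
  -- (or, in the proposal above, segment boundary values) as estimates.
  EXPLAIN SELECT * FROM measurements WHERE logdate = current_date;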
Okay, it's good that you want the planner to look at those. Did you
consider the point I made about the sheer amount of data the planner
would have to consider for large cases?

> We're back to saying that if the visibility map is volatile, then SE
> won't help you much. I agree with that and haven't argued otherwise.
> Does saying it make us throw away SE? No, at least, not yet and not for
> that reason.

Yes, I'm not against SE; I just think that having only SE would be a
serious regression for larger users.

Personally, I think SE would be a great idea for append-only tables,
since that removes the thing I'm most worried about with it: the need
to vacuum to 'turn it on'.

>
> SE does what I was looking for it to do, but doesn't do all of what
> you'd like to achieve with partitioning, because we're looking at
> different use cases. I'm sure you'd agree that all large databases are
> not the same and that they can have very different requirements. I'd
> characterise our recent positions on this that I've been focused on
> archival requirements, whereas you've been focused on data warehousing.

I think that sums it up, although I'd also say that declarative
partitioning suits anyone with largish amounts of data who knows how
they want it stored. This points to another case that SE suits: those
who don't know how, or (maybe more importantly) don't care, to manage
their data.

I'll go back to what I said above: SE looks like a good performance
boost for archival, read-only data. If we tighten up the definition of
how some tables can be used -- append-only -- then we can remove the
vacuum requirement and also change other characteristics of the
storage: reduced visibility information, compression, etc. These are
hot topics for people with that kind of data.

> The difference really lies in how much activity and of what kind occurs
> on a table. You don't unload and reload archives regularly, nor do you

I couldn't agree more.

> perform random updates against the whole table. I'm sure we'd quickly
> agree that many of the challenges you've seen recently at Greenplum
> would not be possible with core Postgres, so I'm not really trying too
> hard to compete.

There we diverge. Yes, Greenplum produces systems for very large
amounts of data, petabyte range in fact. However, the architecture --
individual postmasters, each on its own CPU with its own storage --
means that at the node level we see the same problems as users of
non-distributed databases. This is why I say that VACUUMing such
systems under the SE model after a data load is just impossible (I know
the cost of vacuum will stabilise over time, but there's always the
initial data load).

Thanks,

Gavin