On Sun, Feb 26, 2012 at 12:11 PM, Stefan Keller <sfkel...@gmail.com> wrote:
> Thanks to all who responded so far. I got some more insights from Mike
> Stonebraker himself in the USENIX talk Scott pointed to before.
> I'd like to revise the four points a little bit I enumerated in my
> initial question and to sort out what PG already does or could do:
>
> 1. Buffering Pool
>
> To get rid of I/O bounds Mike proposes in-memory database structures.
> He argues that it's impossible to be implemented by "old elephants"
> because it would be a huge code rewrite since there is also a need to
> store memory structures (instead disk oriented structures).
> Now I'm still wondering why PG could'nt realize that probably in
> combination with unlogged tables? I don't overview the respective code
> but I think it's worthwhile to discuss even if implementation of
> memory-oriented structures would be to difficult.
>

The reason is that PostgreSQL's data structures are designed around
disk-based storage, so they are efficient to look up on disk but less
efficient in memory. Note that VoltDB is a niche product, and
Stonebraker makes this pretty clear. However, the more interesting
question is what the tradeoffs are between VoltDB and Postgres-XC.

>
> 2. Locking
>
> This critique obviously does'nt hold for PG since we have MVCC here
> already.
>
> 3. WAL logging
>
> Here Mike proposes replication over several nodes as an alternative to
> WAL which fits nicely with High Availability. PG 9 has built-in
> replication but just not for unlogged tables :-<
>

I find it interesting that two of the four areas he identifies have to
do with durability.

>
> 4. Latches
>
> This is an issue I never heard before. I found some notion of latches
> in the code but I does'nt seem to be related to concurrently accessing
> btree structures as Mike suggests.
> So if anyone could confirm that this problem exists producing overhead
> I'd be interested to hear.
> Mike proposes single-threads running on many cores where each core
> processes a non overlapping shard.
> But he also calls for ideas to invent btrees which can be processed
> concurrently with as less memory locks as possible (instead of looking
> to make btrees faster).
>
> So to me the bottom line is, that PG already has reduced overhead at
> least for issue #2 and perhaps for #4.
> Remain issues of in-memory optimization (#2) and replication (#3)
> together with High Availability to be investigated in PG.
>

If he were looking at PostgreSQL for #4, I think that would be things
like waiting for semaphores. Since that work is probably minimal and
each PostgreSQL backend process is single-threaded, I suspect the
overhead in this area is low. The issue seems to be concurrent access
to shared data structures, which is a problem particularly when you
start looking at multithreaded backends.

Best Wishes,
Chris Travers