On Wed, Sep 15, 2010 at 3:00 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> Well, the problem is to not draw the abstraction boundary so high that
> your plugins have to reimplement the world to get anything done.
> mysql got this wrong IMO, and are still paying the price in the form of
> bizarre functional incompatibilities between their different storage
> engines.

Yeah, as far as I can tell there is pretty much universal consensus
that they got that wrong.  Actually, I have no personal opinion on the
topic, having no familiarity with the innards of MySQL: but that is
what people keep telling me.

> As an example, I don't think there is any sane way to provide
> column-oriented storage as a plugin.  The entire executor is based
> around the assumption that table scans return a row at a time; in
> consequence, the entire planner is too.  You can't have a plugin that
> replaces all of that.  You could probably build a plugin that allows
> columnar storage but reconstructs rows to return to the executor ... but
> having to do that would largely destroy any advantages of a columnar DB,
> I fear.

Yeah, I don't know.  A columnar DB is a bit like making "SELECT * FROM
table" really mean some kind of join between table_part1, table_part2,
and table_part3 (which could then perhaps be reordered, a candidate
for join removal, etc.).  But I have no position on whether whatever
infrastructure we'd need to support that is in any way related to the
problem du jour.  It's worth noting, however, that even if we give up
on column-oriented storage within PG, we might easily be talking to a
column-oriented DB on the other end of an SQL/MED connection; and we'd
like to be able to handle that sanely.

> Yet there are other cases that probably *could* work well based on a
> storage-level abstraction boundary; index-organized tables for instance.
> So I think we need to have some realistic idea of what we want to
> support and design an API accordingly, not hope that if we don't
> know what we want we will somehow manage to pick an API that makes
> all things possible.

Agreed.

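To make the columnar analogy above a bit more concrete, here is a toy
sketch in Python.  It is purely illustrative and has nothing to do with
actual PostgreSQL code: the three dictionaries stand in for
table_part1, table_part2, and table_part3, and the column names, data,
and function names are all made up.  Each part holds one column keyed
by a row id, so "SELECT *" becomes a join on that id, and a query that
touches fewer columns can skip the other parts entirely (the analogue
of join removal).

```python
# Hypothetical illustration only: none of these names exist in PostgreSQL.
# Each "part" maps a row id to the value of a single column.
table_part1 = {1: "alice", 2: "bob", 3: "carol"}   # column: name
table_part2 = {1: 30, 2: 25, 3: 41}                # column: age
table_part3 = {1: "NYC", 2: "SF", 3: "LA"}         # column: city


def select_star():
    """SELECT * expressed as a join of all three parts on the row id."""
    return [
        (table_part1[rid], table_part2[rid], table_part3[rid])
        for rid in sorted(table_part1)
    ]


def select_columns(*parts):
    """Join only the parts a query references.

    Parts that are not passed in are never read, which is the
    analogue of removing an unneeded join.
    """
    first = parts[0]
    return [tuple(part[rid] for part in parts) for rid in sorted(first)]


if __name__ == "__main__":
    print(select_star())
    print(select_columns(table_part2))               # only "age" is read
    print(select_columns(table_part1, table_part3))  # "age" is skipped
```

Rebuilding full rows for every scan, as select_star() does, is exactly
the cost Tom is worried about; the potential win only shows up when the
query can get away with reading a subset of the parts.
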
Random ideas: index-organized tables, tables that use a rollback log
rather than VACUUM, tables that use strict two-phase locking rather
than MVCC, tables that have no concurrency control at all and you get
dirty reads (could be useful for logging tables), write-once read-many
tables, compressed tables, encrypted tables, tables in formats used by
previous versions of PostgreSQL, tables that store data in a
round-robin fashion (like MRTG rrdtool).  Within the general orbit of
index-organized tables, you can wonder about different kinds of
indices: btree, hash, and GiST all seem promising.  You can even
imagine a GiST-like structure that does something like maintain
running totals for certain columns on each non-leaf page, to speed up
SUM operations.  Feel free to ignore whatever of that seems
irrelevant.

> I'm personally more worried about whether Heikki's sketch has the
> boundary too high-level than too low-level.  It might work all right
> for handing off to a full-fledged remote database, particularly if
> the other DB is also Postgres; but otherwise it's leaving a lot of
> work to be done by the plugin author.  And at the same time I don't
> think it's exposing enough information to let the local planner do
> anything intelligent in terms of trading off remote vs. local work.

Yeah, I think the API for exposing cost information needs a lot of thought.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
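
As an illustration of the running-totals idea from the list of random
ideas above, here is a toy sketch in Python.  It is an assumption-laden
simplification and not a GiST implementation: there is a single
non-leaf "page" that stores, for each leaf page, its key range and the
total of one column.  The class name, the page size, and the leaf_reads
counter are all hypothetical.  A range SUM can then use the stored
total for every leaf page that lies entirely inside the range and only
needs to read the leaf pages at the boundaries.

```python
# Hypothetical sketch: a two-level tree whose single non-leaf "page"
# keeps a running total for each leaf page.  Not a GiST implementation.
PAGE_SIZE = 4  # rows per leaf page (arbitrary, for illustration)


class SumIndex:
    def __init__(self, rows):
        # rows: iterable of (key, value) pairs
        rows = sorted(rows)
        self.leaves = [
            rows[i:i + PAGE_SIZE] for i in range(0, len(rows), PAGE_SIZE)
        ]
        # Non-leaf page: (min_key, max_key, total) for each leaf page.
        self.inner = [
            (leaf[0][0], leaf[-1][0], sum(v for _, v in leaf))
            for leaf in self.leaves
        ]
        self.leaf_reads = 0  # how many leaf pages were actually scanned

    def range_sum(self, lo, hi):
        """Return SUM(value) over rows with lo <= key <= hi."""
        total = 0
        for (kmin, kmax, page_total), leaf in zip(self.inner, self.leaves):
            if kmax < lo or kmin > hi:
                continue  # page lies entirely outside the range
            if lo <= kmin and kmax <= hi:
                total += page_total  # fully covered: use the stored total
            else:
                self.leaf_reads += 1  # boundary page: scan its rows
                total += sum(v for k, v in leaf if lo <= k <= hi)
        return total


if __name__ == "__main__":
    idx = SumIndex((k, k * 10) for k in range(1, 13))
    print(idx.range_sum(3, 10), idx.leaf_reads)
```

A real structure would also have to keep the totals correct under
inserts, updates, and deletes (and under MVCC visibility rules), which
this sketch ignores completely.
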