On Thu, Apr 14, 2016 at 11:26 AM, Simon Riggs <si...@2ndquadrant.com> wrote:
> 1) "more deeply into core"
> I'm open to doing that for some parts of the code, if there is benefit. At
> present, an extension has exactly the same attributes as an in-core
> solution, so I don't currently see any benefit in doing so. Could you
> explain what you see?
>
> 2) "SQL syntax"
> I'm not sure what SQL syntax would give us. I know what we would lose, which
> is the ability to implement new and interesting features as extensions
> before putting them into core. That doesn't strike me as a benefit, so
> please explain.
Lots of things start out as extensions but then we decide that they are important enough that they should be part of the core product. For example, text search started out in contrib, but then we moved it to core. When things are in core, they can have their own DDL, which I think is an ease-of-use benefit. Also, they become accessible as infrastructure for other code that gets written later. If there were no benefits of putting features in core, we wouldn't put anything in core, but of course there are such benefits.

It is absolutely wrong to say that you would "lose the ability to implement new and interesting features as extensions before putting them into core". To the contrary, as we add things to core, it becomes possible to write more and more interesting extensions. For example, the availability of background workers has opened up all kinds of interesting possibilities for extensions that didn't exist before; in fact, that's why Alvaro created the feature. Similarly, a lot of the code that I and others wrote for parallel query has been used by other people to do interesting things - and it was one of the goals of the project to make that sort of thing possible.

I believe logical replication is a fundamental database technology that should be considered just as much within the scope of the core product as physical replication, parallel query, or UPSERT. I held and publicly expressed that belief on my blog before anyone at 2ndQuadrant began working in this area, and I still hold it today.

> At present, I don't understand why we would do sharding via FDWs, i.e. an
> out-of-core solution and yet replication as an in-core solution. Sharding
> desires/requires a single system image, so tight coupling is sensible (for
> example, defining a distribution key column on a distributed table). For
> replication between disparate loosely coupled systems, tight coupling is
> exactly what you do not want. So doing it that way round would give an
> out-of-core solution for something that is best done in-core and an in-core
> solution for something best done out-of-core.

First, I think that replication can be either loosely-coupled or tightly-coupled. There are interesting cases with intermittently connected networks where you really don't want too much coupling, and then there are cases where you are doing load-balancing across a cluster and tight coupling is fine, even desirable. Similarly, although I agree that a sharding solution intrinsically requires fairly tight coupling, I think that one of the strengths of FDWs is that they do not. I'm not very interested in seeing a sharding solution in PostgreSQL that limits what you can do to a particular network topology and enforces tight coupling whether you want it or not. I'm more interested in seeing how we can build something that *permits* a tightly-coupled system but also lets people build other kinds of systems if they wish.

Second, I don't think that whether the system is tightly-coupled or loosely-coupled has much to do with whether the code lives in src/backend or contrib and, to be clear, I don't care all that much about that, either. If we end up with a great logical replication solution and it so happens that it loads pglogical.so under the hood, fine.

However, I do care about ease of use. In terms of ease of use, again, I think DDL would be a better interface than one based on functions. SQL is clunky at times, but being able to say CREATE TABLE blah (a int, b text) instead of SELECT pg_create_table('blah', ARRAY['a', 'b'], ARRAY['int'::regtype, 'text'::regtype]) surely has something to recommend it. Of course, there is a lot to ease of use other than DDL, and if we end up with a design that relies strictly on contrib, perhaps that is OK. But it needs to be just as easy to set up a replicated PostgreSQL cluster as it is to do the equivalent task in some competing product, or we are missing the boat.
>> I think this would be a good topic to discuss at PGCon.
>
> I'll be at PgCon to discuss this.

Great.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company