I'm beginning to work on advanced additions to in-core replication for PostgreSQL.
There are a number of additional features for existing single-master replication still to achieve, but the key topics to be addressed are major leaps forward in functionality. I hope to add useful features in 9.3, though I realise that many things could take two or even more release cycles to achieve. (The last set of features took 8 years, so I'm hoping to do this a little faster.)

Some of my 2ndQuadrant colleagues will also be committing themselves to the project, and we hope to work with the community in the normal way to create new features. I mention this only to say that major skills and resources will be devoted to this for the next release(s), not that this is a private project.

Some people have talked about the need for "multi-master replication", whereby 2+ databases communicate changes to one another. This topic has been discussed in some depth in Computer Science academic papers, most notably "The Dangers of Replication and a Solution" by the late Jim Gray. I've studied this further, to the point where I have a mathematical model that allows me to predict our likely success from implementing it. Without meaning to worry you, MM replication alone is not a solution for large data or the general case. For the general case, single-master replication will continue to be the most viable option. For large and distributed data sets, some form of partitioning/sharding is required, simply because full multi-master replication just isn't viable at both volume and scale. So my take on this is that MM is desirable, but is not the only thing we need - we also need partial/filtered replication to make large systems practical. Hence I've been calling this the "Bi-Directional Replication" project. I'm aware that this paragraph alone requires lots of explanation, which I hope to provide both in writing and in person at the forthcoming developer conference.
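To make the partial/filtered idea a little more concrete, here is a minimal sketch in Python. It is purely illustrative and is not a proposed design: the names `Change`, `Node` and `subscribe`, the node and table names, and the table-level filter are all assumptions invented for this example, and conflict handling is ignored entirely.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Change:
    """One row-level change (hypothetical format, for illustration only)."""
    table: str
    key: int
    value: str


@dataclass
class Node:
    name: str
    data: dict = field(default_factory=dict)         # (table, key) -> value
    subscribers: list = field(default_factory=list)  # (peer node, set of tables)

    def subscribe(self, peer, tables):
        """Ship this node's local changes to `peer`, but only for `tables`."""
        self.subscribers.append((peer, set(tables)))

    def write(self, table, key, value):
        """A local write: apply it here, then ship it to interested peers."""
        change = Change(table, key, value)
        self.apply(change)
        for peer, tables in self.subscribers:
            if change.table in tables:  # the partial/filtered part
                peer.apply(change)

    def apply(self, change):
        # Received changes are applied but never forwarded again,
        # so a change cannot loop between nodes.
        self.data[(change.table, change.key)] = change.value


london, tokyo = Node("london"), Node("tokyo")
london.subscribe(tokyo, ["orders"])           # london -> tokyo: orders only
tokyo.subscribe(london, ["orders", "stock"])  # tokyo -> london: orders and stock

london.write("orders", 1, "new")   # replicated to tokyo
london.write("audit", 7, "login")  # filtered out, stays on london
tokyo.write("stock", 42, "low")    # replicated to london
```

Changes flow in both directions, but each direction carries only the tables that the receiving node asked for, so neither node has to hold a full copy of everything.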
My starting point for designs is to focus on a key aspect: massive change to the code base is not viable, and any in-core solution must look at minimally invasive changes. And of course, if it is in-core we must also add robust, clear code with reasonable performance that does not impede non-replication usage.

The use cases we will address are not focused on any one project or user. I've distilled these points so far from talking to a wide variety of users, from major enterprises to startups.

1. GEOGRAPHICALLY DISTRIBUTED - Large users require both High Availability, which necessitates multiple nodes, and Disaster Recovery, which necessitates geographically distributed nodes. So my focus is not on "clustering" in the sense of Hadoop or Oracle RAC, since those require additional technologies to provide DR; my aim is to arrive at a coherent set of technologies that provide all that we want. I'm aware that other projects *are* focused on clustering, so even more reason not to try to simultaneously reinvent the wheel.

2. COHERENT - With regard to coherence, I note this thinking is similar to the way that Oracle replication is evolving, where they have multiple kinds of in-core replication and various purchased technologies. We have a similar issue with regard to various external projects. I very much hope that we can utilise the knowledge, code and expertise of those other projects in the way we move forwards.

3. ONLINE UPGRADE - highly available distributed systems must have a mechanism for online upgrade, otherwise they won't stay HA for long. This challenge must be part of the solution, and incidentally should be a useful goal in itself.

4. MULTI-MASTER - the ability to update data from a variety of locations.

5. WRITE-SCALEABLE - the ability to partition data across nodes in a way that allows the solution to improve beyond the write rate of a single node.

Those are the basic requirements that I am trying to address.
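As a rough illustration of requirement 5 only, the sketch below routes each write to exactly one owning node by hashing its key. Nothing here comes from an actual design: `node_for_key`, `route_writes`, the node names and the modulo-hash scheme are assumptions chosen for brevity.

```python
import zlib


def node_for_key(key, nodes):
    """Pick the node that owns `key`, using a hash that is stable across runs."""
    return nodes[zlib.crc32(key.encode("utf-8")) % len(nodes)]


def route_writes(keys, nodes):
    """Group keys by owning node, so each node sees only its share of writes."""
    per_node = {node: [] for node in nodes}
    for key in keys:
        per_node[node_for_key(key, nodes)].append(key)
    return per_node


nodes = ["node_a", "node_b", "node_c"]  # hypothetical node names
keys = ["customer:%d" % i for i in range(1000)]
routed = route_writes(keys, nodes)
```

Because every key has a single owner, the aggregate write rate can grow with the number of nodes rather than being capped by one node. Note that simple modulo hashing remaps most keys whenever the node count changes, so this says nothing about how repartitioning would be handled.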
There are a great many important details, but the core of this is probably what I would call "logical replication", that is, shipping changes to other nodes in a way that does not tie us to the same physical representation that recovery/streaming replication does now. Of course, non-physical replication can take many forms.

The assumption of consistency across nodes is considered optional at this point, and I hope to support both eagerly consistent and eventually consistent approaches.

I'm aware that this is a broad topic and many people will want input on this, and I am also aware that it will take much time. This post is more about announcing the project than discussing specific details.

My strategy for doing this is to come up with some designs and prototypes of a few things that might be the best way forwards. By building prototypes we will more quickly be able to address the key questions before us. So there is currently work under way on research-based development, to allow wider discussion based upon something more than just whiteboards. I'll be the first to explain things that don't work.

I also very much agree that "one size fits all" is the wrong strategy. So there will be implementation options and parameters, and possibly even multiple replication techniques.

I will also be organising a small-medium sized "Future of In-Core Replication" meeting in Ottawa on Wed 16 May, 6-10pm. To avoid this becoming an unworkably large meeting, attendance will be limited, but it is open to highly technical PostgreSQL users who share these requirements, any attendee of the main developer's meeting who wishes to attend, and other developers working on PostgreSQL replication/related topics. That will also allow me to order enough pizza for everyone. I'll send out private invites to people whom I know (no spam) and I think may be interested, but you are welcome to email me to get access. (This will take me a day or two, so don't ping me back if you didn't get your invite.)
I'm going to do my best to include the right set of features for the majority of people, all focused on submissions to PostgreSQL core, not any external project.

Best Regards

--
Simon Riggs                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)