Changeset: 58f78af0693f for MonetDB
URL: http://dev.monetdb.org/hg/MonetDB?cmd=changeset;node=58f78af0693f

Removed Files:
	monetdb5/modules/mal/replication.mx

Modified Files:
	tools/merovingian/ChangeLog.Jul2012
	tools/merovingian/client/Makefile.ag
	tools/merovingian/client/monetdb.1
	tools/merovingian/daemon/Makefile.ag
	tools/merovingian/daemon/forkmserver.c
	tools/merovingian/daemon/merovingian.c
	tools/merovingian/utils/properties.c

Branch: default

Log Message:
Merged from Jul2012

diffs (truncated from 1679 to 300 lines):

diff --git a/monetdb5/modules/mal/replication.mx b/monetdb5/modules/mal/replication.mx
deleted file mode 100644
--- a/monetdb5/modules/mal/replication.mx
+++ /dev/null
@@ -1,1490 +0,0 @@
-@/
-The contents of this file are subject to the MonetDB Public License
-Version 1.1 (the "License"); you may not use this file except in
-compliance with the License. You may obtain a copy of the License at
-http://www.monetdb.org/Legal/MonetDBLicense
-
-Software distributed under the License is distributed on an "AS IS"
-basis, WITHOUT WARRANTY OF ANY KIND, either express or implied. See the
-License for the specific language governing rights and limitations
-under the License.
-
-The Original Code is the MonetDB Database System.
-
-The Initial Developer of the Original Code is CWI.
-Portions created by CWI are Copyright (C) 1997-July 2008 CWI.
-Copyright August 2008-2012 MonetDB B.V.
-All Rights Reserved.
-@
-
-@f replication
-
-@c
-/*
- * @a Martin Kersten
- * @v 1.0
- * @+ Database replication
- * MonetDB supports a simple database replication scheme using a master-slave
- * protocol. A master node keeps a log of all SQL updates for replay.
- * Once a slave starts, the master establishes
- * a MAL-client connection to the slave and starts pumping the backlog
- * of committed transactions.
- * The master does not take any responsibility for the integrity of a slave.
- * The master may, however, decide to suspend
- * forwarding updates to prepare for e.g. administration or shutdown.
- *
- * It is the slave's responsibility to be resilient against duplicate
- * transmission of the MAL-update backlog. A transaction id
- * can be given to catch up from transactions already replayed.
- * Transaction ids before the minimum available in the log
- * directory lead to freezing the slave. Then rebuilding from
- * scratch is required.
- *
- * The replication scheme does not support SQL schema modifications.
- * Instead, the slaves should be initialized with a complete copy
- * of the master schema and the database.
- *
- * Turning an existing database into a master and creating a single
- * slave works as follows.
- *
- * step 1) Turn the database into a replication master by setting its
- * "master" property to true using monetdb(1). This property is translated
- * by merovingian(1) into the database variable "replication_master" and is
- * set upon database (re)start. Note that this setting cannot be added to a
- * running database.
- *
- * step 2) Create a dump of the master database using the msqldump(1) tool.
- *
- * step 3) To initiate a slave, simply load the master snapshot.
- *
- * step 4) Run monetdb(1) to turn the database into a slave by setting its
- * "slave" property to the URI of the master.
- * The precise URI can be obtained by issuing the command
- * 'mclient -lmal -dmaster -s"u := master.getURI(); io.printf(\"%s\n\", u);"'
- * on the master.
- * The slave property is translated by merovingian(1) into the database
- * variable "replication_slave" and is set upon database (re)start. Note
- * that this setting cannot be added to a running database.
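- *
- * Whether the two variables took effect can be checked from SQL after the
- * restart. A minimal sketch, assuming the settings are exposed through
- * sys.env() like other server variables:
- * @verbatim
- * SELECT name, value
- *   FROM sys.env()
- *  WHERE name IN ('replication_master', 'replication_slave');
- * @end verbatim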
- *
- * The slave starts synchronizing with the master automatically upon each
- * session restart.
- * A few SQL wrapper procedures and functions can be used to control it
- * manually. For example, the slave can temporarily suspend receiving log
- * replays using suspendSync() and reactivate it afterwards with
- * resumeSync().
- * A resumeSync() is also needed if you create a relation already known by
- * the master, for the master could already have sent updates for it; due
- * to the unavailability of the target table, the slave closed the log
- * stream.
- *
- * The function freezeSlaves() removes the log files and makes sure that all
- * existing slaves won't be able to catch up other than by re-initializing
- * the database using e.g. a checkpoint.
- * @verbatim
- * CREATE PROCEDURE suspendSync() EXTERNAL NAME slave."stop";
- * CREATE PROCEDURE resumeSync() EXTERNAL NAME slave."sync";
- * CREATE FUNCTION synchronizing() RETURNS boolean EXTERNAL NAME slave."synchronizing";
- *
- * CREATE PROCEDURE freezeSlaves() EXTERNAL NAME master."freeze";
- * CREATE PROCEDURE suspendSlaves() EXTERNAL NAME master."stop";
- * CREATE PROCEDURE resumeSlaves() EXTERNAL NAME master."start";
- * CREATE FUNCTION master() RETURNS string EXTERNAL NAME master."getURI";
- * CREATE FUNCTION cutOffTag() RETURNS string EXTERNAL NAME master."getCutOffTag";
- * @end verbatim
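- *
- * Once these wrappers have been created, a slave-side maintenance session
- * could look as follows (a sketch restricted to the slave-side wrappers
- * defined above):
- * @verbatim
- * SELECT synchronizing();  -- is the slave currently replaying the log?
- * CALL suspendSync();      -- pause log replay, e.g. before bulk maintenance
- * CALL resumeSync();       -- re-establish synchronization with the master
- * @end verbatim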
- *
- * It is possible to make a slave database also a master for descendants.
- * In such a situation the database carries both a master and a slave
- * property. Such a scheme allows one to employ hierarchical replication,
- * or to have additional tables available in the replication stream. Note
- * that at this point replication from multiple masters, e.g. to combine a
- * full set from a collection of partitioned masters, is not yet possible.
- *
- * Beware, turning off the "master" property leads to automatic removal of
- * all left-over log files. This renders the master database unusable for
- * replication. The state of the slaves becomes frozen.
- * To restore replication in such a case, both master and
- * slaves have to be reinitialised using the aforementioned steps.
- *
- * @- Behind the scenes
- * When the replication_master environment is set, an optimizer
- * becomes active to look after updates on SQL tables and to prepare
- * for producing the log files. The snippet below illustrates the
- * modifications made to a query plan.
- *
- * @verbatim
- * function query():void
- *     master:= "mapi:monetdb://gio.ins.cwi.nl:50000/dbmaster";
- *     fcnid:= master.open();
- *     ...
- *     sql.append("schema","table","col",b:[:oid,:int]);
- *     master.append("schema","table","col",b,fcnid);
- *     ...
- *     t := mtime.current_timestamp();
- *     master.close(fcnid,t);
- * end query;
- * @end verbatim
- *
- * At runtime this leads to buffers being filled with the statements
- * required for the slaves to catch up.
- * Each query block is stored in its own buffer and sent at
- * the end of the query block. This separates the concurrent
- * actions on the database at the master and leads to a serial
- * execution of the replication operations within the slave.
- *
- * The log records are stored in a file "dbfarm/db/master/log%d-%d" with the
- * following structure:
- * @verbatim
- * function slave.tag1(transactionid:int,stamp:timestamp);
- *     barrier doit:= slave.open(transactionid);
- *     sql.transaction();
- *     tag1_b := bat.new(:oid,:int);
- *     ...
- *     bat.insert(tag1_b,3:oid,232:int); #example update
- *     ...
- *     sql.append("schema","table","col",tag1_b);
- *     slave.close(transactionid,stamp);
- *     sql.commit();
- *     exit doit;
- * end tag1;
- * slave.tag1(1,"2009-09-03 15:49:45.000":timestamp);
- * slave.drop("tag1");
- * @end verbatim
- *
- * The slave.open() simply checks the replica log administration table
- * and ignores duplicate attempts to roll the database forward.
- *
- * The operations are executed in the same serial order as on the master,
- * which should lead to the same optimistic transactional behavior.
- * All queries are considered to run in auto-commit mode, because
- * the SQL frontend does not (yet) provide the hook for better transaction
- * boundary control.
- * The transaction identifier is part of the call to the function
- * with the transaction update details.
- *
- * @- Interaction protocol
- * The master node simply waits for a slave to request the transmission of
- * the missing log files.
- * The request includes the URI of the slave and the user credentials
- * needed to establish a connection.
- * The last parameter is the last known transaction id successfully
- * re-executed.
- * The master forks a thread to start flushing the backlog files.
- *
- * Grouping the operations in temporary MAL functions
- * makes it easy to skip their execution when we detect
- * that they have been executed before.
- *
- * @- Log file management
- * The log records are grouped into separate files.
- * They are the units for re-submission and the scheme is set up to be
- * idempotent.
- * A slave always starts synchronizing using the maximal tag stored in the
- * slave log.
- *
- * The log files ultimately pollute your database and have to
- * be (re)moved. This is considered a responsibility of the DBA,
- * for it involves making a checkpoint or securely storing the logs
- * in an archive. It can be automated by asking all slaves
- * for their last transaction id and purging all obsolete files.
- *
- * Any error recognized during the replay should freeze the slave,
- * because the synchronization integrity might have become compromised.
- *
- * Aside from being limited to auto-commit transactions, the current
- * implementation scheme has a hole. The log record is written just
- * before transaction commit, including the activation call.
- * The call and the flush of the commit record to the SQL
- * log should be one atomic action, which amounts to a commit
- * sequence over two 'databases'. It can only be handled when
- * the SQL commit becomes visible at the MAL layer.
- * [ Or, inject the transaction approval record into the log file
- * when the next query starts, checking for any transaction
- * errors first. ]
- *
- * COPY INTO commands cause the master to freeze the images of
- * all slaves, for capturing the input file and forwarding it to
- * the slaves seems overly complicated.
- *
- * The slave invalidation scheme is rather crude. The log directory
- * is emptied and a new log file is created. Subsequent attempts
- * by the slaves to access transaction ids from before the invalidation
- * are flagged as errors.
- *
- * @- Wishlist
- * After setting the slave property, it could initiate full synchronization
- * by asking for a catalog dump and replaying the logs, provided they
- * have been kept around since the start.
- * Alternatively, we can use the infrastructure for Octopus to pull the
- * data from the master.
- * For both we need msqldump functionality in the SQL code base.
- *
- * A slave property can be set to a list of masters, which turns the
- * slave into a server for multiple sources. This calls for splitting
- * the slave log.
- *
- * The tables in the slave should be set read-only, otherwise we
- * have to double-check integrity and bail out of replication on violation.
- * One solution is to store the replicated database in its own
- * schema and grant read access to all users.
- * [ show example how to set up; see the sketch below ]
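- *
- * A minimal sketch of that solution; the schema and table names are
- * invented for the illustration:
- * @verbatim
- * CREATE SCHEMA replica;
- * -- after the replicated tables have arrived in schema "replica":
- * GRANT SELECT ON replica.orders TO PUBLIC;
- * @end verbatim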
- *
- * A validation script (or database diff) might be helpful to
- * assess the database content for possible integrity violations.
- */
-@mal
-module master;
-
-command open():oid
-address MASTERopen
-comment "Create a replication record";
-
-command close(tag:oid):void
-address MASTERclose
-comment "Close the replication record";
-
-command start():void
-address MASTERstart
-comment "Restart synchronisation with the slaves";
-
-command stop():void
-address MASTERstop
-comment "Stop synchronisation of the slaves";
-
-command freeze():void
-address MASTERfreeze
-comment "Invalidate all copies maintained at slaves";
-
-pattern append(mvc:ptr, s:str, t:str, c:str, :any_1, tag:oid):ptr
-address MASTERappendValue
-comment "Dump the scalar on the MAL log";
-
-pattern append(mvc:ptr, s:str, t:str, c:str, b:bat[:oid,:any_1], tag:oid):ptr
-address MASTERappend
-comment "Dump the BAT on the MAL log";
-
-pattern delete(s:str, t:str, b:bat[:oid,:any_1], tag:oid):void
-address MASTERdelete
-comment "Dump the BAT with deletions on the MAL log";
-
-pattern copy(sname:str, tname:str, tsep:str, rsep:str, ssep:str, ns:str, fname:str, nr:lng, offset:lng, tag:oid):void
-address MASTERcopy
-comment "A copy command leads to invalidation of the slave's image. A dump restore will be required.";
-
-pattern replay(uri:str, usr:str, pw:str, tag:oid):void
-address MASTERreplay
-comment "Slave calls the master to restart sending the missing transactions
-from a certain point as a named user.";
-
-command sync(uri:str, usr:str, pw:str, tag:oid):void
-address MASTERsync
-comment "Login to slave with credentials to initiate submission of the log records";
-
-command getURI():str
-address MASTERgetURI
-comment "Return the URI for the master";
-
-command getCutOffTag():oid
-address MASTERgetCutOffTag
-comment "Return the cutoff tag for transaction synchronization";
-
-command prelude():void
-address MASTERprelude
-comment "Prepare the server for the master role, or remove any leftover log files.";
-
-module slave;
-
-command sync():void
-address SLAVEsyncDefault
-comment "Login to master with environment credentials to initiate submission of the log records";
-command sync(uri:str):void
-address SLAVEsyncURI
-comment "Login to master with admin credentials to initiate submission of the log records";
-command sync(uri:str, usr:str, pw:str, tag:oid):void
-address SLAVEsync
-comment "Login to master uri with admin credentials to initiate submission of the log records";
-
-command stop():void
-address SLAVEstop
-comment "Slave suspends synchronisation with master";
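-
-# The uri variant of slave.sync above can be wrapped for SQL in the same
-# style as the wrapper procedures shown earlier. A sketch; this wrapper is
-# not part of the original module:
-#   CREATE PROCEDURE resumeSyncWith(uri string) EXTERNAL NAME slave."sync";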