Hi, Kristian!
> >> API, eliminating lots of class definitions and accessor functions.
> >> Though arguably it wouldn't really simplify the API, as the
> >> complexity would just be in understanding the THD class.
> >>
> >> For now, the API is proposed without exposing the THD class.
> >> (Similar
Sergei Golubchik writes:
> Hi, Kristian!
Hi, thanks for your comments! A couple of questions inline, and some
comments/thoughts.
> On Jun 24, Kristian Nielsen wrote:
>> At the implementation level, a lot of the work is basically to pull
>> out all of the needed information from the THD object/
Hi, Kristian!
On Jun 24, Kristian Nielsen wrote:
>
> ---
> High-Level Specification
>
> Generators and consumers
> -
>
> We have the two concepts:
>
> 1. Event _generators_, that produce events describing
Kristian Nielsen writes:
> 1. Event generators and consumers. This is what Sergei discussed. The
> essentials of this layer are hooks in handler::write_row() and similar places
> that provide data about changes (row values for row-based replication, query
> texts for statement-based replication,
Alex Yurchenko writes:
> On Wed, 19 May 2010 15:05:55 +0200, Sergei Golubchik
> wrote:
>>
>> Yes, it only describes how the data get to the redundancy service, but
>> not what happens there. I intentionally kept the details of redundancy
>> out, to be able to satisfy a wide range of different i
Hi!
On Wed, 19 May 2010 15:05:55 +0200, Sergei Golubchik
wrote:
>
> Yes, it only describes how the data get to the redundancy service, but
> not what happens there. I intentionally kept the details of redundancy
> out, to be able to satisfy a wide range of different implementations.
>
> For exa
Hi, Alex!
On May 13, Alex Yurchenko wrote:
> On Thu, 13 May 2010 16:36:41 +0200, Sergei Golubchik
> wrote:
>
> > * there's no explicit global transaction ID here, but I presume
> > there can be a filter that adds it to events. That would work, as
> > long as replication decides on the commit
Hi, Stewart!
On May 14, Stewart Smith wrote:
>
> On Thu, 13 May 2010 16:36:41 +0200, Sergei Golubchik
> wrote:
> > * not everything can be replicated at every level, for example table
> >   creation cannot be replicated row-based, InnoDB changes cannot be
> >   replicated with "MyISAM pwrite
On Thu, 13 May 2010 16:36:41 +0200, Sergei Golubchik wrote:
> * not everything can be replicated at every level, for example table
>   creation cannot be replicated row-based, InnoDB changes cannot be
>   replicated with "MyISAM pwrite()" events
We're doing it as kinda row based ("structure ba
Hi!
On Thu, 13 May 2010 16:36:41 +0200, Sergei Golubchik
wrote:
> I may still use words "master" and "slave" below, in the sense that the
> part of the code that takes the changes generated by local clients and
> sends them out can be called "master" and the part of the code that
> receives the
Hi, Alex!
Continuing the old discussion...
On Jan 22, Alex Yurchenko wrote:
>
> 1) It is time to drop MASTER/SLAVE mentality. This has nothing to do
> with replication per se. For example multi-master Galera cluster is
> turned into master-slave simply by directing all writing transactions
> to
Hi!
> "Henrik" == Henrik Ingo writes:
Henrik> Hi Kristian
Henrik> I don't know why I'm reading this on a Sunday morning, but just a
Henrik> comment without thinking much:
Henrik> On Fri, Apr 30, 2010 at 10:32 PM, Kristian Nielsen
Henrik> wrote:
>> I was thinking about this idea of releasi
Hi Kristian
I don't know why I'm reading this on a Sunday morning, but just a
comment without thinking much:
On Fri, Apr 30, 2010 at 10:32 PM, Kristian Nielsen
wrote:
> I was thinking about this idea of releasing row locks early. Specifically
> about two scenarios: 1) releasing row locks early b
MARK CALLAGHAN writes:
> As a further optimization, I want a callback that is called after the
> binlog entries are written for a transaction and before the wait for
> group commit on the fsync is done. That callback will be used to
> release row locks (optionally) held by the transaction.
I was
On Fri, Apr 23, 2010 at 2:35 AM, Kristian Nielsen
wrote:
> MARK CALLAGHAN writes:
>
>> This is a really long thread so a summary elsewhere would be great for
>> people like me.
>
> I agree that the discussion has become quite long. I summarised the group
> commit part of it on my blog:
>
> htt
MARK CALLAGHAN writes:
> This is a really long thread so a summary elsewhere would be great for
> people like me.
I agree that the discussion has become quite long. I summarised the group
commit part of it on my blog:
http://kristiannielsen.livejournal.com/12254.html
http://kristianniel
This is a really long thread so a summary elsewhere would be great for
people like me.
I think Alex mentioned that he needs the commit protocol to be changed
so that the binlog/commit-log/commit-service/redundancy-service
guarantees commit and the storage engine does not. If that is the
case, the
Alex Yurchenko writes:
> On Mon, 29 Mar 2010 00:02:09 +0200, Kristian Nielsen
> wrote:
> The way I understood the above is that global mutex is taken in InnoDB
> prepare() solely to synchronize binlog and InnoDB commits. Is that so? If
Yes.
> it is, then it is precisely the thing we want to a
On Mon, 29 Mar 2010 00:02:09 +0200, Kristian Nielsen
wrote:
> Alex Yurchenko writes:
>
>> On Thu, 18 Mar 2010 15:18:40 +0100, Kristian Nielsen
>> wrote:
>
>> Hm, how is it different from how it is done currently in MariaDB? Does
>> txn_commit() have to follow the same order as txn_prepare()? I
Robert Hodges writes:
> In fact, you could summarize 2-6 as making the binlog (whether written to
> disk or not) into a consistent "database" that you can move elsewhere and
> apply without having to add extra metadata, such as global IDs or table
> column names. Currently we have to regenerate
Alex Yurchenko writes:
> On Thu, 18 Mar 2010 15:18:40 +0100, Kristian Nielsen
> wrote:
> Hm, how is it different from how it is done currently in MariaDB? Does
> txn_commit() have to follow the same order as txn_prepare()? If not, then
> the commit ordering imposed by redundancy service should
On Tue, 23 Mar 2010 13:03:34 +0200, Henrik Ingo
wrote:
> On Tue, Mar 23, 2010 at 10:40 AM, wrote:
>>> At least the application of replicated transactions certainly should
>>> not be part of each storage engine. From the engine point of view,
>>> applying a set of replicated transactions should b
On Tue, 23 Mar 2010 10:12:53 +0200, Henrik Ingo
wrote:
> Meta discussion first, replication discussion below :-)
I guess we can consider meta-discussion closed for now unless someone
wants to add to it. I'm content ;)
>>
>>> So those are the requirements I could derive from having NDB use our
>
On Tue, Mar 23, 2010 at 10:40 AM, wrote:
>> At least the application of replicated transactions certainly should
>> not be part of each storage engine. From the engine point of view,
>> applying a set of replicated transactions should be "just another
>> transaction". For the engine it should not
Quoting Henrik Ingo:
Meta discussion first, replication discussion below :-)
On Mon, Mar 22, 2010 at 4:41 PM, Alex Yurchenko
wrote:
Uh, I'm not sure I can accept this proposition. At least it seems
contradictory to MariaDB's vision of being a practical, user and
customer driven, database.
Meta discussion first, replication discussion below :-)
On Mon, Mar 22, 2010 at 4:41 PM, Alex Yurchenko
wrote:
>> Uh, I'm not sure I can accept this proposition. At least it seems
>> contradictory to MariaDB's vision of being a practical, user and
>> customer driven, database.
>
> I do understand
On Sat, 20 Mar 2010 13:52:47 +0200, Henrik Ingo
wrote:
> On Wed, Mar 17, 2010 at 9:01 PM, Alex Yurchenko
> wrote:
>> The problem is that you cannot really design and program by use cases,
>> unorthodox as it may sound. You cannot throw an arbitrary bunch of use
>> cases as input and get code as o
On Mon, Mar 22, 2010 at 2:47 AM, Alex Yurchenko
wrote:
> Notice however possible many-to-1 relation between redundancy plugins and
> RS and therefore - global transaction ID. So I'd suggest that a unit other
> than redundancy plugin would maintain this mapping.
Alternatively, a redundancy plugin
On Fri, 19 Mar 2010 13:16:30 +0100, Kristian Nielsen
wrote:
> Alex Yurchenko writes:
>
>> This is an interesting option indeed. However
>> 1) the mapping itself should be durable, so every plugin must design the
>> way to recover it in the case of crash.
>> 2) global transaction ID better be int
On Wed, Mar 17, 2010 at 9:01 PM, Alex Yurchenko
wrote:
> The problem is that you cannot really design and program by use cases,
> unorthodox as it may sound. You cannot throw an arbitrary bunch of use
> cases as input and get code as output (that is in a finite time and of
> finite quality). Wheth
Alex Yurchenko writes:
> Yes, the idea of this model is that the main purpose of redundancy is
> durability, which, depending on a plugin, can be of much higher degree than
> flush to disk (e.g. binlog to a remote machine with the point-in-time
> recovery ability).
Yes.
> There is a subtle mome
On Thu, 18 Mar 2010 15:18:40 +0100, Kristian Nielsen
wrote:
> Alex, I think this discussion is getting really interesting, and I understand
> your points much better now, thanks for your many comments!
Glad to hear that. It'd be very sad otherwise ;)
>
> Ok, so I think this means that the re
Alex, I think this discussion is getting really interesting, and I understand
your points much better now, thanks for your many comments!
Alex Yurchenko writes:
> On Wed, 17 Mar 2010 10:48:50 +0100, Kristian Nielsen
> wrote:
>> So what is not clear to me is how the IDs get assigned to an RS, a
Hi Ingo!
Your e-mail is totally relevant and I have almost nothing there to respond
to in particular - it's all as you say, I have no essential remarks. Instead
I want to respond to it as a whole, thus I'll omit a lengthy quote; suffice it
to say that it is a direct response.
The problem is that you canno
On Wed, 17 Mar 2010 10:48:50 +0100, Kristian Nielsen
wrote:
>>> Can you give an example of what an "RS History" would be? It was not 100%
>>> clear to me.
>>
>> It is the sequence of changes that happens to RS. Like UPDATE t1 WHERE...;
>> INSERT INTO t2 VALUES...; etc. Perhaps you could hint
On Tue, Mar 16, 2010 at 7:32 AM, Alex Yurchenko
wrote:
> I think "a cluster that outwards presents a consistent transactional view,
> yet internally does not have a total ordering on transactions" is an
> internally contradictory concept. Suppose node1 committed T1, but not T2
> yet, and node2 com
Alex Yurchenko writes:
> On Mon, 15 Mar 2010 12:29:14 +0100, Kristian Nielsen
> wrote:
>>> One possible implementation for that can be (UUID, long long) pair.
>>
>> How is this different from (server_id, group_id)? (I'd like to understand).
>
> It is different in that UUID in that proposal i
On Tue, 16 Mar 2010 13:20:40 +0100, Kristian Nielsen
wrote:
> Alex Yurchenko writes:
>
>> On Mon, 15 Mar 2010 10:57:41 +0100, Kristian Nielsen
>> wrote:
>
>>> What I am wondering at the moment is if the concept of global transaction ID
>>> should be a part of the new API, or if it is re
Alex Yurchenko writes:
> On Mon, 15 Mar 2010 10:57:41 +0100, Kristian Nielsen
> wrote:
>> What I am wondering at the moment is if the concept of global transaction ID
>> should be a part of the new API, or if it is really an implementation detail
>> of the redundancy service.
> I'd go
On Mon, 15 Mar 2010 12:29:14 +0100, Kristian Nielsen
wrote:
>> 2) It correctly identifies that (the part of) ID should be a monotonic
>> ordinal number.
>
> Ok. But should it increase in order of transaction start? Or in order of
> transaction commit?
I think this is easy: until transaction is
Hi Kristian,
I agree with Alex's response, and I'll hopefully pick up all the
remaining questions to answer here.
Quoting Kristian Nielsen:
So the basis for such an interface would be the ability to install hooks to be
called with row data for every handler::write_row(), handler::update_
On Mon, 15 Mar 2010 10:57:41 +0100, Kristian Nielsen
wrote:
>
> Right.
>
> So if I understand you correctly, with "internal implementation details" we do
> not mean just that the APIs expose internals of the SQL server which we want
> to shield plugins from. Rather, the way the interface is
Robert Hodges writes:
> First of all, we Continuent Tungsten folk have a certain set of problems we
> solve with replication. Here are the key use cases:
> 3. Replicating heterogeneously between MySQL and other databases like Oracle.
> This requires the ability to filter and transform data easil
Alex Yurchenko writes:
> The global transaction ID is a cornerstone concept of any replication
> system which aspires to be pluggable, extensible and go beyond basic
> master-slave. It is hardly possible to even start designing the rest of the
> API without first settling on global transaction I
Alex Yurchenko writes:
> On Mon, 25 Jan 2010 13:55:44 +0100, Kristian Nielsen
> wrote:
>>
>> I think it would be useful if you explained what the problems are with that
>> interface, in your opinion.
> This interface does not seem to improve anything about how redundancy is
> achieved in MyS
I have been occupied with conferences/travel, and will be so next week
also. But I still wanted to kick off the next part of this thread.
I would like to introduce the perspective of what current replication users
are missing the most from the current implementation.
I found this blog post (by
Hi!
On Mon, 1 Feb 2010 11:06:22 +0100, Sergei Golubchik
wrote:
> Hi, Alex!
>
> On Jan 27, Alex Yurchenko wrote:
>>
>> I'll take this opportunity to put forth some theory behind the global
>> transaction IDs as we see it at Codership.
>>
>> 1. We have an abstract set of data subject to replicat
Hi Seppo,
Thanks, that was my assumption as well, but life tends to be a little more
complex than theory.
Cheers, Robert
On 1/30/10 1:10 AM PST, "seppo.jaak...@codership.com"
wrote:
> Hi Robert,
>
>>> Tungsten consistency checking technology works very well, and there
>>> is no need to "fix it"
Hi Robert,
Tungsten consistency checking technology works very well, and there
is no need to "fix it" in any way. However, this method is not directly
usable for multi master replication, because the target node(s) may
have committed some transactions not yet seen in the originating
master node,
Hi Seppo,
On 1/29/10 4:36 AM PST, "seppo.jaak...@codership.com"
wrote:
> Thanks Robert, this is comprehensive enough :)
>
> I'll just address the consistency checking requirement here,
> as I believe this is quite widely accepted goal as well.
>
> Tungsten uses a special consistency table for
Thanks Robert, this is comprehensive enough :)
I'll just address the consistency checking requirement here,
as I believe this is quite widely accepted goal as well.
Tungsten uses a special consistency table for passing consistency
checking information and which is treated in a special way in the
Hi Kristian,
Thanks for kicking this thread off. I have had a bit of a busy week so it
has taken a while to get around to summarizing Continuent thoughts on
improvements.
First of all, we Continuent Tungsten folk have a certain set of problems we
solve with replication. Here are the key use ca
On Sun, 24 Jan 2010 14:27:05 -0800, MARK CALLAGHAN
wrote:
> On Fri, Jan 22, 2010 at 6:21 AM, Kristian Nielsen
> wrote:
>
>> Let the discussion begin!
>
> The global transaction ID project done by Justin at Google is worth
> reviewing. In addition to supporting automated slave failover it also
>
On Mon, 25 Jan 2010 11:51:24 -0800, Jeremy Zawodny
wrote:
> On Mon, Jan 25, 2010 at 11:47 AM, Alex Yurchenko <
> alexey.yurche...@codership.com> wrote:
>
>> On Mon, 25 Jan 2010 10:47:23 -0800, Jeremy Zawodny
>> wrote:
>> > If the connection between a slave and master is interrupted, the slave
>>
Yup; that's the ideal way for WAN based replication .. But for LAN;
there is no point other than adding extra overhead
--
Thanks
Venu
Sent from iPhone
On Jan 25, 2010, at 11:47 AM, Alex Yurchenko wrote:
On Mon, 25 Jan 2010 10:47:23 -0800, Jeremy Zawodny
wrote:
If the connection between
On Mon, Jan 25, 2010 at 11:47 AM, Alex Yurchenko <
alexey.yurche...@codership.com> wrote:
> On Mon, 25 Jan 2010 10:47:23 -0800, Jeremy Zawodny
> wrote:
> > If the connection between a slave and master is interrupted, the slave won't
> > report itself as being "behind" until the slave's networ
On Mon, 25 Jan 2010 10:47:23 -0800, Jeremy Zawodny
wrote:
> If the connection between a slave and master is interrupted, the slave won't
> report itself as being "behind" until the slave's network timeout to the
> master expires and it reconnects (assuming it can).
>
> Jeremy
Hi Jeremy,
Does i
On Jan 25, 2010, at 12:44 PM, Jeremy Zawodny wrote:
+1 on a replication heartbeat w/configurable heartbeat time. We'd
want to issue a heartbeat pulse every 1 sec in our environment to
measure *true* replication latency.
Jeremy
This is in MySQL 5.5 already. See MASTER_HEARTBEAT_PERIOD at
If the connection between a slave and master is interrupted, the slave won't
report itself as being "behind" until the slave's network timeout to the
master expires and it reconnects (assuming it can).
Jeremy
On Mon, Jan 25, 2010 at 10:44 AM, wrote:
> Hi Jeremy,
>
> Thanks for the input. I'm not
Hi Jeremy,
Thanks for the input. I'm not sure what you mean with true replication
latency here.
Anyway, the replication system can internally measure latencies
from the point where the replication event was passed for replication
until it was received/applied in the receiving end. And these laten
+1 on a replication heartbeat w/configurable heartbeat time. We'd want to
issue a heartbeat pulse every 1 sec in our environment to measure *true*
replication latency.
Jeremy
On Fri, Jan 22, 2010 at 6:21 AM, Kristian Nielsen
wrote:
> The three companies Continuent, Codership, and Monty Program
On Mon, 25 Jan 2010 13:55:44 +0100, Kristian Nielsen
wrote:
>
> I think it would be useful if you explained what the problems are with that
> interface, in your opinion.
>
Let me start by saying that I'm not that familiar with the current MySQL
replication code and may not be qualified to judge
Alex Yurchenko writes:
> On Fri, 22 Jan 2010 21:14:29 +0100, Kristian Nielsen
> wrote:
>> http://forge.mysql.com/wiki/ReplicationFeatures/ReplicationInterface
> U. I am of rather poor opinion of that interface. It is not breaking
> with bad traditions at all. I'm not sure if it is high
On Fri, Jan 22, 2010 at 6:21 AM, Kristian Nielsen
wrote:
> Let the discussion begin!
The global transaction ID project done by Justin at Google is worth
reviewing. In addition to supporting automated slave failover it also
has options to make slave state crash-proof and add binlog event
checksum
On Fri, 22 Jan 2010 21:14:29 +0100, Kristian Nielsen
wrote:
> Serg pointed me to this page, which is an early description of a plugin API
> for replication that the replication team at MySQL has implemented:
>
> http://forge.mysql.com/wiki/ReplicationFeatures/ReplicationInterface
>
U. I
Hi, Alex!
On Jan 22, Alex Yurchenko wrote:
>
> So when refactoring replication code and API we suggest to think of
> replication as of redundancy service and establish a general API for
> such service that can be utilized by different implementations with
> different qualities of service. In othe
Alex Yurchenko writes:
> So when refactoring replication code and API we suggest to think of
> replication as of redundancy service and establish a general API for such
> service that can be utilized by different implementations with different
> qualities of service. In other words - make a whole
Hi,
Kristian, thanks for starting this discussion. I'm glad you mentioned the
need to improve replication APIs. Hereby I will present some points which
we at Codership found to be essential to the success of the project. These
are not technical requirements, but more conceptual suggestions
pe
The three companies Continuent, Codership, and Monty Program are planning to
start working on some enhancements to the replication system in MariaDB,
together with anyone interested in joining in.
At this stage, there are no fixed directions for the project, and to do this
in as open a way possibl