Hello everyone,

Sorry for the delay in responding, but I had to be out of town for the last
4 days and didn't have access to my mail.

Here is a recap for those who are just joining:
I have multiple branch locations that need to have their transactions
rolled up into a common statistics database. Each branch will host a master
database replicating its transactions to a central server (multiple
slaves, one per branch). The common statistics database needs to be kept as
up-to-date as possible. The changes to the common statistics database will
be replicated back out in a hub-and-spoke pattern (master at the central
server, slaves at each branch).

What Garth and I have been discussing are ideas for synchronizing an
external "merge/process" agent with a slave's replication activity. By
coordinating with each slave, I can extract the new records from the
central office's slaves and add the processed data to the statistics
database in as close to real-time as possible.
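For concreteness, one leg of that topology might be configured like this (a
minimal sketch with made-up hostnames; the option names are the 4.x-era
ones, so double-check them against your server version):

```ini
# my.cnf on a branch master
[mysqld]
server-id = 2            ; must be unique across all servers
log-bin                  ; branch transactions go to the binlog

# my.cnf on the matching slave instance at the central server
[mysqld]
server-id = 102
master-host = branch2.example.com
master-user = repl
master-password = secret
log-slave-updates        ; replicated rows also land in this slave's binlog
```

One mysqld instance (on its own port and data directory) per branch slave at
the central site.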


> On Tue, 2004-04-13 at 20:06, [EMAIL PROTECTED] wrote:
> > On Tue, 2004-04-13 at 13:13, [EMAIL PROTECTED] wrote:
> [snip]
> > >> I could frequently poll the slave servers using SHOW SLAVE STATUS
> > >> or SHOW PROCESSLIST and watch for their status to change.
> > >>
> > >> I could start each slave with "--log-slave-updates" and watch the
> > >> slaves' binlogs for changes.
> > >>
> > >> I could watch for the creation/deletion of the slave's relay logs.
> >
> > > This seems to indicate that you are afraid of selecting rows on the
> > > slave that are in the middle of being updated from the master.  A
> > > single update statement is still atomic, so you don't need to poll
> > > log files to determine if an update statement has finished.
> >
> > > On the other hand, if there is some set of multiple updates and
> > > inserts that constitute a collection of data that you want to merge
> > > only when this collection is complete, you're better off finding a
> > > way to signal this through the database.  You could have the master
> > > lock the tables in question until it's finished, and then the
> > > program querying the slave knows that when it gets a read lock, it
> > > will see the full set of data.  You could also have a status column
> > > or a status table that has a flag letting the program on the slave
> > > side know when the data is ready.
> >
> > > If this is off the mark, maybe some example statements would help...
> >
> > [more snipping]
> >
> > Yes, I AM concerned about getting a partial update to the warehouse.
> > I know that transactions aren't logged until after they commit. If I
> > use transactional boundaries to post multi-table reports (for
> > instance: an invoice takes two tables, one for the base information
> > and one for the line items) into the branch masters, then they will
> > arrive intact and I won't corrupt the central slaves. (That's NOT the
> > issue I am worried about!)
> >
> > I am worried that if I start processing those new records from the
> > slave database to the warehouse before all of the records have been
> > processed from the relay logs (let's say I started trying to merge
> > records when I see the relay log being created), I could miss some
> > data (like the last few items on the invoice). That's why I am so
> > worried about not merging until the END of the cycle. I need to be
> > sure that everything has been committed to my central slave BEFORE I
> > can merge the latest updates with the warehouse database.
> >
> > I can lock a slave database so that it won't replicate in the middle
> > of my merging, so I know that if I can catch a slave when it goes
> > back to "sleep" ("Waiting for master to send event"), I would have a
> > complete set of data. I could use "--log-slave-updates" to copy the
> > updates to the slave's binlog and check that to see if I need to
> > merge records (if slave status is "waiting" and the binlog is not
> > "empty", then merge). Each merge could flush the binlog. However,
> > there was a post from someone using binlogs for a similar purpose
> > that said that for 4.1.x+ the binlogs vary in size after flushing, so
> > I don't know how reliable that would be as a check. How could I tell
> > when a binlog is empty?
>
> Using the binlog in this way will lead to a race condition.  What if
> another update comes in during the few milliseconds between your 'ready'
> check (an empty binlog and a 'waiting' status) and when you select those
> rows for processing?  Plus if MySQL does any buffering of its output to
> the binlog, you could be basing your check on seconds-old data, further
> aggravating the problem.
>
> The only solution I can think of that won't cause a race condition is to
> lock your tables, but your program running on the slave database would
> have to be able to connect to the master.  Your slave program would lock
> the necessary tables on the master side, wait to receive the lock, wait
> until the slave had caught up, do your merge, then release the lock.
> Likewise, the code updating tables on the master would need to lock the
> tables while they write.  You could make the process a little
> friendlier by creating a separate control table that the slave and
> master alternately lock, rather than locking all the tables you'll
> use.  That way
> processes on the master that just want to read the data don't have to
> wait for a lock.
>

Yes, my "merge" process could talk directly to my branch master databases.
If I wanted, I could even write the merge process to get the records
straight from the branch masters (and avoid all of this replication) but
that adds several live WAN connections, and I am faced with a new problem:
how would I detect changes to the master databases at each of my branches
without filling up my WAN pipes with monitoring queries?

I think I am better off letting the internal replication mechanics get the
database changes to me (in the middle) and processing them there.  I spent
some time after your last response going over the source code that covers
replication. If I read it correctly, the replication process is subject to
the same locking as any other client to the database. So, I believe that if
I lock the new records with FOR UPDATE, I should be able to avoid the race.
A less elegant solution would be to lock the entire database. The least
elegant control would be to use STOP SLAVE/START SLAVE every time I need to
process new records.
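As a sketch of that FOR UPDATE option (assuming InnoDB tables on the slave
so row locks apply; the table names and the id-based high-water mark below
are made-up illustrations, not anything from this thread), one merge pass
might issue statements in this order:

```python
# Hedged sketch of one merge cycle using SELECT ... FOR UPDATE on the
# central slave.  Because the replication SQL thread obeys the same
# locks as any other client, holding locks on the new rows keeps it
# from changing them mid-merge.  "invoices"/"invoice_items" and the
# last_seen_id high-water mark are illustrative assumptions.

def merge_cycle_statements(last_seen_id):
    """Return, in order, the SQL statements one merge pass would issue."""
    return [
        "BEGIN",
        # Lock and fetch the rows that arrived since the last pass.
        "SELECT * FROM invoices WHERE id > %d FOR UPDATE" % last_seen_id,
        "SELECT * FROM invoice_items WHERE invoice_id > %d FOR UPDATE"
            % last_seen_id,
        # ... merge the fetched rows into the statistics tables here ...
        "COMMIT",  # releases the locks; replication proceeds
    ]
```

The locks are held only for the duration of one transaction, so replication
stalls briefly instead of stopping outright.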

However, we have now returned to my original problem(s):
      a) How can I tell when a slave has started receiving records?
      b) How can I tell when a slave has just finished processing the last
record from its relay logs?
      c) How can I tell that both of those things (a) and b)) happened
since I last checked (assuming a really quick update and/or a monitoring
timer that is too slow to catch the slave in the middle of the act)? Even
if I am checking every 0.01 seconds, it could be possible to miss
something.
      d) Without keeping a 3rd copy of the data, how can I tell what kind
of slave update just happened? I believe that I have to use the slave's
binlog to know for certain, as the slave's relay log should be erased right
after processing. Any other ideas?
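For a) and b), one possibility worth hedging on: SHOW SLAVE STATUS reports
both how far the I/O thread has read into the master's binlog and how far
the SQL thread has executed, so comparing the two says whether the relay
log is fully applied. A sketch, with the status row modeled as a dict
(exact column names and capitalization vary across 4.x versions, so verify
against your server):

```python
# Hedged sketch: decide whether a slave is "caught up" from one row of
# SHOW SLAVE STATUS, passed in as a column-name -> value dict.

def slave_caught_up(status):
    """True when the SQL thread has executed everything the I/O thread
    has read, i.e. the relay log is fully applied."""
    return (status["Master_Log_File"] == status["Relay_Master_Log_File"]
            and status["Read_Master_Pos"] == status["Exec_Master_Log_Pos"]
            if False else
            status["Master_Log_File"] == status["Relay_Master_Log_File"]
            and status["Read_Master_Log_Pos"] == status["Exec_Master_Log_Pos"])
```

This only tells you the slave is idle *at the instant of the poll*; it does
not by itself close the race in c), which is why some form of locking still
seems necessary around the merge.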

Ideally I need to lock out changes to a slave as soon as possible after b)
or c) and apply the knowledge of d) to make the appropriate adjustments to
my statistics tables. Is anyone else doing this kind of thing? My fall-back
scenario is to lock down the slaves every 10 or 15 seconds and process any
changes then. However, I still have to deal with issue d).
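That 10-to-15-second fall-back is simple enough to sketch. Here `execute`
and `merge_new_rows` are stand-ins for the real database call and merge
logic (nothing from this thread); STOP SLAVE halts the SQL thread so the
data sits still while the merge runs:

```python
import time

POLL_SECONDS = 15  # how often to pause replication and merge

def lockdown_cycle(execute, merge_new_rows):
    """One fall-back pass: pause replication, merge, resume."""
    execute("STOP SLAVE")       # SQL thread halts; slave data is now stable
    try:
        merge_new_rows()        # process everything replicated so far
    finally:
        execute("START SLAVE")  # resume replication even if the merge fails

def run(execute, merge_new_rows):
    while True:
        lockdown_cycle(execute, merge_new_rows)
        time.sleep(POLL_SECONDS)
```

The try/finally matters: a merge failure that left the slave stopped would
silently freeze the whole pipeline.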

What would be an optimal combination of techniques I could use to
coordinate my slave databases and my statistics processing application?

Thank you all very much for your patience in this matter!!

Shawn



-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/[EMAIL PROTECTED]
