Hi Pavel.

To describe the setup a little better: the master replicates to a semi-sync 
slave, which then replicates to an async slave. This ensures that at any point 
in time both the master and the semi-sync slave have a complete copy of the 
data. If the master fails, the semi-sync slave is automatically promoted to 
master and the async slave switches to replicating from it with semi-sync 
replication. If the semi-sync slave fails, the async slave re-points itself to 
the master and switches to semi-sync.
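
The two failover rules above can be sketched as a small state transition. This 
is only an illustrative model, not the actual failover tooling: the `Topology` 
type, the `fail_over` function and the host names are hypothetical, and the 
handling of an async slave failure is an assumption, since the description 
does not cover that case.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Topology:
    """Roles in the chain: master -> semi-sync slave -> async slave."""
    master: str
    semi_sync: Optional[str]
    async_slave: Optional[str]


def fail_over(topo: Topology, failed: str) -> Topology:
    """Return the new role assignment after the node `failed` is lost."""
    if failed == topo.master:
        if topo.semi_sync is None:
            raise RuntimeError("no semi-sync slave available to promote")
        # The semi-sync slave is promoted; the async slave becomes
        # the new semi-sync slave of the promoted master.
        return Topology(topo.semi_sync, topo.async_slave, None)
    if failed == topo.semi_sync:
        # The async slave re-points to the master and switches to semi-sync.
        return Topology(topo.master, topo.async_slave, None)
    if failed == topo.async_slave:
        # Assumption: losing the async slave leaves the other roles unchanged.
        return Topology(topo.master, topo.semi_sync, None)
    raise ValueError(f"unknown node: {failed}")


if __name__ == "__main__":
    cluster = Topology("db1", "db2", "db3")  # hypothetical host names
    print(fail_over(cluster, "db1"))
    print(fail_over(cluster, "db2"))
```

In both described failure cases the cluster ends up as a master with a single 
semi-sync slave, which is the 2 node shape used for the reproduction.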


However, I don't think the third node has any bearing on the hang: I built a 
test cluster without it and the hang is still easy to reproduce. I just restore 
a decent-sized dump, in this case a portion of the Wikipedia database, and the 
cluster reliably hangs when the master begins writing to a new binlog.

The dump is here if someone wants to use it to reproduce: 
https://dumps.wikimedia.org/enwiki/latest/enwiki-latest-category.sql.gz


I have created a gist with the output of `SHOW STATUS LIKE 'Rpl_semi_sync%'` 
on both the master and the slave of the simplified 2 node setup. I have also 
included the binlogs of both the master and the slave, and the relay log on the 
slave.

https://gist.github.com/josephglanville/70789bc9c3744090a17070652cded68b


Let me know if there is any other useful information I can provide.



Joseph.

________________________________
From: Pavel Ivanov <piva...@google.com>
Sent: Friday, 29 July 2016 4:31:26 PM
To: Joseph Glanville
Cc: Will Fong; maria-discuss@lists.launchpad.net
Subject: Re: [Maria-discuss] Semi-sync replication hangs when changing binlog 
filename.

This looks pretty weird. If you don't mind, more information would be useful to 
look at:

- the contents of mariadb-bin.000005 on the master, in particular what GTID 
  and binlog position the transaction waiting for the semi-sync ack has 
  (confirm that it's 0-1684280839-156 and ends at offset 329);
- the result of "show status like 'rpl_semi_sync_%'" on both master and slave;
- the contents of relay-bin.000005 and the binlog on the slave, in particular 
  whether it really executed the transaction that is currently hanging on the 
  master.

Out of curiosity: it looks like the slave also acts as a master to someone 
else. Can you also verify that the transaction hanging now on the master made 
it to that second-level slave?
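
For reference, a MariaDB GTID such as 0-1684280839-156 is made up of a domain 
ID, a server ID and a sequence number. A minimal sketch of checking whether a 
slave position has reached a given transaction follows; the helper names 
`parse_gtid` and `slave_has_reached` are hypothetical, the slave position used 
in the example is made up, and the check assumes sequence numbers only increase 
within a replication domain.

```python
from typing import NamedTuple


class Gtid(NamedTuple):
    domain_id: int
    server_id: int
    seq_no: int


def parse_gtid(text: str) -> Gtid:
    """Parse a single MariaDB GTID of the form domain-server-sequence."""
    parts = text.strip().split("-")
    if len(parts) != 3:
        raise ValueError(f"not a single MariaDB GTID: {text!r}")
    return Gtid(*(int(p) for p in parts))


def slave_has_reached(slave_pos: str, target: str) -> bool:
    """True if the slave position is at or past the target transaction.

    Assumption: both GTIDs belong to the same replication domain and
    sequence numbers are monotonically increasing within that domain.
    """
    slave, wanted = parse_gtid(slave_pos), parse_gtid(target)
    if slave.domain_id != wanted.domain_id:
        raise ValueError("GTIDs belong to different replication domains")
    return slave.seq_no >= wanted.seq_no


if __name__ == "__main__":
    hanging = "0-1684280839-156"  # GTID quoted in the message above
    # The slave position below is invented purely for illustration.
    print(slave_has_reached("0-1684280839-155", hanging))
```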

But to be honest, I don't quite understand how what you show us could happen, 
so I'm just asking to look at the info that I would look at if I were 
investigating such a problem.

On Thu, Jul 28, 2016 at 10:52 PM, Joseph Glanville <j...@jpg.id.au> wrote:

Hi Pavel.

Yes, by "binlog filename changes" I mean the master begins writing to a new 
binlog file.

The output of all the requested commands is in this gist: 
https://gist.github.com/josephglanville/7b96c34bb6e79ace33e56627672b98a5

Joseph Glanville


On Fri, 29 Jul 2016 at 3:08 PM Pavel Ivanov <piva...@google.com> wrote:

By "binlog filename changes" you mean when the master starts writing binlogs 
into a new file? Can you clarify how the replication stalls? What does "show 
processlist" show at that time on the master and on the slave? What does "show 
slave status" show on the slave?

On Thu, Jul 28, 2016 at 10:03 PM, Will Fong wrote:
> Hi Joseph,
>
> On Fri, Jul 29, 2016 at 10:11 AM, Joseph Glanville wrote:
>> However whenever the binlog filename changes the replication stalls
>> indefinitely.
>
> Interesting! I may have reproduced this, but it was only a quick test.
> Let me (or someone else) dig into this more.
>
> Thanks for reporting this.
> -will
>
>
> --
> Will Fong, Senior Support Engineer
> MariaDB Corporation
>
> _______________________________________________
> Mailing list: https://launchpad.net/~maria-discuss
> Post to     : maria-discuss@lists.launchpad.net
> Unsubscribe : https://launchpad.net/~maria-discuss
> More help   : https://help.launchpad.net/ListHelp


_______________________________________________
Mailing list: https://launchpad.net/~maria-discuss
Post to     : maria-discuss@lists.launchpad.net
Unsubscribe : https://launchpad.net/~maria-discuss
More help   : https://help.launchpad.net/ListHelp
