Hi,

Does the problem appear if you set the timeout value to 9223372036854775807?

On Fri, Jul 29, 2016 at 3:24 AM, Joseph Glanville <j...@jpg.id.au> wrote:

> Hi Pavel.
>
> To describe the setup a little better the master replicates to a semi-sync
> slave, which then replicates to an async slave. This is to ensure at any
> point in time both the master and the semi-sync slave have a complete copy
> of the data. If the master fails the semi-sync is automatically promoted to
> master and the async switches to replicating with semi-sync replication. If
> the semi-sync fails then the async remasters itself to the master and
> switches to semi-sync.
>
> However I don't think the 3rd node has any bearing on the hang, I built a
> test cluster without it and the hang is still easy to reproduce. I just
> restore a decent sized dump, in this case a portion of the Wikipedia
> database and the cluster reliably hangs when the master begins writing to
> the new binlog.
>
> The dump is here if someone wants to use it to reproduce:
> https://dumps.wikimedia.org/enwiki/latest/enwiki-latest-category.sql.gz
>
> I have created a gist with the output of `SHOW STATUS LIKE
> 'Rpl_semi_sync%s'` on both master and slave of the simplified 2 node setup.
> I have also included the binlogs of both the master and the slave and the
> relay log on the slave.
>
> https://gist.github.com/josephglanville/70789bc9c3744090a17070652cded68b
>
> Let me know if there is any other useful information I can provide.
>
> Joseph.
> ------------------------------
> *From:* Pavel Ivanov <piva...@google.com>
> *Sent:* Friday, 29 July 2016 4:31:26 PM
> *To:* Joseph Glanville
> *Cc:* Will Fong; maria-discuss@lists.launchpad.net
> *Subject:* Re: [Maria-discuss] Semi-sync replication hangs when changing
> binlog filename.
>
> This looks pretty weird. If you don't mind more information would be
> useful to look at: contents of mariadb-bin.000005 on the master, in
> particular what GTID and binlog position the transaction waiting for
> semi-sync ack has (confirm that it's 0-1684280839-156 and ends at offset
> 329); result of "show status like 'rpl_semi_sync_%'" on both master and
> slave; contents of relay-bin.000005 and binlog on the slave, in particular
> did it really execute the transaction that is currently hanging on the
> master? Out of curiosity: it looks like the slave also acts as a master to
> someone else. Can you also verify that the transaction hanging now on the
> master made it to that second-level slave?
>
> But to be honest, I don't quite understand how what you show us could
> happen, so I'm just asking to look at the info that I would look at if I
> were investigating such problem.
>
> On Thu, Jul 28, 2016 at 10:52 PM, Joseph Glanville <j...@jpg.id.au> wrote:
>
>> Hi Pavel.
>>
>> Yes, by “binlog filename changes” I mean the master begins writing to a
>> new binlog file.
>>
>> Output of all the requested commands are in this gist:
>> https://gist.github.com/josephglanville/7b96c34bb6e79ace33e56627672b98a5
>>
>> Joseph Glanville
>>
>> On Fri, 29 Jul 2016 at 3:08 PM Pavel Ivanov <piva...@google.com> wrote:
>>
>>> By "binlog filename changes" you mean when master starts writing binlogs
>>> into a new file? Can you clarify how the replication stalls? What "show
>>> processlist" shows at that time on master and on slave? What does "show
>>> slave status" show on the slave?
>>>
>>> On Thu, Jul 28, 2016 at 10:03 PM, Will Fong wrote:
>>> > Hi Joseph,
>>> >
>>> > On Fri, Jul 29, 2016 at 10:11 AM, Joseph Glanville wrote:
>>> >> However whenever the binlog filename changes the replication stalls
>>> >> indefinitely.
>>> >
>>> > Interesting! I may have reproduced this, but it was only a quick test.
>>> > Let me (or someone else) dig into this more.
>>> >
>>> > Thanks for reporting this.
>>> > -will
>>> >
>>> > --
>>> > Will Fong, Senior Support Engineer
>>> > MariaDB Corporation
>>> >
>>> > _______________________________________________
>>> > Mailing list: https://launchpad.net/~maria-discuss
>>> > Post to : maria-discuss@lists.launchpad.net
>>> > Unsubscribe : https://launchpad.net/~maria-discuss
>>> > More help : https://help.launchpad.net/ListHelp