On 11-11-12 03:56 PM, Michael Marrotte wrote:
An N+1 or N+X topology might be good for that cascading scenario... Find a sweet spot for the evict date.
If slaves are lagging too much, scale and tune.

Hi,

That's plainly not always possible, since MySQL replication is single-threaded. That's _the_ major issue with MySQL replication, and although it is a pretty hot topic, nobody has really succeeded in addressing it.


I haven't read Yves' patch, but I'll check it out. I just saw that he was looking for slaves to work with a VIP and suggested a couple of ways I've seen it work.

The way you suggested will work well for a relatively low-load database system, but since servers are stopped when they lag too much, larger and busier installations cannot accept that. Sometimes _one_ bad query hitting a slave can cause it to lag because of locking and/or intensive disk I/O. To counter that, people use many slaves.


On Sat, Nov 12, 2011 at 2:51 PM, Florian Haas <flor...@hastexo.com> wrote:

    Hi Yves and Michael,

    On 2011-11-12 19:22, Yves Trudeau wrote:
    > lol... How many large databases have you managed?  Once evicted, MySQL
    > will be restarted by Pacemaker so all the caches will be cold.

    If I may say so, before you start laughing at people on the list, it
    may be a good idea to actually get your facts straight and check what
    evict_outdated_slaves does. For a too-far-behind slave it bails out of
    monitor with $OCF_ERR_INSTALLED, which Pacemaker considers a hard
    error. Thus, that instance will _not_ be restarted by Pacemaker on
    this node unless an administrator intervenes.

    Still, Michael, Yves has a point that evict_outdated_slaves is not
    optimal (and I'm saying this as the guy that wrote that part of the
    agent). It's fine for a temporary problem that affects a single slave,
    but please consider this scenario:

    - High load on the database, across several instances.
    - Slaves start lagging behind.
    - We shut down a slave that is too far behind.
    - We now have _fewer_ instances to handle the same load.
    - Slaves fall further behind.
    - We shut down more slaves.

    This can turn into a cascading failure. Note, specifically, that the
    lagging slave has no real option to catch up even when the database
    isn't being hammered anymore, unless an admin has intervened and
    recovered/restarted the instance manually. And, of course, Yves' point
    about cold caches is entirely valid.
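
The cascade described above can be illustrated with a toy model. This is only a sketch under loud assumptions: read load is spread evenly across slaves, a slave "lags" whenever its share exceeds a fixed capacity, and one lagging slave is evicted per round. The function name, parameters and numbers are hypothetical and not taken from the resource agent.

```python
def simulate_cascade(total_load, n_slaves, capacity_per_slave):
    """Return the slave count after each eviction round (toy model).

    Assumptions (hypothetical, for illustration only):
    - load is spread evenly over the remaining slaves;
    - a slave lags when its share exceeds capacity_per_slave;
    - each round evicts exactly one lagging slave.
    """
    history = [n_slaves]
    while n_slaves > 0 and total_load / n_slaves > capacity_per_slave:
        n_slaves -= 1  # evict one slave; survivors absorb its share
        history.append(n_slaves)
    return history


if __name__ == "__main__":
    # Slightly under-provisioned: every eviction makes things worse.
    print(simulate_cascade(total_load=100, n_slaves=4, capacity_per_slave=20))
    # Exactly enough capacity: nothing is evicted.
    print(simulate_cascade(total_load=100, n_slaves=5, capacity_per_slave=20))
```

In this model a pool that is only slightly short of capacity ends up with no slaves at all, because each eviction raises the load on the survivors.
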

    In Yves' approach, we wouldn't shut down MySQL, but merely shift away
    the slave's virtual IP. So while clients can't connect to the slave
    via its virtual IP anymore, the slave can still fetch updates from the
    master -- and thus, actually has a chance to catch up. Once it's
    sufficiently caught up, it gets the VIP back and clients can talk to
    that slave again. And since we never stopped MySQL, we also don't have
    the cold cache problem.
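
The "move the VIP away, give it back once caught up" idea can be sketched as a small decision function. This is not Yves' patch (I am not quoting it here), just a minimal illustration. The function name and both thresholds are hypothetical, and using a lower threshold for getting the VIP back than for losing it is my own assumption, added to keep the VIP from flapping. The lag value is assumed to come from something like Seconds_Behind_Master in SHOW SLAVE STATUS, with None standing in for NULL (replication not running).

```python
def vip_should_be_active(currently_active, lag_seconds,
                         max_lag=60, rejoin_lag=10):
    """Decide whether a slave should hold its virtual IP.

    currently_active: whether the slave holds the VIP right now.
    lag_seconds: replication lag in seconds, or None if unknown.
    max_lag / rejoin_lag: hypothetical thresholds (rejoin_lag < max_lag).
    MySQL itself is never stopped; only the VIP placement changes.
    """
    if lag_seconds is None:
        # Replication is not running, so do not serve reads from here.
        return False
    if currently_active:
        # Keep the VIP until the slave falls too far behind.
        return lag_seconds <= max_lag
    # VIP was taken away: only return it once the slave has caught up.
    return lag_seconds <= rejoin_lag


if __name__ == "__main__":
    active = True
    for lag in [5, 40, 90, 45, 20, 8]:
        active = vip_should_be_active(active, lag)
        print(lag, active)
```

Because the slave keeps replicating while its VIP is away, the lag can actually shrink, which is what lets the second branch eventually return True.
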

    Yves' patches are not perfect (and they're not expected to be, that's
    what a review is for), but I think his approach is sound and shouldn't
    be shot down simply because evict_outdated_slaves is already there.

    Cheers,
    Florian

    --
    Need help with High Availability?
    http://www.hastexo.com/now

    _______________________________________________
    Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
    http://oss.clusterlabs.org/mailman/listinfo/pacemaker

    Project Home: http://www.clusterlabs.org
    Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
    Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker


