Sorry to have joined late, but Miroslav and Justin have completely covered what I wanted to say: I totally agree with everything, +100.

I would add one thing: take a look at https://softwaremill.com/mqperf/ and note how Artemis is used there: the MAPPED journal (with datasync off, now on 2.6.x), but with replication to reduce the window of failure/losing data :)

FYI, the MAPPED journal with datasync off protects you only against application failures, and considering that you're in a cloud environment (+ replication if needed) it could be enough.
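For reference, here is a minimal sketch of what that combination looks like inside the <core> element of broker.xml (element names as documented for Artemis 2.6.x; everything unrelated is omitted, so treat it as a starting point, not a complete configuration):

    <!-- memory-mapped journal: the fastest option -->
    <journal-type>MAPPED</journal-type>

    <!-- skip fsync: survives application/broker crashes,
         NOT OS or hardware failures -->
    <journal-datasync>false</journal-datasync>

    <!-- pair the fast journal with replication so a second copy
         of every message lives on another broker -->
    <ha-policy>
       <replication>
          <master>
             <check-for-live-server>true</check-for-live-server>
          </master>
       </replication>
    </ha-policy>

The backup broker would declare <replication><slave/></replication> instead, and both brokers need a cluster connection configured so the replication pair can find each other.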
On Fri, Oct 5, 2018 at 08:35 Miroslav Novak <[email protected]> wrote:

> Hi Tim,
>
> > First, spinning up a replacement host is not instantaneous, so there
> > will be a period of at least a minute but possibly several where the
> > messages on that broker and storage volume will simply be unavailable
> > to consumers.
>
> In case you decide to use HA with shared store, it will take some time
> for the slave to start as well. It needs to load the journal directory,
> which, if it holds a few GBs, might take some time; that is the longest
> part of starting the broker. It's best practice to have a good health
> check on the master so you can restart a new master ASAP. I don't
> recommend using HA with a replicated journal in a cloud environment
> because the network in a cloud usually has long latencies, is
> unreliable, and is hard to configure. It's something that is very hard
> to make robust and fast.
>
> > Second, it means that there is only one copy of a given message within
> > the broker cluster, so if that storage volume gets corrupted or fails,
> > you've lost data, which would be unacceptable in some use cases.
>
> Leave fault tolerance and redundancy to your cloud provider's storage.
> It can be backed by RAID for fault tolerance, and you can possibly
> replicate your storage to some backup location. As mentioned, in the
> case of HA with shared store, if the storage gets corrupted then the
> backup will not start anyway. Thus RAID, and possibly replication at
> the storage level, sounds like the better option.
>
> Thanks,
> Mirek
>
> ----- Original Message -----
> > From: "Tim Bain" <[email protected]>
> > To: "ActiveMQ Users" <[email protected]>
> > Sent: Thursday, 4 October, 2018 3:01:52 PM
> > Subject: Re: Designing for maximum Artemis performance
> >
> > Justin,
> >
> > That approach will work, to a point, but it has (at least) two failure
> > cases that would be problematic.
> >
> > First, spinning up a replacement host is not instantaneous, so there
> > will be a period of at least a minute but possibly several where the
> > messages on that broker and storage volume will simply be unavailable
> > to consumers.
> >
> > Second, it means that there is only one copy of a given message within
> > the broker cluster, so if that storage volume gets corrupted or fails,
> > you've lost data, which would be unacceptable in some use cases.
> >
> > There'd also be a failure case if the number of hosts was not an even
> > multiple of the number of AZs, where the new host comes up in a
> > different AZ than the storage volume and therefore can't use it. So
> > you'd need to be careful in designing the setup to avoid that
> > potential problem.
> >
> > Overall I think it's better to have a slave host addressing both the
> > availability and data durability concerns than to try to manage
> > reusing storage volumes, but it might depend on the exact requirements
> > which approach was best.
> >
> > Tim
> >
> > On Wed, Oct 3, 2018, 2:56 PM Justin Bertram <[email protected]> wrote:
> >
> > > > Would it be desirable for Artemis to support this functionality in
> > > > the future though, i.e. if we raised it as a feature request?
> > >
> > > All things being equal I'd say probably so, but I suspect the effort
> > > to implement the feature might outweigh the benefits.
> > >
> > > > The cloud can manage spinning up another node, but the problem is
> > > > telling/getting the Artemis cluster to make that server the master
> > > > now.
> > >
> > > The way I imagine it would work best is without any slave at all.
> > > The whole point of the slave is to take over quickly from a live
> > > broker that has failed, in such a way that all the data from the
> > > failed broker is still available to clients. Maybe I'm wrong about
> > > clouds, but I believe the cloud itself can provide this
> > > functionality by quickly spinning up a new broker when one fails.
> > > So, you would have 3 live brokers in a cluster, each with a separate
> > > storage node. There wouldn't be any slaves at all. When one of those
> > > brokers fails, the cloud will spin up another to replace it and
> > > re-attach it to the storage node, so that any reconnecting client
> > > has access to all the data as before, just like it would on a slave.
> > > Or is that not how clouds work?
> > >
> > >
> > > Justin
> > >
> > > On Tue, Oct 2, 2018 at 10:50 PM schalmers
> > > <[email protected]> wrote:
> > >
> > > > jbertram wrote
> > > > > The master/slave/slave triplet architecture complicates
> > > > > fail-back quite a bit and it's not something the broker handles
> > > > > gracefully at this point. I'd recommend against using it for
> > > > > that reason.
> > > >
> > > > Would it be desirable for Artemis to support this functionality in
> > > > the future though, i.e. if we raised it as a feature request?
> > > >
> > > > jbertram wrote
> > > > > To Clebert's point... I also don't understand why you wouldn't
> > > > > let the cloud infrastructure deal with spinning up another live
> > > > > node when one fails. I was under the impression that's kind of
> > > > > what clouds are for.
> > > >
> > > > The cloud can manage spinning up another node, but the problem is
> > > > telling/getting the Artemis cluster to make that server the master
> > > > now. From what I've read and been told, there's no way to fail
> > > > back to the master when there is already a backup for the (new)
> > > > master.
> > > >
> > > > That's what I'm looking for help on, and these were my original
> > > > questions.
> > > >
> > > > If the position from Artemis is that there's no desire for Artemis
> > > > to ever work that way, even if we ask/raise a feature request,
> > > > then we just need to understand that so we can make design
> > > > decisions in our application stack to cater for that.
> > > >
> > > > --
> > > > Sent from:
> > > > http://activemq.2283324.n4.nabble.com/ActiveMQ-User-f2341805.html
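
Since shared-store HA came up several times in the quoted thread, here is roughly what that pairing looks like in broker.xml as well (a minimal sketch; the /mnt/shared/... paths are placeholders for whatever volume both brokers can mount):

    <!-- on the master -->
    <ha-policy>
       <shared-store>
          <master>
             <failover-on-shutdown>true</failover-on-shutdown>
          </master>
       </shared-store>
    </ha-policy>

    <!-- on the slave -->
    <ha-policy>
       <shared-store>
          <slave>
             <allow-failback>true</allow-failback>
          </slave>
       </shared-store>
    </ha-policy>

    <!-- both brokers point at the same directories on the shared volume -->
    <journal-directory>/mnt/shared/journal</journal-directory>
    <bindings-directory>/mnt/shared/bindings</bindings-directory>
    <paging-directory>/mnt/shared/paging</paging-directory>
    <large-messages-directory>/mnt/shared/large-messages</large-messages-directory>

The slave waits on the master's file lock on the shared journal; when the lock is released it loads the journal and activates, which is exactly the startup delay Mirek describes for journals of a few GBs.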
