On Wed, Jul 27, 2016 at 1:04 PM, Andy Zhou <az...@ovn.org> wrote:
>
> On Tue, Jul 26, 2016 at 6:20 PM, Russell Bryant <russ...@ovn.org> wrote:
>
>> On Tue, Jul 26, 2016 at 3:48 PM, Andy Zhou <az...@ovn.org> wrote:
>>
>>> On Tue, Jul 26, 2016 at 11:59 AM, Russell Bryant <russ...@ovn.org>
>>> wrote:
>>>
>>>> On Tue, Jul 26, 2016 at 2:41 PM, Andy Zhou <az...@ovn.org> wrote:
>>>>
>>>>> On Tue, Jul 26, 2016 at 5:37 AM, Russell Bryant <russ...@ovn.org>
>>>>> wrote:
>>>>>
>>>>>> On Mon, Jul 25, 2016 at 8:15 PM, Andy Zhou <az...@ovn.org> wrote:
>>>>>>
>>>>>>> Hi, Rayn and Russell,
>>>>>>>
>>>>>> Can we move this discussion to the ovs dev mailing list?  Feel free
>>>>>> to just add it in a reply if you'd like.
>>>>>>
>>>>> Done.
>>>>>
>>>>>>> I am wondering how we can actually use the active/backup feature
>>>>>>> that is now part of OVSDB to increase OVN availability.
>>>>>>>
>>>>>> To be clear, I haven't actually tried this yet.  I'm only speaking
>>>>>> about how I think it should work.
>>>>>>
>>>>>>> Specifically:
>>>>>>>
>>>>>>> 1. When the active OVSDB server fails, should the backup server
>>>>>>> take over and allow write transactions?  One simpler possibility is
>>>>>>> to allow read-only access to the backup server.
>>>>>>>
>>>>>> The backup server needs to take over.  It's OK if that requires
>>>>>> intervention by an HA manager like Pacemaker.  If we can't make the
>>>>>> passive server take over, I'd say the solution is incomplete.
>>>>>>
>>>>> O.K., makes sense.
>>>>>
>>>>> One possible issue with the backup server taking over is "split
>>>>> brain".  If, due to a network error, the backup server becomes
>>>>> disconnected from the active server, then we may have both servers
>>>>> thinking they are the active server.  Does Pacemaker help with
>>>>> solving this issue?
>>>>>
>>>> It can, yes.  I would expect Pacemaker to explicitly configure a node
>>>> to be either the active or passive node.
>>>>
>>> Manual switching is more straightforward.  I agree.
>>>
>>>>>>> 2. When a crashed active OVSDB server recovers, should it become the
>>>>>>> new backup, or should it switch back?
>>>>>>>
>>>>>> Becoming the new backup is fine.  Again, this can be orchestrated by
>>>>>> an HA manager (Pacemaker).
>>>>>>
>>>>> I am not familiar with Pacemaker.  Can I assume it can provide a
>>>>> correct --sync-from argument (pointing to the backup server) when
>>>>> relaunching the OVSDB server?
>>>>>
>>>> Yes.  I'd have to consult with some Pacemaker experts on exactly what
>>>> the implementation would look like, but roughly:
>>>>
>>>> Pacemaker manages services using "OCF Resource Agents", which are just
>>>> scripts with a defined set of inputs and outputs for service management.
>>>> I would imagine a Pacemaker cluster being told it must have exactly 1
>>>> active and 1 passive OVSDB service.  When the passive OVSDB service is
>>>> started, it would include the "sync-from" argument based on where the
>>>> active OVSDB service is currently running.
>>>>
>>>> We really need to prototype this and document it.  I'm guessing too
>>>> much.  Pacemaker is frequently used to manage active/passive HA, though.
>>>>
>>> Sounds reasonable.  I will work on ovsdb internal changes to support
>>> manual switching, using appctl commands, and then look into prototyping
>>> with HA systems.  I have not used Pacemaker in the past, so it may take
>>> some time to ramp up.
>>>
>> I should be able to help.  We need to do this work anyway for integration
>> into OpenStack deployment tools.  Let me see if I can get some helpful
>> examples to follow.
>>
> Thanks for helping out.
>
> Given that, I now plan to work from the bottom up, initially focusing on
> ovsdb-server changes.
>
> 1. Add a state in ovsdb-server for it to know whether it is an active
> server.  The backup server will not accept any connections.  A server
> started with the --sync-from argument will be put in the backup state by
> default.
>
> 2. Add appctl commands to allow manually switching state.
>
> 3. Add a new table for the backup server to register its address and
> ports.  OVSDB clients can learn about them at run time.  The backup server
> should issue a transaction to register its address before issuing the
> monitoring request.  This feature is not strictly necessary and can be
> pushed to the HA manager, but having it built into ovsdb-server may make
> integration simpler.
>
> What do you think?

Russell, would the HA manager also manage ovn-controller switchover?
_______________________________________________
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev
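
Steps 1 and 2 of the plan quoted above (a per-server active/backup state,
plus manual switching) can be sketched as a small state model.  This is only
an illustration under stated assumptions, not ovsdb-server code: the class
ServerModel, its methods promote() and demote(), and the addresses are all
hypothetical names made up for this sketch, and the real appctl command
names are not decided in this thread.

```python
from enum import Enum
from typing import Optional


class Role(Enum):
    ACTIVE = "active"
    BACKUP = "backup"


class ServerModel:
    """Toy model of the proposed server role state (hypothetical, not OVS code)."""

    def __init__(self, sync_from: Optional[str] = None) -> None:
        # Step 1: a server started with a sync-from source begins as a backup;
        # otherwise it begins as the active server.
        self.sync_from = sync_from
        self.role = Role.BACKUP if sync_from else Role.ACTIVE

    def accepts_connections(self) -> bool:
        # Step 1: a backup server does not accept client connections.
        return self.role is Role.ACTIVE

    def promote(self) -> None:
        # Step 2: manual switch to active, e.g. requested by an HA manager.
        self.role = Role.ACTIVE
        self.sync_from = None

    def demote(self, sync_from: str) -> None:
        # Step 2: manual switch to backup, replicating from the given server.
        self.role = Role.BACKUP
        self.sync_from = sync_from


if __name__ == "__main__":
    # Placeholder addresses from the documentation range, not real endpoints.
    active = ServerModel()
    backup = ServerModel(sync_from="tcp:192.0.2.1:6641")

    # Manual failover: promote the backup, then bring the old active back
    # as the new backup pointing at the promoted server.
    backup.promote()
    active.demote(sync_from="tcp:192.0.2.2:6641")
    print(active.role.value, backup.role.value)
```

In this model the HA manager is the only party that calls promote() and
demote(), which matches the idea in the thread that a node is explicitly
configured as either active or passive rather than deciding on its own.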