I would be open to making this toggle switch available; however, I feel that doing it via static configuration can introduce an unnecessary burden on the operator. Perhaps we could explore a way for the agent to figure out which state it is supposed to be in based on its reported status?
Armando

On 5 November 2014 12:09, Salvatore Orlando <sorla...@nicira.com> wrote:

> I have no opposition to that, and I will be happy to assist reviewing the
> code that will enable flow synchronisation (or to say it in an easier way,
> punctual removal of flows unknown to the l2 agent).
>
> In the meanwhile, I hope you won't mind if we go ahead and start making
> flow reset optional - so that we stop causing downtime upon agent restart.
>
> Salvatore
>
> On 5 November 2014 11:57, Erik Moe <erik....@ericsson.com> wrote:
>
>> Hi,
>>
>> I also agree; IMHO we need a flow synchronization method so we can avoid
>> network downtime and stray flows.
>>
>> Regards,
>> Erik
>>
>> *From:* Germy Lure [mailto:germy.l...@gmail.com]
>> *Sent:* 5 November 2014 10:46
>> *To:* OpenStack Development Mailing List (not for usage questions)
>> *Subject:* Re: [openstack-dev] [neutron][TripleO] Clear all flows when
>> ovs agent start? why and how avoid?
>>
>> Hi Salvatore,
>>
>> A startup flag is really a simpler approach. But in what situations
>> should we set this flag to remove all flows? Upgrade? Manual restart?
>> Internal fault?
>>
>> Indeed, we need to refresh flows only when the flows in OVS are
>> inconsistent (incorrect, unwanted, stale and so on) with those the agent
>> expects. But the problem is: how do we know this? I think a startup flag
>> is too coarse, unless we can tolerate the inconsistency.
>>
>> Of course, I believe that turning off the flow reset on startup can
>> resolve most problems. The flows are correct most of the time, after all.
>> But considering the five-nines availability required by NFV, I still
>> recommend the flow synchronization approach.
>>
>> BR,
>> Germy
>>
>> On Wed, Nov 5, 2014 at 3:36 PM, Salvatore Orlando <sorla...@nicira.com>
>> wrote:
>>
>> From what I gather from this thread and related bug report, the change
>> introduced in the OVS agent is causing a data plane outage upon agent
>> restart, which is not desirable in most cases.
>>
>> The rationale for the change that introduced this bug was, I believe,
>> cleaning up stale flows on the OVS agent, which also makes some sense.
>>
>> Unless I'm missing something, I reckon the best way forward is actually
>> quite straightforward; we might add a startup flag to reset all flows and
>> not reset them by default.
>>
>> While I agree the "flow synchronisation" process proposed in the previous
>> post is valuable too, I hope we might be able to fix this with a simpler
>> approach.
>>
>> Salvatore
>>
>> On 5 November 2014 04:43, Germy Lure <germy.l...@gmail.com> wrote:
>>
>> Hi,
>>
>> Considering what triggers an agent restart, I think there are only two
>> cases:
>> 1) only the agent is restarted
>> 2) the host that the agent is deployed on is rebooted
>>
>> When the agent starts, OVS may:
>> a. have all correct flows
>> b. have nothing at all
>> c. have partly correct flows; the others may need to be reprogrammed,
>> deleted or added
>>
>> In any case, I think both users and developers would be happy to see the
>> system recover ASAP after the agent restarts. Ideally, the agent pushes
>> only the incorrect flows and keeps the correct ones. This ensures that
>> services whose flows are correct keep working while the agent starts.
>>
>> So, I suggest two solutions:
>> 1. After restarting, the agent gets all flows from OVS and compares them
>> with its local flows, then corrects only the ones that differ.
>> 2. Adapt OVS and the agent. The agent just pushes all flows (without
>> removing any) every time, and OVS prepares two tables so it can switch
>> between them (like an RCU lock).
>>
>> Solution 1 is recommended because of third-party vendors.
>>
>> BR,
>> Germy
>>
>> On Fri, Oct 31, 2014 at 10:28 PM, Ben Nemec <openst...@nemebean.com>
>> wrote:
>>
>> On 10/29/2014 10:17 AM, Kyle Mestery wrote:
>> > On Wed, Oct 29, 2014 at 7:25 AM, Hly <henry4...@gmail.com> wrote:
>> >>
>> >> Sent from my iPad
>> >>
>> >> On 2014-10-29, at 8:01 PM, Robert van Leeuwen
>> >> <robert.vanleeu...@spilgames.com> wrote:
>> >>
>> >>>>> I find that our current design is to remove all flows and then add
>> >>>>> them entry by entry; this causes every network node to break all
>> >>>>> tunnels to the other network nodes and all compute nodes.
>> >>>> Perhaps a way around this would be to add a flag on agent startup
>> >>>> which would have it skip reprogramming flows. This could be used for
>> >>>> the upgrade case.
>> >>>
>> >>> I hit the same issue last week and filed a bug here:
>> >>> https://bugs.launchpad.net/neutron/+bug/1383674
>> >>>
>> >>> From an operator's perspective this is VERY annoying since you also
>> >>> cannot push any config changes that require/trigger a restart of the
>> >>> agent.
>> >>> e.g. something simple like changing a log setting becomes a hassle.
>> >>> I would prefer the default behaviour to be to not clear the flows, or
>> >>> at the least a config option to disable it.
>> >>>
>> >>
>> >> +1, we also suffered from this even when only a very small patch was
>> >> applied
>> >>
>> > I'd really like to get some input from the tripleo folks, because they
>> > were the ones who filed the original bug here and were hit by the
>> > agent NOT reprogramming flows on agent restart. It does seem fairly
>> > obvious that adding an option around this would be a good way forward,
>> > however.
>>
>> Since nobody else has commented, I'll put in my two cents (though I
>> might be overcharging you ;-). I've also added the TripleO tag to the
>> subject, although with Summit coming up I don't know if that will help.
>>
>> Anyway, if the bug you're referring to is the one I think, then our
>> issue was just with the flows not existing. I don't think we care
>> whether they get reprogrammed on agent restart or not as long as they
>> somehow come into existence at some point.
>>
>> It's possible I'm wrong about that, and probably the best person to talk
>> to would be Robert Collins since I think he's the one who actually
>> tracked down the problem in the first place.
>>
>> -Ben
>>
>> _______________________________________________
>> OpenStack-dev mailing list
>> OpenStack-dev@lists.openstack.org
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
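
Germy's first proposal earlier in the thread (after a restart, the agent reads the flows from OVS, compares them with the flows it wants, and corrects only the differences) could look roughly like the sketch below. It is only an illustration, not the Neutron agent's code: the `plan_flow_sync` function, its `reset_all` parameter, and the flow representation (a `(table, priority, match)` key mapped to an actions string) are all hypothetical, and talking to OVS itself is left out.

```python
from typing import Dict, List, Tuple

# Hypothetical flow representation: (table, priority, match) -> actions.
# Real OVS flows carry more fields (cookies, timeouts, ...).
FlowKey = Tuple[int, int, str]
Flows = Dict[FlowKey, str]
Op = Tuple[str, FlowKey, str]


def plan_flow_sync(installed: Flows, desired: Flows,
                   reset_all: bool = False) -> List[Op]:
    """Return the operations that turn `installed` into `desired`.

    reset_all=True models the behaviour discussed in the thread:
    delete every flow, then re-add them all (data plane outage).
    reset_all=False touches only the flows that differ.
    """
    ops: List[Op] = []
    if reset_all:
        ops.extend(("delete", k, installed[k]) for k in sorted(installed))
        ops.extend(("add", k, desired[k]) for k in sorted(desired))
        return ops

    # Flows present in OVS but unknown to the agent: remove them.
    for k in sorted(installed.keys() - desired.keys()):
        ops.append(("delete", k, installed[k]))
    # Flows the agent wants but OVS lacks: add them.
    for k in sorted(desired.keys() - installed.keys()):
        ops.append(("add", k, desired[k]))
    # Flows present on both sides but with different actions: fix them.
    for k in sorted(installed.keys() & desired.keys()):
        if installed[k] != desired[k]:
            ops.append(("modify", k, desired[k]))
    return ops
```

With `reset_all=False`, flows that are already correct produce no operations, so traffic using them is not interrupted. This covers the three startup cases listed in the thread: all flows correct (no operations), nothing installed (only adds), and partly correct flows (a mix of deletes, adds and modifies).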