Hi Robin,

The idea sounds good to me too. I am working on refactoring the ServiceGroup code [1]. Tooz has a nice compatibility matrix [2] which you might find useful.

-Vilobh

[1] ServiceGroup code refactoring: https://review.openstack.org/#/c/172502/
[2] Tooz compatibility matrix: http://docs.openstack.org/developer/tooz/compatibility.html
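For anyone who has not used tooz yet, a minimal sketch of its coordination API: the backend is selected purely by URL, so swapping between the drivers in the compatibility matrix (zookeeper, memcached, redis, ...) is a one-line change. The `zake://` in-memory backend below is only for illustration.

    # Minimal tooz sketch: backend chosen by URL, drivers interchangeable.
    from tooz import coordination

    # 'zake://' is an in-memory test backend; a real deployment might use
    # 'kazoo://127.0.0.1:2181' or 'memcached://127.0.0.1:11211' instead.
    coordinator = coordination.get_coordinator('zake://', b'agent-1')
    coordinator.start()

    # Group calls return async results; .get() blocks until completion.
    try:
        coordinator.create_group(b'neutron-agents').get()
    except coordination.GroupAlreadyExist:
        pass
    coordinator.join_group(b'neutron-agents').get()

    members = coordinator.get_members(b'neutron-agents').get()
    print(members)

    coordinator.leave_group(b'neutron-agents').get()
    coordinator.stop()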
On Tue, Apr 14, 2015 at 6:07 AM, Wangbibo <wangb...@huawei.com> wrote:

> Hi Kevin and Joshua,
>
> Thanks for the review. Glad to see that oslo puts distributed
> coordination into its scope now. Per the out-of-date info in [1] (oslo
> would not do it; each project should do it separately), manipulating a
> specific backend (zk/memcached) is included in spec [2], as nova
> ServiceGroup did. Now that we have tooz, that part should be moved out
> of AgentGroup and taken over by tooz. The Neutron AgentGroup spec needs
> an update, following what the nova ServiceGroup refactor is doing [3].
>
> Per spec [3], tooz does not intend to eliminate or completely replace
> ServiceGroup. They are integrated and work together to provide the nova
> ServiceGroup functionality. That may answer the question from Kevin and
> Kyle about the relationship between AgentGroup and tooz. Let's jump
> into [3][4]:
>
> 1) ServiceGroup still exists;
>
> 2) A tooz driver is added to ServiceGroup, taking over the
> zk/redis/... backends;
>
> 3) The db-based ServiceGroup driver is retained. The db driver was
> introduced for backward compatibility (with the db-based liveness
> monitoring that existed long before ServiceGroup was added). Since this
> driver uses tables and a data model intrinsically tied to the internals
> of nova, tooz cannot take it over;
>
> 4) The zk/memcached ServiceGroup drivers are temporarily retained, but
> will be deprecated in the future;
>
> 5) Eventually there will be two ServiceGroup drivers: the db driver and
> the tooz driver.
>
> Things are actually the same for neutron, except that we don't need to
> consider zk/memcached driver deprecation. I would like to refine the
> current spec and propose an "Agent Group using tooz" spec following the
> outline above. What do you think, Kevin and Joshua? Thanks.
>
> Best,
>
> Robin
>
> [1] https://wiki.openstack.org/wiki/NovaZooKeeperHeartbeat
>
> [2] https://review.openstack.org/#/c/168921/
>
> [3] https://review.openstack.org/#/c/138607/11/specs/liberty/approved/service-group-using-tooz.rst
>
> [4] ServiceGroup refactor code: https://review.openstack.org/#/c/172502/
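To make point 2) of the outline concrete, here is a rough sketch of what a tooz-backed ServiceGroup driver could look like. The class and method names are illustrative, not the actual code under review in [4]:

    from tooz import coordination

    class ToozServiceGroupDriver(object):
        """Illustrative sketch: service liveness via tooz group membership.

        join() registers a service, is_up() checks membership, and a
        periodic heartbeat() keeps the membership lease alive.
        """

        def __init__(self, backend_url, member_id, group=b'services'):
            self._group = group
            self._coord = coordination.get_coordinator(backend_url, member_id)
            self._coord.start()
            try:
                self._coord.create_group(group).get()
            except coordination.GroupAlreadyExist:
                pass

        def join(self):
            self._coord.join_group(self._group).get()

        def is_up(self, member_id):
            return member_id in self._coord.get_members(self._group).get()

        def heartbeat(self):
            # Must be called periodically, more often than the backend's
            # liveness timeout, or the member is considered dead.
            self._coord.heartbeat()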
> *From:* Wangbibo [mailto:wangb...@huawei.com]
> *Sent:* April 13, 2015 16:52
> *To:* OpenStack Development Mailing List (not for usage questions)
> *Subject:* [openstack-dev] Re: [neutron] Neutron scaling datapoints?
>
> Hi Kevin,
>
> Totally agree with you that the heartbeat from each agent is something
> we cannot eliminate currently. Agent status depends on it, and the
> scheduler and HA in turn depend on agent status.
>
> I proposed a Liberty spec for introducing an open framework of
> pluggable agent status drivers [1][2]. It allows us to use some other
> third-party backend to monitor agent status, such as zookeeper or
> memcached. Meanwhile, it guarantees backward compatibility, so that
> users can still use the db-based status monitoring mechanism as their
> default choice.
>
> Based on that, we may do further optimization on the issues Attila and
> you mentioned. Thanks.
>
> [1] BP - https://blueprints.launchpad.net/neutron/+spec/agent-group-and-status-drivers
>
> [2] Liberty Spec proposed - https://review.openstack.org/#/c/168921/
>
> Best,
>
> Robin
>
> *From:* Kevin Benton [mailto:blak...@gmail.com]
> *Sent:* April 11, 2015 12:35
> *To:* OpenStack Development Mailing List (not for usage questions)
> *Subject:* Re: [openstack-dev] [neutron] Neutron scaling datapoints?
>
> Which periodic updates did you have in mind to eliminate? One of the few
> remaining ones I can think of is sync_routers, but it would be great if
> you could enumerate the ones you observed, because eliminating overhead
> in agents is something I've been working on as well.
>
> One of the most common is the heartbeat from each agent. However, I
> don't think we can eliminate those, because they are used to determine
> whether the agents are still alive for scheduling purposes. Did you have
> something else in mind to determine if an agent is alive?
>
> On Fri, Apr 10, 2015 at 2:18 AM, Attila Fazekas <afaze...@redhat.com> wrote:
>
> I'm 99.9% sure that for scaling above 100k managed nodes, we do not
> really need to split OpenStack into multiple smaller OpenStacks, or use
> a significant number of extra controller machines.
>
> The problem is that OpenStack is using the right tools (SQL/AMQP/zk),
> but in the wrong way.
>
> For example, periodic updates can be avoided in almost all cases. The
> new data can be pushed to the agent just when it is needed. The agent
> can know when the AMQP connection becomes unreliable (queue or
> connection loss), and then it needs to do a full sync.
> https://bugs.launchpad.net/neutron/+bug/1438159
>
> Also, when the agents get a notification, they start asking for details
> via AMQP -> SQL. Why do they not know it already, or get it with the
> notification?
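Attila's push-instead-of-poll point can be illustrated with a small sketch. The transport wrapper and server-side facade here are hypothetical; the point is the resync-on-reconnect pattern from bug 1438159, not a concrete API:

    # Hypothetical sketch of the pattern Attila describes: apply deltas
    # pushed over AMQP, and fall back to one full sync only when the
    # connection (and therefore the queue) may have dropped messages.

    class AgentStateSync(object):
        def __init__(self, server_api):
            self.server_api = server_api   # hypothetical server RPC facade
            self.needs_full_sync = True    # first run always syncs everything

        def on_notification(self, delta):
            # The notification carries the full details, so no extra
            # AMQP -> SQL round trip is needed to look them up afterwards.
            self.apply(delta)

        def on_connection_lost(self):
            # The queue may have been dropped; incremental state is suspect.
            self.needs_full_sync = True

        def on_connection_restored(self):
            if self.needs_full_sync:
                self.apply(self.server_api.get_full_state())
                self.needs_full_sync = False

        def apply(self, state):
            pass  # program flows/ports/routers from the given state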
> ----- Original Message -----
> > From: "Neil Jerram" <neil.jer...@metaswitch.com>
> > To: "OpenStack Development Mailing List (not for usage questions)" <openstack-dev@lists.openstack.org>
> > Sent: Thursday, April 9, 2015 5:01:45 PM
> > Subject: Re: [openstack-dev] [neutron] Neutron scaling datapoints?
> >
> > Hi Joe,
> >
> > Many thanks for your reply!
> >
> > On 09/04/15 03:34, joehuang wrote:
> > > Hi, Neil,
> > >
> > > In theory, Neutron is like a "broadcast" domain: for example,
> > > enforcement of DVR and security groups has to touch each host where
> > > a VM of the project resides. Even using an SDN controller, the
> > > "touch" of each host is inevitable. If there are plenty of physical
> > > hosts, for example 10k, inside one Neutron, it is very hard to
> > > overcome the "broadcast storm" issue under concurrent operation;
> > > that is the bottleneck for the scalability of Neutron.
> >
> > I think I understand that in general terms - but can you be more
> > specific about the broadcast storm? Is there one particular message
> > exchange that involves broadcasting? Is it only from the server to
> > agents, or are there 'broadcasts' in other directions as well?
> >
> > (I presume you are talking about control plane messages here, i.e.
> > between Neutron components. Is that right? Obviously there can also be
> > broadcast storm problems in the data plane - but I don't think that's
> > what you are talking about here.)
> >
> > > We need a layered architecture in Neutron to solve the "broadcast
> > > domain" bottleneck of scalability. The test report from OpenStack
> > > cascading shows that through the layered architecture "Neutron
> > > cascading", Neutron can support up to million-level ports and
> > > 100k-level physical hosts. You can find the report here:
> > > http://www.slideshare.net/JoeHuang7/test-report-for-open-stack-cascading-solution-to-support-1-million-v-ms-in-100-data-centers
> >
> > Many thanks, I will take a look at this.
> >
> > > "Neutron cascading" also brings an extra benefit: one cascading
> > > Neutron can have many cascaded Neutrons, and different cascaded
> > > Neutrons can leverage different SDN controllers - maybe one is ODL,
> > > the other OpenContrail.
> > >
> > >     ---------------Cascading Neutron---------------
> > >           /                            \
> > >     --cascaded Neutron--     --cascaded Neutron--
> > >            |                            |
> > >     -------ODL--------       ----OpenContrail----
> > >
> > > And furthermore, if using Neutron cascading in multiple data
> > > centers, the DCI controller (data center interconnection
> > > controller) can also be used under the cascading Neutron, to
> > > provide NaaS (network as a service) across data centers.
> > >
> > >     --------------------Cascading Neutron--------------------
> > >           /                     |                   \
> > >     --cascaded Neutron--  -DCI controller-  --cascaded Neutron--
> > >            |                    |                    |
> > >     -------ODL--------          |          ----OpenContrail----
> > >                                 |
> > >     --(Data center 1)-- --(DCI networking)-- --(Data center 2)--
> > >
> > > Is it possible for us to discuss this at the OpenStack Vancouver
> > > summit?
> >
> > Most certainly, yes. I will be there from mid Monday afternoon through
> > to the end of Friday. But it will be my first summit, so I have no
> > idea yet as to how I might run into you - please can you suggest!
> >
> > > Best Regards
> > > Chaoyi Huang ( Joe Huang )
> >
> > Regards,
> > Neil
>
> --
> Kevin Benton
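To make the cascading idea concrete: the cascading layer essentially proxies tenant API calls down to per-site cascaded Neutron instances, each of which exposes a normal Neutron API. A minimal sketch using python-neutronclient - the endpoints, credentials, and site-selection and UUID-mapping logic are entirely hypothetical:

    from neutronclient.v2_0 import client

    # Hypothetical endpoints for two cascaded Neutron instances.
    CASCADED_SITES = {
        'dc1': 'http://dc1.example.com:9696',
        'dc2': 'http://dc2.example.com:9696',
    }

    def cascaded_client(site, token):
        # Each cascaded Neutron is a standard Neutron API endpoint, so
        # the cascading layer can talk to it with the standard client.
        return client.Client(endpoint_url=CASCADED_SITES[site], token=token)

    def create_port_cascaded(site, token, network_id):
        # A real cascading Neutron would map its own network UUIDs to the
        # cascaded site's UUIDs; that mapping is elided here.
        neutron = cascaded_client(site, token)
        return neutron.create_port({'port': {'network_id': network_id}})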
__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev