On 22/10/14 03:07, Andrew Laski wrote:
>
> On 10/21/2014 04:31 AM, Nikola Đipanov wrote:
>> On 10/20/2014 08:00 PM, Andrew Laski wrote:
>>> One of the big goals for the Kilo cycle by users and developers of
>>> the cells functionality within Nova is to get it to a point where it
>>> can be considered a first class citizen of Nova. Ultimately I think
>>> this comes down to getting it tested by default in Nova jobs, and
>>> making it easy for developers to work with. But there's a lot of
>>> work to get there. In order to raise awareness of this effort, and
>>> get the conversation started on a few things, I've summarized a
>>> little bit about cells and this effort below.
>>>
>>>
>>> Goals:
>>>
>>> Testing of a single cell setup in the gate.
>>> Feature parity.
>>> Make cells the default implementation. Developers write code once
>>> and it works for cells.
>>>
>>> Ultimately the goal is to improve maintainability of a large feature
>>> within the Nova code base.
>>>
>> Thanks for the write-up Andrew! Some thoughts/questions below. Looking
>> forward to the discussion on some of these topics, and would be happy
>> to review the code once we get to that point.
>>
>>> Feature gaps:
>>>
>>> Host aggregates
>>> Security groups
>>> Server groups
>>>
>>>
>>> Shortcomings:
>>>
>>> Flavor syncing
>>> This needs to be addressed now.
>>>
>>> Cells scheduling/rescheduling
>>> Instances can not currently move between cells
>>> These two won't affect the default one cell setup so they will be
>>> addressed later.
>>>
>>>
>>> What does cells do:
>>>
>>> Schedule an instance to a cell based on flavor slots available.
>>> Proxy API requests to the proper cell.
>>> Keep a copy of instance data at the global level for quick retrieval.
>>> Sync data up from a child cell to keep the global level up to date.
>>>
>>>
>>> Simplifying assumptions:
>>>
>>> Cells will be treated as a two level tree structure.
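As a side note, the "schedule an instance to a cell based on flavor
slots available" step above can be sketched roughly as below. This is
purely illustrative -- Cell, capacity, slots_for and pick_cell are
made-up names, not actual nova.cells code:

```python
# Illustrative sketch only: pick the child cell with the most free
# slots for the requested flavor. None of these names exist in Nova.

class Cell(object):
    def __init__(self, name, capacity):
        self.name = name
        # e.g. {"m1.small": 12, "m1.large": 3} -- free slots per flavor
        self.capacity = capacity

    def slots_for(self, flavor):
        return self.capacity.get(flavor, 0)


def pick_cell(cells, flavor):
    """Return the cell with the most free slots for this flavor."""
    candidates = [c for c in cells if c.slots_for(flavor) > 0]
    if not candidates:
        raise RuntimeError("no cell can fit flavor %s" % flavor)
    return max(candidates, key=lambda c: c.slots_for(flavor))


cells = [Cell("cell1", {"m1.small": 4}), Cell("cell2", {"m1.small": 9})]
print(pick_cell(cells, "m1.small").name)  # cell2
```

The real scheduler obviously has more inputs than free slots, but the
shape of the decision is the same.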
>>>
>> Are we thinking of making this official by removing the code that
>> allows cells to be an actual tree of depth N? I am not sure if doing
>> so would be a win; it does complicate the RPC/Messaging/State code a
>> bit, but if it's not being used, even though it's a nice
>> generalization, why keep it around?
>
> My preference would be to remove that code since I don't envision
> anyone writing tests to ensure that functionality works and/or doesn't
> regress. But there's the challenge of not knowing if anyone is
> actually relying on that behavior. So initially I'm not creating a
> specific work item to remove it. But I think it needs to be made clear
> that it's not officially supported and may get removed unless a case
> is made for keeping it and work is put into testing it.
While I agree that N is a bit interesting, I have seen N=3 in
production:

[central API] --> [state/region1] --> [state/region DC1]
                                  \-> [state/region DC2]
              --> [state/region2 DC]
              --> [state/region3 DC]
              --> [state/region4 DC]

>>
>>> Plan:
>>>
>>> Fix flavor breakage in child cell which causes boot tests to fail.
>>> Currently the libvirt driver needs flavor.extra_specs which is not
>>> synced to the child cell. Some options are to sync flavor and extra
>>> specs to child cell db, or pass full data with the request.
>>> https://review.openstack.org/#/c/126620/1 offers a means of passing
>>> full data with the request.
>>>
>>> Determine proper switches to turn off Tempest tests for features
>>> that don't work with the goal of getting a voting job. Once this is
>>> in place we can move towards feature parity and work on internal
>>> refactorings.
>>>
>>> Work towards adding parity for host aggregates, security groups, and
>>> server groups. They should be made to work in a single cell setup,
>>> but the solution should not preclude them from being used in
>>> multiple cells. There needs to be some discussion as to whether a
>>> host aggregate or server group is a global concept or per cell
>>> concept.
>>>
>> Have there been any previous discussions on this topic? If so I'd
>> really like to read up on those to make sure I understand the pros
>> and cons before the summit session.
>
> The only discussion I'm aware of is some comments on
> https://review.openstack.org/#/c/59101/ , though they mention a
> discussion at the Utah mid-cycle.
>
> The main con I'm aware of for defining these as global concepts is
> that there is no rescheduling capability in the cells scheduler. So
> if a build is sent to a cell with a host aggregate that can't fit
> that instance, the build will fail even though there may be space in
> that host aggregate from a global perspective. That should be
> somewhat straightforward to address though.
>
> I think it makes sense to define these as global concepts.
> But these are features that aren't used with cells yet, so I haven't
> put a lot of thought into potential arguments or cases for doing this
> one way or another.
>
>
>>> Work towards merging compute/api.py and compute/cells_api.py so that
>>> developers only need to make changes/additions in one place. The
>>> goal is for as much as possible to be hidden by the RPC layer, which
>>> will determine whether a call goes to a compute/conductor/cell.
>>>
>>> For syncing data between cells, look at using objects to handle the
>>> logic of writing data to the cell/parent and then syncing the data
>>> to the other.
>>>
>> Some of that work has been done already, although in a somewhat
>> ad-hoc fashion. Were you thinking of extending objects to support
>> this natively (whatever that means), or do we continue to inline the
>> code in the existing object methods?
>
> I would prefer to have some native support for this. In general data
> is considered authoritative at the global level or the cell level.
> For example, instance data is synced down from the global level to a
> cell (except for a few fields which are synced up) but a migration
> would be synced up. I could imagine decorators that would specify how
> data should be synced and handle that as transparently as possible.
>
>>
>>> A potential migration scenario is to consider a non cells setup to
>>> be a child cell and converting to cells will mean setting up a
>>> parent cell and linking them. There are periodic tasks in place to
>>> sync data up from a child already, but a manual kick off mechanism
>>> will need to be added.
>>>
>>>
>>> Future plans:
>>>
>>> Something that has been considered, but is out of scope for now, is
>>> that the parent/api cell doesn't need the same data model as the
>>> child cell. Since the majority of what it does is act as a cache
>>> for API requests, it does not need all the data that a cell needs,
>>> and what data it does need could be stored in a form that's
>>> optimized for reads.
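For what it's worth, the decorator idea for declaring sync direction
could look something like the hand-wavy sketch below. None of these
names (syncs, SYNC_UP, SYNC_DOWN, _pending_sync) exist in Nova today;
this is just to make the shape of the idea concrete:

```python
# Hand-wavy sketch of a "sync direction" decorator for object save
# methods. All names here are invented for illustration.
import functools

SYNC_UP, SYNC_DOWN = "up", "down"


def syncs(direction):
    """Mark a save method with the direction its data should flow."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(self, *args, **kwargs):
            result = fn(self, *args, **kwargs)
            # A real implementation would enqueue a cells RPC here to
            # push the change up to the parent or down to the children.
            self._pending_sync = direction
            return result
        return wrapper
    return decorator


class Migration(object):
    @syncs(SYNC_UP)    # migrations are authoritative in the cell
    def save(self):
        pass


class Instance(object):
    @syncs(SYNC_DOWN)  # instance data is pushed down from the global level
    def save(self):
        pass


m = Migration()
m.save()
print(m._pending_sync)  # up
```

The nice property is that the sync policy lives next to the object
definition instead of being scattered through the messaging code.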
>>>
>>> Thoughts?

_______________________________________________
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev