> -----Original Message-----
> From: Matt Riedemann [mailto:mrie...@linux.vnet.ibm.com]
> Sent: 24 September 2015 16:59
> To: openstack-dev@lists.openstack.org
> Subject: Re: [openstack-dev] [nova][cinder] how to handle AZ bug 1496235?
>
> On 9/24/2015 9:06 AM, Matt Riedemann wrote:
> >
> > On 9/24/2015 3:19 AM, Sylvain Bauza wrote:
> >>
> >> On 24/09/2015 09:04, Duncan Thomas wrote:
> >>> Hi
> >>>
> >>> I thought I was late to this thread, but looking at the time stamps,
> >>> it is just something that escalated very quickly. I am honestly
> >>> surprised a cross-project interaction option went from 'we don't
> >>> seem to understand this' to 'deprecation merged' in 4 hours, with
> >>> only a 12-hour discussion on the mailing list, right at the end of a
> >>> cycle when we're supposed to be stabilising features.
> >>
> >> So, I agree it was maybe a bit too quick, hence the revert. That said,
> >> Nova master is now Mitaka, which means that the deprecation change
> >> was made for the next cycle, not the one currently stabilising.
> >>
> >> Anyway, I'm really all for discussing why Cinder needs to know the
> >> Nova AZs.
> >>
> >>> I proposed a session at the Tokyo summit for a discussion of Cinder
> >>> AZs, since there was clear confusion about what they are intended
> >>> for and how they should be configured.
> >>
> >> Cool, count me in from the Nova standpoint.
> >>
> >>> Since then I've reached out to, and gotten good feedback from, a
> >>> number of operators. There are two distinct configurations for AZ
> >>> behaviour in cinder, and both sort-of worked until very recently.
> >>>
> >>> 1) No AZs in cinder
> >>> This is the config where there is a single 'blob' of storage (most
> >>> of the operators who responded so far are using Ceph, though that
> >>> isn't required). The storage takes care of availability concerns,
> >>> and any AZ info from nova should just be ignored.
> >>>
> >>> 2) Cinder AZs map to Nova AZs
> >>> In this case, some combination of storage / networking / etc.
> >>> couples storage to nova AZs. It may be that an AZ is used as a unit
> >>> of scaling, or it could be a real storage failure domain. Either
> >>> way, there are a number of operators who have this configuration and
> >>> want to keep it. Storage can certainly have a failure domain, and
> >>> limiting the scalability problem of storage to a single compute AZ
> >>> can have definite advantages in failure scenarios. These people do
> >>> not want cross-az attach.
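> >>>
> >>> (To check which of these two configurations a given cloud is in,
> >>> something like the following rough sketch works - assuming
> >>> already-authenticated python-novaclient and python-cinderclient
> >>> handles; classify_az_config is made up for illustration, and 'nova'
> >>> is just cinder's default storage_availability_zone:)
> >>>
> >>>     def classify_az_config(nova, cinder):
> >>>         # Both clients expose an availability_zones manager whose
> >>>         # entries carry a zoneName attribute.
> >>>         nova_azs = {az.zoneName
> >>>                     for az in nova.availability_zones.list()}
> >>>         cinder_azs = {az.zoneName
> >>>                       for az in cinder.availability_zones.list()}
> >>>         if not cinder_azs or cinder_azs == {'nova'}:
> >>>             return 'config 1: no real AZs in cinder'
> >>>         if cinder_azs == nova_azs:
> >>>             return 'config 2: cinder AZs map to nova AZs'
> >>>         return 'mismatch: nova %s vs cinder %s' % (
> >>>             sorted(nova_azs), sorted(cinder_azs))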
> >>
> >> Ahem, Nova AZs are not failure domains - I mean in the current
> >> implementation, in the sense most people understand a failure
> >> domain, i.e. a physical unit of machines (a bay, a room, a floor, a
> >> datacenter). All the AZs in Nova share the same control plane, with
> >> the same message queue and database, which means that one failure can
> >> be propagated to the other AZ.
> >>
> >> To be honest, there is one very specific use case where AZs *are*
> >> failure domains: when cells exactly match AZs (i.e. one AZ grouping
> >> all the hosts behind one cell). That's the very specific use case
> >> that Sam is mentioning in his email, and I certainly understand we
> >> need to keep that.
> >>
> >> Nova AZs are pretty well explained in a quite old blogpost:
> >> http://blog.russellbryant.net/2013/05/21/availability-zones-and-host-aggregates-in-openstack-compute-nova/
> >>
> >> We also added a few comments in our developer doc here:
> >> http://docs.openstack.org/developer/nova/aggregates.html#availability-zones-azs
> >>
> >> tl;dr: AZs are aggregate metadata that makes those aggregates of
> >> compute nodes visible to the users. Nothing more than that, no magic
> >> sauce. It's just a logical abstraction that can map to your physical
> >> deployment, but, like I said, one which shares the same bus and DB.
> >> Of course, you could still provide distinct networks between AZs, but
> >> that just gives you L2 isolation, not a real failure domain in a
> >> Business Continuity Plan sense.
> >>
> >> What puzzles me is how Cinder manages datacenter-level isolation
> >> given there is no cells concept AFAIK. I assume that cinder-volumes
> >> belong to a specific datacenter, but how is its control plane
> >> managed? I can certainly understand the need for affinity placement
> >> between physical units, but I'm missing that piece, and consequently
> >> I wonder why Nova needs to provide AZs to Cinder in the general case.
> >>
> >>> My hope at the summit session was to agree on these two
> >>> configurations, discuss any scenarios not covered by them, and nail
> >>> down the changes we need to get these to work properly. There's
> >>> definitely been interest and activity in the operator community in
> >>> making nova and cinder AZs interact, and every desired interaction
> >>> I've gotten details about so far matches one of the above models.
> >>
> >> I'm all with you about providing a way for users to get volume
> >> affinity for Nova. That's a long story I'm trying to consider, and we
> >> are constantly trying to improve the nova scheduler interfaces so
> >> that other projects can provide resources to the nova scheduler for
> >> decision making. I just want to consider whether AZs are the best
> >> concept for that or whether we should do it some other way (again,
> >> because AZs are not what people expect).
> >>
> >> Again, count me in for the Cinder session, and just let me know when
> >> the session is planned so I can attend it.
> >>
> >> -Sylvain
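> >
> > (To make Sylvain's tl;dr concrete: a host's AZ really is just
> > aggregate metadata. A simplified sketch of the lookup - not the actual
> > nova code, and DEFAULT_AZ stands in for nova's
> > default_availability_zone option:)
> >
> >     DEFAULT_AZ = 'nova'
> >
> >     def availability_zone_for_host(host, aggregates):
> >         # aggregates: iterable of (hosts, metadata) pairs, as nova
> >         # stores them; the first aggregate that contains the host and
> >         # sets an 'availability_zone' key wins.
> >         for hosts, metadata in aggregates:
> >             if host in hosts and metadata.get('availability_zone'):
> >                 return metadata['availability_zone']
> >         return DEFAULT_AZ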
> >
> > I plan on reverting the deprecation change (which was a mitaka change,
> > not a liberty change, as Sylvain pointed out).
> >
> > However, given how many nova and cinder cores were talking about this
> > yesterday and thought it was the right thing to do, it's clear this is
> > not a well understood use case (or documented at all). So as part of
> > reverting the deprecation I also want to see improved docs for the
> > cross_az_attach option itself, and probably a nova devref change
> > explaining the use cases and issues with this.
> >
> > I think the volume attach case is pretty straightforward. You create
> > a nova instance in some nova AZ x and create a cinder volume in some
> > cinder AZ y and try to attach the volume to the server instance. If
> > cinder.cross_az_attach=True this is OK, else it fails.
> >
> > The problem I have is with the boot from volume case where
> > source=(blank/image/snapshot). In those cases nova is creating the
> > volume and passing the server instance AZ to the volume create API.
> > How are people that are using cinder.cross_az_attach=False handling
> > the BFV case?
> >
> > Per bug 1496235, which started this, the user is booting a nova
> > instance in a nova AZ with bdm source=image, and when nova tries to
> > create the volume it fails because that AZ doesn't exist in cinder.
> > This fails in the compute manager when building the instance, so it
> > results in a NoValidHost error for the user - which we all know and
> > love as a super useful error. So how do we handle this case? If
> > cinder.cross_az_attach=True in nova we could just not pass the
> > instance AZ to the volume create, or only pass it if cinder has that
> > AZ available.
> >
> > But if cinder.cross_az_attach=False when creating the volume, what do
> > we do? I guess we can just leave the code as-is, and if the AZ isn't
> > in cinder (or your admin hasn't set
> > allow_availability_zone_fallback=True in cinder.conf), then it fails
> > and you open a support ticket. That seems gross to me. I'd like to at
> > least see some of this validated in the nova API layer before it gets
> > to the scheduler and compute so we can avoid NoValidHost. My thinking
> > is, in the BFV case where source != volume, if cinder.cross_az_attach
> > is False and instance.az is not None, then we check the list of AZs
> > from the volume API. If the instance.az is not in that list, we fail
> > fast (400 response to the user). However, if
> > allow_availability_zone_fallback=True in cinder.conf, we'd be
> > rejecting the request even though the actual volume create would
> > succeed. These are just details that we don't have in the nova API,
> > since it's all policy-driven gorp using config options that the user
> > doesn't know about, which makes it really hard to write applications
> > against this - and was part of the reason I moved to deprecate that
> > option. There's a rough sketch of this check at the end of this
> > message.
> >
> > Am I off in the weeds? It sounds like Duncan is going to try and get
> > a plan together in Tokyo about how to handle this and decouple nova
> > and cinder in this case, which is the right long-term goal.
> >
> > Revert is approved: https://review.openstack.org/#/c/227340/
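> >
> > (For concreteness, a rough sketch of the fail-fast check described
> > above - hypothetical code, not what's in nova today; validate_bfv_az
> > and its signature are made up for illustration, and 'cinder' is an
> > authenticated python-cinderclient handle:)
> >
> >     def validate_bfv_az(instance_az, cinder, cross_az_attach):
> >         # Only meaningful for boot-from-volume with source != volume,
> >         # where nova itself creates the volume in the instance's AZ.
> >         if cross_az_attach or instance_az is None:
> >             return
> >         cinder_azs = {az.zoneName
> >                       for az in cinder.availability_zones.list()}
> >         if instance_az not in cinder_azs:
> >             # Caveat from above: if cinder.conf sets
> >             # allow_availability_zone_fallback=True, the volume
> >             # create would actually succeed, so this check can
> >             # reject otherwise-valid requests.
> >             raise ValueError('400: availability zone %s does not '
> >                              'exist in the volume service'
> >                              % instance_az)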

Matt,

Thanks for reverting the change.

Is there a process description for deprecating features? It would be good for it to include:

- notification of operators (on the operators' list) and an agreed time to reply
- documentation of a workaround for those who are using a deprecated feature in production

Thanks,
Tim

> --
>
> Thanks,
>
> Matt Riedemann