On 03/16/2016 11:53 AM, Vladimir Kuklin wrote:
> Folks
>
> While I generally support the idea of getting rid of cluster status,
> this requires thorough design. My opinion here is that we should leave
> it as a function of nodes' state until we come up with a better way of
> calculating cluster status. Nevertheless, it is true that cluster
> status is actually a function of other primary data and should be
> calculated on the client side. I suggest that we move towards a more
> fine-grained, component-based architecture (the simplest example is
> OpenStack Fuel vs non-OpenStack Fuel) and figure out a way of
> calculating each component's status. The cluster status should then be
> an aggregate of those. For example, we could say that the only
> components we have right now are nodes, and the aggregate is based on
> the nodes' status and whether they are critical or not.
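
To make the aggregation idea quoted above a bit more concrete, here is a minimal sketch of a status computed purely from node data. It assumes nodes are the only component and that each node carries a status string and a "critical" flag. The names `Node`, `aggregate_status` and the status values are hypothetical and are not taken from the Fuel code base.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Node:
    name: str
    status: str     # hypothetical values: "ready", "deploying", "error"
    critical: bool  # True if the environment cannot work without this node


def aggregate_status(nodes: Iterable[Node]) -> str:
    """Derive one status string from the nodes (illustrative rules only)."""
    nodes = list(nodes)
    # A failed critical node fails the whole aggregate.
    if any(n.status == "error" and n.critical for n in nodes):
        return "error"
    # Otherwise, any node still being worked on means work is in progress.
    if any(n.status == "deploying" for n in nodes):
        return "deploying"
    # Failed non-critical nodes only degrade the result.
    if any(n.status == "error" for n in nodes):
        return "degraded"
    return "ready"


if __name__ == "__main__":
    env = [
        Node("controller-1", "ready", critical=True),
        Node("compute-1", "error", critical=False),
    ]
    print(aggregate_status(env))
```

Because the result is a pure function of node data, it could just as well be calculated on the client side, as suggested in the quoted text.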

I believe the cluster status should be renamed to the deployment
status. It has nothing to do with the real *cluster* status, which can
only be figured out by LMA tools.

>
> On Tue, Mar 15, 2016 at 9:16 PM, Andrew Woodward <xar...@gmail.com
> <mailto:xar...@gmail.com>> wrote:
>
>     On Tue, Mar 15, 2016 at 4:04 AM Roman Prykhodchenko
>     <m...@romcheg.me <mailto:m...@romcheg.me>> wrote:
>
>         Fuelers,
>
>         I would like to continue the series of "Getting rid of …"
>         emails. This time I’d like to talk about statuses of clusters.
>
>         The issue with that attribute is that it is not really related
>         to the real world and represents nothing. A few months ago I
>         proposed to make it more real-world-like [1] by replacing a
>         simple string with an aggregated value. However, after
>         task-based deployment was introduced, even that approach lost
>         its connection to the real world.
>
>         My idea is to remove that attribute from a cluster and start
>         working with the status of every single node in it.
>         Nevertheless, we only have tasks that are executed on nodes
>         now, so we cannot apply the "status" term to them. What if we
>         replace that with a sort of boolean value called
>         maintenance_mode (or similar) that we will use to tell whether
>         the node is operational or not? After that we will be able to
>         use an aggregated property for the cluster and check whether
>         any nodes are in the process of having tasks performed on
>         them.
>
>     Yes, we still need an operations attribute, and I'm not sure a
>     bool is enough. But you are quite correct: setting the status of
>     the cluster after operational == True based on the result of a
>     specific node failing is, in practice, invalid.
>
>     At the same time, operational == True does not necessarily mean
>     "deployment succeeded"; it's more along the lines of "deployment
>     validated", which may mean further testing passing, like OSTF, or
>     something more manual, where the operator wants to do more testing
>     of their own prior to changing the state.
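
The per-node flag and the aggregated cluster property proposed in the quoted text could look roughly like the sketch below. Only the name `maintenance_mode` comes from the thread; `Node`, `Cluster` and `has_nodes_in_maintenance` are hypothetical, and the flag is assumed to be True while tasks are being executed on a node.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    name: str
    # Assumption: True while tasks are being executed on the node,
    # i.e. the node should not be treated as operational.
    maintenance_mode: bool = False


@dataclass
class Cluster:
    nodes: List[Node] = field(default_factory=list)

    @property
    def has_nodes_in_maintenance(self) -> bool:
        """Aggregated property: is any node currently being worked on?"""
        return any(node.maintenance_mode for node in self.nodes)


if __name__ == "__main__":
    cluster = Cluster([Node("node-1"), Node("node-2", maintenance_mode=True)])
    print(cluster.has_nodes_in_maintenance)
```

Nothing is stored on the cluster itself here; the property is recomputed from the nodes each time it is read, so it cannot drift away from what the nodes report.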
>
>     As we venture into the LCM flow, we actually need the status of
>     each component in addition to the general status of the cluster to
>     determine the proper course of action on the next operation.
>
>     For example nova-compute
>     If the cluster is not operational, then we can provision compute
>     nodes and have them enabled, or active in the scheduler,
>     automatically. However, if the cluster is operational, a new
>     compute node must be disabled, or otherwise blocked from the
>     default scheduler, until the node has received validation. In this
>     case the interpretation of operational is quite simple.
>
>     For example ceph
>     Here we care less about the status of the cluster (slightly; this
>     example ignores ceph's impact on nova-compute) and more about the
>     status of the service. In the case that we deploy ceph-osd's when
>     there are fewer OSD hosts online than the replica factor (3), we
>     can provision the OSDs similarly to nova-compute, in that we can
>     bring them all online and active, and data could be placed on them
>     immediately (more or less). But if the ceph status is operational,
>     then we have to take a different action: the OSDs have to be
>     brought in disabled and gradually (probably by the operator) have
>     their data weight increased, so they don't clog the network with
>     data peering, which causes the clients many woes.
>
>         Thoughts, ideas?
>
>         References:
>
>         1. https://blueprints.launchpad.net/fuel/+spec/complex-cluster-status
>
>         - romcheg
>
>         __________________________________________________________________________
>         OpenStack Development Mailing List (not for usage questions)
>         Unsubscribe:
>         openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
>         <http://openstack-dev-requ...@lists.openstack.org?subject:unsubscribe>
>         http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
>     --
>     Andrew Woodward
>     Mirantis
>     Fuel Community Ambassador
>     Ceph Community
>
> --
> Yours Faithfully,
> Vladimir Kuklin,
> Fuel Library Tech Lead,
> Mirantis, Inc.
> +7 (495) 640-49-04
> +7 (926) 702-39-68
> Skype kuklinvv
> 35bk3, Vorontsovskaya Str.
> Moscow, Russia,
> www.mirantis.com <http://www.mirantis.ru/>
> www.mirantis.ru <http://www.mirantis.ru/>
> vkuk...@mirantis.com <mailto:vkuk...@mirantis.com>

--
Best regards,
Bogdan Dobrelya,
Irc #bogdando
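
The nova-compute and ceph examples quoted above boil down to a per-component decision about how a newly provisioned node joins. The sketch below is only an illustration of that reasoning. All names (`OsdPlan`, `compute_enabled_on_join`, `plan_new_osd`) are hypothetical, and treating "ceph is operational" as "at least replica-factor OSD hosts are online" is an assumption made for the example, not something the thread defines.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OsdPlan:
    enabled: bool          # bring the OSD in active right away?
    initial_weight: float  # starting data weight (1.0 means full weight)


def compute_enabled_on_join(cluster_operational: bool) -> bool:
    """A new compute node is enabled in the scheduler automatically only
    while the cluster is not yet operational; otherwise it stays disabled
    until the node has been validated."""
    return not cluster_operational


def plan_new_osd(osd_hosts_online: int, replica_factor: int = 3) -> OsdPlan:
    """Decide how a new OSD joins, based on the ceph service status."""
    # Assumption: the service counts as operational once enough OSD
    # hosts are online to satisfy the replica factor.
    ceph_operational = osd_hosts_online >= replica_factor
    if ceph_operational:
        # Join disabled with zero weight; the weight is then raised
        # gradually (probably by the operator) to limit peering traffic.
        return OsdPlan(enabled=False, initial_weight=0.0)
    # Initial deployment: bring everything online and active at once.
    return OsdPlan(enabled=True, initial_weight=1.0)


if __name__ == "__main__":
    print(compute_enabled_on_join(cluster_operational=True))
    print(plan_new_osd(osd_hosts_online=5))
```

The two functions take different inputs on purpose: the compute decision depends on the cluster-wide flag, while the OSD decision depends on the status of the ceph service itself.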