What about the ability of a service expert to plug in a remediation module? If the remediation action succeeds, proceed; if not, stop. The remediation module could then be extended independently of the main flow.

Thanks,
Arkady
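The control flow being proposed above (validate, give a pluggable remediation action one chance to fix things, re-validate, otherwise stop) could be sketched roughly as follows. This is purely illustrative: none of these names or hooks exist in TripleO.

```python
# Hypothetical sketch of the remediation flow described above: run a
# service validation, let an optional remediation module try to fix a
# failure, re-validate, and stop the deployment if it still fails.
# All names here are invented for illustration only.

def run_step(validate, remediate=None):
    """Return True if the step passes, possibly after remediation."""
    if validate():
        return True          # validation passed, proceed with deployment
    if remediate is not None:
        remediate()          # service expert's pluggable remediation action
        if validate():       # re-check after remediation
            return True
    return False             # still failing: stop the deployment here

# Toy usage: a check that fails until remediation "starts" the service.
state = {"ovs_running": False}
ok = run_step(
    validate=lambda: state["ovs_running"],
    remediate=lambda: state.update(ovs_running=True),
)
```

With no remediation module registered, `run_step` simply reduces to the fail-fast behaviour discussed in the thread below.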
-----Original Message-----
From: Steven Hardy [mailto:sha...@redhat.com]
Sent: Wednesday, July 27, 2016 3:26 AM
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [tripleo] service validation during deployment steps

Hi Emilien,

On Tue, Jul 26, 2016 at 03:59:33PM -0400, Emilien Macchi wrote:
> I would love to hear some feedback about $topic, thanks.

Sorry for the slow response, we did discuss this on IRC, but providing
that feedback and some other comments below:

> On Fri, Jul 15, 2016 at 11:31 AM, Emilien Macchi wrote:
> > Hi,
> >
> > Some people in the field brought interesting feedback:
> >
> > "As a TripleO User, I would like the deployment to stop immediately
> > after a resource creation failure during a step of the deployment
> > and be able to easily understand what service or resource failed to
> > be installed".
> >
> > Example:
> > If during step4 Puppet tries to deploy Neutron and OVS, but OVS
> > fails to start for some reason, deployment should stop at the end
> > of the step.

I don't think anyone will argue against this use-case; we absolutely
want to enable a better "fail fast" for deployment problems, as well as
better surfacing of why the deployment failed.

> > So there are 2 things in this user story:
> >
> > 1) Be able to run some service validation within a step deployment.
> > Note about the implementation: make the validation composable per
> > service (OVS, nova, etc.) and not per role (compute, controller, etc.).

+1, now that we have composable services we need any validations to be
associated with the services, not the roles.
That said, it's fairly easy to imagine an interface like
step_config/config_settings could be used to wire in composable service
validations on a per-role basis, e.g. similar to what we do here, but
per-step:

https://github.com/openstack/tripleo-heat-templates/blob/master/overcloud.yaml#L1144

Similar to what was proposed (but never merged) here:

https://review.openstack.org/#/c/174150/15/puppet/controller-post-puppet.yaml

> > 2) Make this information readable and easy to access and understand
> > for our users.
> >
> > I have a proof-of-concept for 1) and partially 2), with the example
> > of OVS: https://review.openstack.org/#/c/342202/
> > This patch will make sure OVS is actually usable at step 4 by
> > running 'ovs-vsctl show' during the Puppet catalog run and, if it's
> > working, it will create a Puppet anchor. This anchor is currently
> > not useful but could be in future if we want to rely on it for
> > orchestration.
> > I wrote the service validation in Puppet 2 years ago when doing
> > Spinal Stack with eNovance:
> > https://github.com/openstack/puppet-openstacklib/blob/master/manifests/service_validation.pp
> > I think we could re-use it very easily; it has been proven to work.
> > Also, the code is within our Puppet profiles, so it's composable by
> > design and we don't need to wire up our current services with any
> > magic. Validation will reside within the Puppet manifests.
> > If you look at my PoC, this code could even live in puppet-vswitch
> > itself (we already have this code for puppet-nova, and some others).

I think having the validations inside the puppet implementation is OK,
but ideally I think we do want it to be part of the puppet modules
themselves (not part of the puppet-tripleo abstraction layer). The
issue I'd have with putting it in puppet-tripleo is that if we're going
to do this in a tripleo specific way, it should probably be done via a
method that's more config tool agnostic.
Otherwise we'll have to recreate the same validations for future
implementations (I'm thinking specifically about containers here, and
possibly ansible[1]).

So, in summary, I'm +1 on getting this integrated if it can be done
with little overhead and it's something we can leverage via the puppet
modules vs puppet-tripleo.

> > Ok now, what if validation fails?
> > I'm testing it here: https://review.openstack.org/#/c/342205/
> > If you look at /var/log/messages, you'll see:
> >
> > Error: /Stage[main]/Tripleo::Profile::Base::Neutron::Ovs/Openstacklib::Service_validation[openvswitch]/Exec[execute
> > openvswitch validation]/returns: change from notrun to 0 failed
> >
> > So it's pretty clear from looking at the logs that the openvswitch
> > service validation failed and something is wrong. You'll also
> > notice in the logs that the deployment stopped at step 4 since OVS
> > is not considered to be running.
> > This partially addresses 2), because we still need to make it more
> > explicit and readable. Dan Prince had the idea to use
> > https://github.com/ripienaar/puppet-reportprint to print a nice
> > report of the Puppet catalog result (we haven't tried it yet). We
> > could also use Operational Tools later to monitor Puppet logs and
> > find service validation failures.

This all sounds good, but we do need to think beyond the puppet
implementation, e.g. how will we enable similar validations in a
container based deployment?

I remember SpinalStack also used serverspec; can you describe the
differences between using that tool (was it only used for post-deploy
validation of the whole server, not per-step validation?)

I'm just wondering if the overhead of integrating per-service
validations via a more generic tool (not necessarily serverspec but
something like it, e.g. I've been looking at testinfra, which is a more
python based tool aiming to do similar things[2]) would be worth it?
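To make the config-tool-agnostic idea concrete, a per-step, per-service validation runner could be as simple as the sketch below: each service registers check commands keyed by step, and the runner reports which checks failed so the deployment can fail fast at the end of that step. This interface is entirely hypothetical and not part of TripleO; it only illustrates the shape of the data being discussed.

```python
# Illustrative sketch only: a config-tool-agnostic validation runner.
# Each service contributes a mapping of step number -> list of shell
# check commands (e.g. Emilien's 'ovs-vsctl show' PoC would register
# under openvswitch at step 4).  Nothing here exists in TripleO.
import subprocess

def run_validations(validations, step):
    """Run every check registered for `step`; return failing (service, cmd)."""
    failures = []
    for service, steps in sorted(validations.items()):
        for cmd in steps.get(step, []):
            result = subprocess.run(cmd, shell=True)
            if result.returncode != 0:
                failures.append((service, cmd))
    return failures

# Toy usage, with /bin/true and /bin/false standing in for real checks.
validations = {
    "openvswitch": {4: ["true"]},
    "neutron": {4: ["false"]},   # simulate a failing check at step 4
}
failing = run_validations(validations, step=4)
```

Because the checks are just commands, the same registry could in principle be driven by puppet, a container entrypoint, or ansible, which is the feature-gap concern raised above.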
Maybe this is complementary to any per-step validation done inside the
puppet modules, but we could do something like:

  outputs:
    role_data:
      description: Role data for the Heat Engine role.
      value:
        service_name: heat-api
        config_settings:
          heat::: foo
        step_config: |
          include ::tripleo::profile::base::heat::api
        validation:
          group: serverspec
          config:
            step_4: |
              Package "openstack-heat-api"
                should be installed
              Service "openstack-heat-api"
                should be enabled
                should be running
              Port "8004"
                should be listening

Looking at the WIP container composable services patch[3], this can
probably be directly reused:

  outputs:
    role_data:
      description: Role data for the Keystone API role.
      value:
        config_settings:
        step_config:
        puppet_tags: keystone_config
        docker_config:
          keystone:
            container_step_config: 1
            image:
              list_join:
                - '/'
                - [ {get_param: DockerNamespace}, {get_param: DockerKeystoneImage} ]
            net: host
            privileged: false
            restart: always
            volumes:
              - /run:/run
              - /var/lib/etc-data/json-config/keystone.json:/var/lib/kolla/config_files/keystone.json
            environment:
              - KOLLA_CONFIG_STRATEGY=COPY_ALWAYS
        validation:
          group: serverspec
          config:
            step_1: |
              Service "httpd"
                should be enabled
                should be running
              Port "5000"
                should be listening
              Port "35357"
                should be listening

Anyway, just some ideas there - I'm not opposed to what you suggest re
the puppet validations, but I'm very aware that we'll be left with a
feature gap if we *only* do that, then (pretty soon) enable fully
containerized deployments.

Thanks,

Steve

[1] http://lists.openstack.org/pipermail/openstack-dev/2016-July/099564.html
[2] https://testinfra.readthedocs.io/en/latest/
[3] https://review.openstack.org/#/c/330659/12/docker/services/keystone.yaml

> >
> > So this email is a bootstrap of the discussion; it's open for
> > feedback. Don't take my PoC as something we'll implement. It's an
> > idea and I think it's worth looking at.
> > I like it for 2 reasons:
> > - the validation code resides within our profiles, so it's
> > composable by design.
> > - it's flexible and allows us to test everything. It can be a bash
> > script, a shell command, a Puppet resource (provider, service,
> > etc.).
> >
> > Thanks for reading so far,
> > --
> > Emilien Macchi
>
> --
> Emilien Macchi
>
> __________________________________________________________________________
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

--
Steve Hardy
Red Hat Engineering, Cloud