Re: [openstack-dev] [TripleO][heat] a small experiment with Ansible in TripleO

2014-08-11 Thread Clint Byrum
Excerpts from Steve Baker's message of 2014-08-10 15:33:26 -0700:
> On 02/08/14 04:07, Allison Randal wrote:
> > A few of us have been independently experimenting with Ansible as a
> > backend for TripleO, and have just decided to try experimenting
> > together. I've chatted with Robert, and he says that TripleO was always
> > intended to have pluggable backends (CM layer), and just never had
> > anyone interested in working on them. (I see it now, even in the early
> > docs and talks, I guess I just couldn't see the forest for the trees.)
> > So, the work is in line with the overall goals of the TripleO project.
> >
> > We're starting with a tiny scope, focused only on updating a running
> > TripleO deployment, so our first work is in:
> >
> > - Create an Ansible Dynamic Inventory plugin to extract metadata from Heat
> > - Improve/extend the Ansible nova_compute Cloud Module (or create a new
> > one), for Nova rebuild
> > - Develop a minimal handoff from Heat to Ansible, particularly focused
> > on the interactions between os-collect-config and Ansible
> >
> > We're merging our work in this repo, until we figure out where it should
> > live:
> >
> > https://github.com/allisonrandal/tripleo-ansible
> >
> > We've set ourselves one week as the first sanity-check to see whether
> > this idea is going anywhere, and we may scrap it all at that point. But,
> > it seems best to be totally transparent about the idea from the start,
> > so no-one is surprised later.
> >
> Having pluggable backends for configuration seems like a good idea, and
> Ansible is a great choice for the first alternative backend.
> 

TripleO is intended to be loosely coupled across many of its components,
not just in-instance configuration.

> However what this repo seems to be doing at the moment is bypassing heat
> to do a stack update, and I can only assume there is an eventual goal to
> not use heat at all for stack orchestration too.
>
>
> Granted, until blueprint update-failure-recovery lands[1] then doing a
> stack-update is about as much fun as Russian roulette. But this effort
> is tactical rather than strategic, especially given TripleO's mission
> statement.
> 

We intend to stay modular. Ansible won't replace Heat from end to end.

Right now we're stuck with an update that just doesn't work. It isn't
just about update-failure-recovery, which is coming along nicely, but
it is also about the lack of signals to control rebuild, poor support
for addressing machines as groups, and unacceptable performance in
large stacks.

We remain committed to driving these things into Heat, which will allow
us to address these things the way a large scale operation will need to.

But until we can land those things in Heat, we need something more
flexible like Ansible to go around Heat and do things in the exact
order we need them done. Ansible doesn't have a REST API, which would
normally be a non-starter for modern automation, but right now the need
to control workflow outweighs the need for a REST API.
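
To make the dynamic inventory idea mentioned above concrete, here is a
rough sketch of the kind of script we have been playing with. Everything
specific in it is an assumption for illustration: the OS_* credentials in
the environment, the OVERCLOUD_STACK variable, and the 'controller_ips'
stack output are all made up, so treat it as a sketch rather than the
actual plugin:

#!/usr/bin/env python
# Illustrative sketch of a Heat-backed Ansible dynamic inventory.
# Assumptions: OS_* credentials come from the environment, and the stack
# exposes a hypothetical 'controller_ips' output listing server addresses.
import json
import os
import sys

from heatclient.client import Client as heat_client
from keystoneclient.v2_0 import client as ks_client


def build_inventory():
    ks = ks_client.Client(username=os.environ['OS_USERNAME'],
                          password=os.environ['OS_PASSWORD'],
                          tenant_name=os.environ['OS_TENANT_NAME'],
                          auth_url=os.environ['OS_AUTH_URL'])
    heat_url = ks.service_catalog.url_for(service_type='orchestration')
    heat = heat_client('1', endpoint=heat_url, token=ks.auth_token)

    stack = heat.stacks.get(os.environ.get('OVERCLOUD_STACK', 'overcloud'))
    outputs = dict((o['output_key'], o['output_value'])
                   for o in (stack.outputs or []))

    # Ansible expects JSON of the form {"group": {"hosts": [...]}, ...}
    return {'controllers': {'hosts': outputs.get('controller_ips', [])},
            '_meta': {'hostvars': {}}}


if __name__ == '__main__':
    if len(sys.argv) > 1 and sys.argv[1] == '--list':
        print(json.dumps(build_inventory()))
    else:
        # --host <name>: nothing extra per host in this sketch.
        print(json.dumps({}))

Ansible invokes a script like this with --list and gets its host groups
straight from the Heat stack, so there is no static inventory to maintain.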

> If I were to use Ansible for TripleO configuration I would start with
> something like the following:
> * Install an ansible software-config hook onto the image to be triggered
> by os-refresh-config[2][3]
> * Incrementally replace StructuredConfig resources in
> tripleo-heat-templates with SoftwareConfig resources that include the
> ansible playbooks via get_file
> * The above can start in a fork of tripleo-heat-templates, but can
> eventually be structured using resource providers so that the deployer
> chooses what configuration backend to use by selecting the environment
> file that contains the appropriate config resources
> 
> Now you have a cloud orchestrated by heat and configured by Ansible. If
> it is still deemed necessary to do an out-of-band update to the stack
> then you're in a much better position to do an ansible push, since you
> can use the same playbook files that heat used to bring up the stack.
> 

That would be a good plan if we wanted to fix issues with os-*-config,
but our situation is the opposite: we are working around Heat
orchestration issues with Ansible.



Re: [openstack-dev] [TripleO][heat] a small experiment with Ansible in TripleO

2014-08-11 Thread Clint Byrum
Excerpts from Zane Bitter's message of 2014-08-11 08:16:56 -0700:
> On 11/08/14 10:46, Clint Byrum wrote:
> > Right now we're stuck with an update that just doesn't work. It isn't
> > just about update-failure-recovery, which is coming along nicely, but
> > it is also about the lack of signals to control rebuild, poor support
> > for addressing machines as groups, and unacceptable performance in
> > large stacks.
> 
> Are there blueprints/bugs filed for all of these issues?
> 

Convergence addresses the poor performance for large stacks in general.
We also have this:

https://bugs.launchpad.net/heat/+bug/1306743

Which shows how slow metadata access can get. I have worked on patches
but haven't been able to complete them. We made big strides but we are
at a point where 40 nodes polling Heat every 30s is too much for one CPU
to handle. When we scaled Heat out onto more CPUs on one box by forking
we ran into eventlet issues. We also ran into issues because even with
many processes we can only use one to resolve templates for a single
stack during update, which was also excessively slow.

We haven't been able to come back around to those yet, but you can see
where this has turned into a bit of a rat hole of optimization.

action-aware-sw-config is sort of what we want for rebuild. We
collaborated with the trove devs on how to also address it for resize
a while back but I have lost track of that work as it has taken a back
seat to more pressing issues.

Addressing groups is a general problem that I've had a hard time
articulating in the past. Tomas Sedovic has done a good job with this
TripleO spec, but I don't know that we've asked for an explicit change
in a bug or spec in Heat just yet:

https://review.openstack.org/#/c/97939/

There are a number of other issues noted in that spec which are already
addressed in Heat, but require refactoring in TripleO's templates and
tools, and that work continues.

The point remains: we need something that works now, and doing an
alternate implementation for updates is actually faster than addressing
all of these issues.



Re: [openstack-dev] [TripleO][heat] a small experiment with Ansible in TripleO

2014-08-11 Thread Clint Byrum
Excerpts from Steven Hardy's message of 2014-08-11 11:40:07 -0700:
> On Mon, Aug 11, 2014 at 11:20:50AM -0700, Clint Byrum wrote:
> > Excerpts from Zane Bitter's message of 2014-08-11 08:16:56 -0700:
> > > On 11/08/14 10:46, Clint Byrum wrote:
> > > > Right now we're stuck with an update that just doesn't work. It isn't
> > > > just about update-failure-recovery, which is coming along nicely, but
> > > > it is also about the lack of signals to control rebuild, poor support
> > > > for addressing machines as groups, and unacceptable performance in
> > > > large stacks.
> > > 
> > > Are there blueprints/bugs filed for all of these issues?
> > > 
> > 
> > Convergence addresses the poor performance for large stacks in general.
> > We also have this:
> > 
> > https://bugs.launchpad.net/heat/+bug/1306743
> > 
> > Which shows how slow metadata access can get. I have worked on patches
> > but haven't been able to complete them. We made big strides but we are
> > at a point where 40 nodes polling Heat every 30s is too much for one CPU
> > to handle. When we scaled Heat out onto more CPUs on one box by forking
> > we ran into eventlet issues. We also ran into issues because even with
> > many processes we can only use one to resolve templates for a single
> > stack during update, which was also excessively slow.
> 
> Related to this, and a discussion we had recently at the TripleO meetup is
> this spec I raised today:
> 
> https://review.openstack.org/#/c/113296/
> 
> It's following up on the idea that we could potentially address (or at
> least mitigate, pending the fully convergence-ified heat) some of these
> scalability concerns, if TripleO moves from the one-giant-template model
> to a more modular nested-stack/provider model (e.g what Tomas has been
> working on)
> 
> I've not got into enough detail on that yet to be sure if it's achievable
> for Juno, but it seems initially to be complex-but-doable.
> 
> I'd welcome feedback on that idea and how it may fit in with the more
> granular convergence-engine model.
> 
> Can you link to the eventlet/forking issues bug please?  I thought since
> bug #1321303 was fixed that multiple engines and multiple workers should
> work OK, and obviously that being true is a precondition to expending
> significant effort on the nested stack decoupling plan above.
> 

That was the issue. So we fixed that bug, but we never un-reverted
the patch that forks enough engines to use up all the CPUs on a box
by default. That would likely help a lot with metadata access speed
(we could manually do it in TripleO but we tend to push defaults. :)
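
For anyone not familiar with that patch, the shape of it is just the
classic fork-one-worker-per-CPU model, roughly like this (an illustrative
sketch of the pattern only, not Heat's actual service code):

# Illustrative sketch of the pre-fork pattern only; this is not Heat's code.
import multiprocessing
import os


def engine_worker():
    # In the real service this would be an RPC worker loop; here it is a
    # stand-in so the forking shape is visible.
    print('engine worker running in pid %d' % os.getpid())


def main():
    children = []
    for _ in range(multiprocessing.cpu_count()):
        proc = multiprocessing.Process(target=engine_worker)
        proc.start()
        children.append(proc)
    for proc in children:
        proc.join()


if __name__ == '__main__':
    main()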



Re: [openstack-dev] [TripleO][heat] a small experiment with Ansible in TripleO

2014-08-11 Thread Clint Byrum
Excerpts from Zane Bitter's message of 2014-08-11 13:35:44 -0700:
> On 11/08/14 14:49, Clint Byrum wrote:
> > Excerpts from Steven Hardy's message of 2014-08-11 11:40:07 -0700:
> >> On Mon, Aug 11, 2014 at 11:20:50AM -0700, Clint Byrum wrote:
> >>> Excerpts from Zane Bitter's message of 2014-08-11 08:16:56 -0700:
> >>>> On 11/08/14 10:46, Clint Byrum wrote:
> >>>>> Right now we're stuck with an update that just doesn't work. It isn't
> >>>>> just about update-failure-recovery, which is coming along nicely, but
> >>>>> it is also about the lack of signals to control rebuild, poor support
> >>>>> for addressing machines as groups, and unacceptable performance in
> >>>>> large stacks.
> >>>>
> >>>> Are there blueprints/bugs filed for all of these issues?
> >>>>
> >>>
> >>> Convergence addresses the poor performance for large stacks in general.
> >>> We also have this:
> >>>
> >>> https://bugs.launchpad.net/heat/+bug/1306743
> >>>
> >>> Which shows how slow metadata access can get. I have worked on patches
> >>> but haven't been able to complete them. We made big strides but we are
> >>> at a point where 40 nodes polling Heat every 30s is too much for one CPU
> 
> This sounds like the same figure I heard at the design summit; did the 
> DB call optimisation work that Steve Baker did immediately after that 
> not have any effect?
> 

Steve's work got us to 40. From 7.

> >>> to handle. When we scaled Heat out onto more CPUs on one box by forking
> >>> we ran into eventlet issues. We also ran into issues because even with
> >>> many processes we can only use one to resolve templates for a single
> >>> stack during update, which was also excessively slow.
> >>
> >> Related to this, and a discussion we had recently at the TripleO meetup is
> >> this spec I raised today:
> >>
> >> https://review.openstack.org/#/c/113296/
> >>
> >> It's following up on the idea that we could potentially address (or at
> >> least mitigate, pending the fully convergence-ified heat) some of these
> >> scalability concerns, if TripleO moves from the one-giant-template model
> >> to a more modular nested-stack/provider model (e.g what Tomas has been
> >> working on)
> >>
> >> I've not got into enough detail on that yet to be sure if it's achievable
> >> for Juno, but it seems initially to be complex-but-doable.
> >>
> >> I'd welcome feedback on that idea and how it may fit in with the more
> >> granular convergence-engine model.
> >>
> >> Can you link to the eventlet/forking issues bug please?  I thought since
> >> bug #1321303 was fixed that multiple engines and multiple workers should
> >> work OK, and obviously that being true is a precondition to expending
> >> significant effort on the nested stack decoupling plan above.
> >>
> >
> > That was the issue. So we fixed that bug, but we never un-reverted
> > the patch that forks enough engines to use up all the CPUs on a box
> > by default. That would likely help a lot with metadata access speed
> > (we could manually do it in TripleO but we tend to push defaults. :)
> 
> Right, and we decided we wouldn't because it's wrong to do that to 
> people by default. In some cases the optimal running configuration for 
> TripleO will differ from the friendliest out-of-the-box configuration 
> for Heat users in general, and in those cases - of which this is one - 
> TripleO will need to specify the configuration.
> 

Whether or not the default should be to fork 1 process per CPU is a
debate for another time. The point is, we can safely use the forking in
Heat now to perhaps improve performance of metadata polling.

Chasing that, and other optimizations, has not led us to a place where
we can get to, say, 100 real nodes _today_. We're chasing another way to
get to the scale and capability we need _today_, in much the same way
we did with merge.py. We'll find the way to get it done more elegantly
as time permits.



Re: [openstack-dev] [all] The future of the integrated release

2014-08-13 Thread Clint Byrum
Excerpts from Thierry Carrez's message of 2014-08-13 02:54:58 -0700:
> Rochelle.RochelleGrober wrote:
> > [...]
> > So, with all that prologue, here is what I propose (and please consider 
> > proposing your improvements/changes to it).  I would like to see for Kilo:
> > 
> > - IRC meetings and mailing list meetings beginning with Juno release and 
> > continuing through the summit that focus on core project needs (what 
> > Thierry call "strategic") that as a set would be considered the primary 
> > focus of the Kilo release for each project.  This could include high 
> > priority bugs, refactoring projects, small improvement projects, high 
> > interest extensions and new features, specs that didn't make it into Juno, 
> > etc.
> > - Develop the list and prioritize it into "Needs" and "Wants." Consider 
> > these the feeder projects for the two runways if you like.  
> > - Discuss the lists.  Maybe have a community vote? The vote will "freeze" 
> > the list, but as in most development project freezes, it can be a soft 
> > freeze that the core, or drivers or TC can amend (or throw out for that 
> > matter).
> > [...]
> 
> One thing we've been unable to do so far is to set "release goals" at
> the beginning of a release cycle and stick to those. It used to be
> because we were so fast moving that new awesome stuff was proposed
> mid-cycle and ended up being a key feature (sometimes THE key feature)
> for the project. Now it's because there is so much proposed no one knows
> what will actually get completed.
> 
> So while I agree that what you propose is the ultimate solution (and the
> workflow I've pushed PTLs to follow every single OpenStack release so
> far), we have struggled to have the visibility, long-term thinking and
> discipline to stick to it in the past. If you look at the post-summit
> plans and compare to what we end up in a release, you'll see quite a lot
> of differences :)
> 

I think that shows agility, and isn't actually "a problem". 6 months
is quite a long time in the future for some business models. Strategic
improvements for the project should be able to stick to a 6 month
schedule, but companies will likely be tactical about where their
developer resources are directed for feature work.

The fact that those resources land code upstream is one of the greatest
strengths of OpenStack. Any potential impact on how that happens should
be carefully considered when making any changes to process and
governance.



Re: [openstack-dev] [TripleO] fix poor tarball support in source-repositories

2014-08-15 Thread Clint Byrum
Excerpts from Brownell, Jonathan C (Corvallis)'s message of 2014-08-15 08:11:18 -0700:
> The current DIB element support for downloading tarballs via 
> "source-repository" allows an entry in the following form:
> 
> <name> tar <destination> <url>
> 
> Today, this feature is currently used only by the mysql DIB element. You can 
> see how it's used here:
> https://github.com/openstack/tripleo-image-elements/blob/master/elements/mysql/source-repository-mysql
> 
> However, the underlying diskimage-builder implementation of tarball handling 
> is rather odd and inflexible. After downloading the file (or retrieving from 
> cache) and unpacking into a tmp directory, it performs:
> 
> mv $tmp/*/* $targetdir
> 
> This does work as long as the tarball follows a structure where all its 
> files/directories are contained within a single directory, but it fails if 
> the tarball contains no subdirectories. (Even worse is when it contains some 
> files and some subdirectories, in which case the files are lost and the 
> contents of all subdirs get lumped together in the output folder.)
> 
> Since this tarball support is only used today by the mysql DIB element, I 
> would love to fix this in both diskimage-builder and tripleo-image-element by 
> changing to simply:
> 
> mv $tmp/* $targetdir
> 
> And then manually tweaking the directory structure of $targetdir from a new 
> install.d script in the mysql element to restore the desired layout.
> 
> However, it's important to note that this will break backwards compatibility 
> if tarball support is used in its current fashion by users with private DIB 
> elements.
> 
> Personally, I consider the current behavior so egregious that it really needs 
> to be fixed across the board rather than preserving backwards compatibility.
> 
> Do others agree? If not, do you have suggestions as to how to improve this 
> mechanism cleanly without sacrificing backwards compatibility?
> 

How about we add a glob to the entry, like this:

mysql tar /usr/local/mysql http://someplace/mysql.tar.gz mysql-5.*

That would result in

mv $tmp/mysql-5.*/* $targetdir

And then we would warn that assuming the glob to be '*' is deprecated,
and that the default will change in a later release.

Users who want your proposed behavior would use '.' until the default
changes. That would result in

mv $tmp/./* $targetdir



Re: [openstack-dev] [TripleO] fix poor tarball support in source-repositories

2014-08-16 Thread Clint Byrum
Excerpts from Jyoti Ranjan's message of 2014-08-16 00:57:52 -0700:
> We will have to be a little bit cautious in using globs because of their
> inherent usage patterns. E.g., files starting with '.' will not get
> matched.
> 

That is a separate bug, but I think the answer to that is to use rsync
instead of mv and globs. So this:

mv $tmp/./* $destdir

becomes this:

rsync -a --remove-source-files $tmp/. $destdir



Re: [openstack-dev] Time to Samba! :-)

2014-08-16 Thread Clint Byrum
Excerpts from Martinx - ジェームズ's message of 2014-08-16 12:03:20 -0700:
> Hey Stackers,
> 
>  I'm wondering here... Samba4 is pretty solid (upcoming 4.2 rocks), I'm
> using it on a daily basis as an AD DC controller, for both Windows and
> Linux Instances! With replication, file system ACLs - cifs, built-in LDAP,
> dynamic DNS with Bind9 as a backend (no netbios) and etc... Pretty cool!
> 
>  In OpenStack ecosystem, there are awesome solutions like Trove, Solum,
> Designate and etc... Amazing times BTW! So, why not try to integrate
> Samba4, working as an AD DC, within OpenStack itself?!
> 

But, if we did that, what would be left for us to reinvent in our own
slightly different way?



Re: [openstack-dev] [all] The future of the integrated release

2014-08-17 Thread Clint Byrum
Here's why folk are questioning Ceilometer:

Nova is a set of tools to abstract virtualization implementations.
Neutron is a set of tools to abstract SDN/NFV implementations.
Cinder is a set of tools to abstract block-device implementations.
Trove is a set of tools to simplify consumption of existing databases.
Sahara is a set of tools to simplify Hadoop consumption.
Swift is a feature-complete implementation of object storage, none of
which existed when it was started.
Keystone supports all of the above, unifying their auth.
Horizon supports all of the above, unifying their GUI.

Ceilometer is a complete implementation of data collection and alerting.
There is no shortage of implementations that exist already.

I'm also core on two projects that are getting some push back these
days:

Heat is a complete implementation of orchestration. There are at least a
few of these already in existence, though not as many as there are data
collection and alerting systems.

TripleO is an attempt to deploy OpenStack using tools that OpenStack
provides. There are already quite a few other tools that _can_ deploy
OpenStack, so it stands to reason that people will question why we
don't just use those. It is my hope we'll push more into the "unifying
the implementations" space and withdraw a bit from the "implementing
stuff" space.

So, you see, people are happy to unify around a single abstraction, but
not so much around a brand new implementation of things that already
exist.

Excerpts from Nadya Privalova's message of 2014-08-17 11:11:34 -0700:
> Hello all,
> 
> As a Ceilometer's core, I'd like to add my 0.02$.
> 
> During previous discussions it was mentioned several projects which were
> started or continue to be developed after Ceilometer became integrated. The
> main question I'm thinking of is why it was impossible to contribute into
> existing integrated project? Is it because of Ceilometer's architecture,
> the team or there are some other (maybe political) reasons? I think it's a
> very sad situation when we have 3-4 Ceilometer-like projects from different
> companies instead of the only one that satisfies everybody. (We don't see
> it in other projects. Though, maybe there are several Novas or Neutrons on
> StackForge and I don't know about it...)
> Of course, sometimes it's much easier to start the project from scratch.
> But there should be strong reasons for doing this if we are talking about
> integrated project.
> IMHO the idea, the role is the most important thing when we are talking
> about integrated project. And if Ceilometer's role is really needed (and I
> think it is) then we should improve existing implementation, "merge" all
> needs into the one project and the result will be still Ceilometer.
> 
> Thanks,
> Nadya
> 
> On Fri, Aug 15, 2014 at 12:41 AM, Joe Gordon  wrote:
> 
> >
> >
> >
> > On Wed, Aug 13, 2014 at 12:24 PM, Doug Hellmann 
> > wrote:
> >
> >>
> >> On Aug 13, 2014, at 3:05 PM, Eoghan Glynn  wrote:
> >>
> >> >
> >> >>> At the end of the day, that's probably going to mean saying No to more
> >> >>> things. Everytime I turn around everyone wants the TC to say No to
> >> >>> things, just not to their particular thing. :) Which is human nature.
> >> >>> But I think if we don't start saying No to more things we're going to
> >> >>> end up with a pile of mud that no one is happy with.
> >> >>
> >> >> That we're being so abstract about all of this is frustrating. I get
> >> >> that no-one wants to start a flamewar, but can someone be concrete
> >> about
> >> >> what they feel we should say 'no' to but are likely to say 'yes' to?
> >> >>
> >> >>
> >> >> I'll bite, but please note this is a strawman.
> >> >>
> >> >> No:
> >> >> * Accepting any more projects into incubation until we are comfortable
> >> with
> >> >> the state of things again
> >> >> * Marconi
> >> >> * Ceilometer
> >> >
> >> > Well -1 to that, obviously, from me.
> >> >
> >> > Ceilometer is on track to fully execute on the gap analysis coverage
> >> > plan agreed with the TC at the outset of this cycle, and has an active
> >> > plan in progress to address architectural debt.
> >>
> >> Yes, there seems to be an attitude among several people in the community
> >> that the Ceilometer team denies that there are issues and refuses to work
> >> on them. Neither of those things is the case from our perspective.
> >>
> >
> > Totally agree.
> >
> >
> >>
> >> Can you be more specific about the shortcomings you see in the project
> >> that aren’t being addressed?
> >>
> >
> >
> > Once again, this is just a strawman.
> >
> > I'm just not sure OpenStack has 'blessed' the best solution out there.
> >
> >
> > https://wiki.openstack.org/wiki/Ceilometer/Graduation#Why_we_think_we.27re_ready
> >
> > "
> >
> >- Successfully passed the challenge of being adopted by 3 related
> >projects which have agreed to join or use ceilometer:
> >   - Synaps
> >   - Healthnmon
> >   - StackTach
> >   
> > 

Re: [openstack-dev] [all] The future of the integrated release

2014-08-20 Thread Clint Byrum
Excerpts from Robert Collins's message of 2014-08-18 23:41:20 -0700:
> On 18 August 2014 09:32, Clint Byrum  wrote:
> 
> I can see your perspective but I don't think its internally consistent...
> 
> > Here's why folk are questioning Ceilometer:
> >
> > Nova is a set of tools to abstract virtualization implementations.
> 
> With a big chunk of local things - local image storage (now in
> glance), scheduling, rebalancing, ACLs and quotas. Other
> implementations that abstract over VM's at various layers already
> existed when Nova started - some bad ( some very bad!) and others
> actually quite ok.
> 

The fact that we have local implementations of domain specific things is
irrelevant to the difference I'm trying to point out. Glance needs to
work with the same authentication semantics and share a common access
catalog to work well with Nova. It's unlikely there's a generic image
catalog that would ever fit this bill. In many ways glance is just an
abstraction of file storage backends and a database to track a certain
domain of files (images, and soon, templates and other such things).

The point of mentioning Nova is, we didn't write libvirt, or xen, we
wrote an abstraction so that users could consume them via a REST API
that shares these useful automated backends like glance.

> > Neutron is a set of tools to abstract SDN/NFV implementations.
> 
> And implements a DHCP service, DNS service, overlay networking : its
> much more than an abstraction-over-other-implementations.
> 

Native DHCP and overlay? Last I checked Neutron used dnsmasq and
openvswitch, but it has been a few months, and I know that is an eon in
OpenStack time.

> > Cinder is a set of tools to abstract block-device implementations.
> > Trove is a set of tools to simplify consumption of existing databases.
> > Sahara is a set of tools to simplify Hadoop consumption.
> > Swift is a feature-complete implementation of object storage, none of
> > which existed when it was started.
> 
> Swift was started in 2009; Eucalyptus goes back to 2007, with Walrus
> part of that - I haven't checked precise dates, but I'm pretty sure
> that it existed and was usable by the start of 2009. There may well be
> other object storage implementations too - I simply haven't checked.
> 

Indeed, and MogileFS was sort of like Swift but not HTTP based. Perhaps
Walrus was evaluated and inadequate for the CloudFiles product
requirements? I don't know. But there weren't de-facto object stores
at the time because object stores were just becoming popular.

> > Keystone supports all of the above, unifying their auth.
> 
> And implementing an IdP (which I know they want to stop doing ;)). And
> in fact lots of OpenStack projects, for various reasons support *not*
> using Keystone (something that bugs me, but thats a different
> discussion).
> 

My point was that it is justified to have a whole implementation and not
just an abstraction, because it is meant to enable the ecosystem, not _be_
the ecosystem. I actually think Keystone is problematic too, and I often
wonder why we haven't just done OAuth, but I'm not trying to throw every
project under the bus. I'm trying to state that we accept Keystone because
it has grown organically to support the needs of all the other pieces.

> > Horizon supports all of the above, unifying their GUI.
> >
> > Ceilometer is a complete implementation of data collection and alerting.
> > There is no shortage of implementations that exist already.
> >
> > I'm also core on two projects that are getting some push back these
> > days:
> >
> > Heat is a complete implementation of orchestration. There are at least a
> > few of these already in existence, though not as many as there are data
> > collection and alerting systems.
> >
> > TripleO is an attempt to deploy OpenStack using tools that OpenStack
> > provides. There are already quite a few other tools that _can_ deploy
> > OpenStack, so it stands to reason that people will question why we
> > don't just use those. It is my hope we'll push more into the "unifying
> > the implementations" space and withdraw a bit from the "implementing
> > stuff" space.
> >
> > So, you see, people are happy to unify around a single abstraction, but
> > not so much around a brand new implementation of things that already
> > exist.
> 
> If the other examples we had were a lot purer, this explanation would
> make sense. I think there's more to it than that though :).
> 

If purity is required to show a difference, then I don't think I know
how to demonstrate what I think is obvious to most of us: Ceilometer
is an end 

Re: [openstack-dev] [all] The future of the integrated release

2014-08-20 Thread Clint Byrum
Excerpts from Jay Pipes's message of 2014-08-20 14:53:22 -0700:
> On 08/20/2014 05:06 PM, Chris Friesen wrote:
> > On 08/20/2014 07:21 AM, Jay Pipes wrote:
> >> Hi Thierry, thanks for the reply. Comments inline. :)
> >>
> >> On 08/20/2014 06:32 AM, Thierry Carrez wrote:
> >>> If we want to follow your model, we probably would have to dissolve
> >>> programs as they stand right now, and have blessed categories on one
> >>> side, and teams on the other (with projects from some teams being
> >>> blessed as the current solution).
> >>
> >> Why do we have to have "blessed" categories at all? I'd like to think of
> >> a day when the TC isn't picking winners or losers at all. Level the
> >> playing field and let the quality of the projects themselves determine
> >> the winner in the space. Stop the incubation and graduation madness and
> >> change the role of the TC to instead play an advisory role to upcoming
> >> (and existing!) projects on the best ways to integrate with other
> >> OpenStack projects, if integration is something that is natural for the
> >> project to work towards.
> >
> > It seems to me that at some point you need to have a recommended way of
> > doing things, otherwise it's going to be *really hard* for someone to
> > bring up an OpenStack installation.
> 
> Why can't there be multiple recommended ways of setting up an OpenStack 
> installation? Matter of fact, in reality, there already are multiple 
> recommended ways of setting up an OpenStack installation, aren't there?
> 
> There's multiple distributions of OpenStack, multiple ways of doing 
> bare-metal deployment, multiple ways of deploying different message 
> queues and DBs, multiple ways of establishing networking, multiple open 
> and proprietary monitoring systems to choose from, etc. And I don't 
> really see anything wrong with that.
> 

This is an argument for loosely coupling things, rather than tightly
integrating things. You will almost always win my vote with that sort of
movement, and you have here. +1.

> > We already run into issues with something as basic as competing SQL
> > databases.
> 
> If the TC suddenly said "Only MySQL will be supported", that would not 
> mean that the greater OpenStack community would be served better. It 
> would just unnecessarily take options away from deployers.
> 

This is really where "supported" becomes the mutex binding us all. The
more "supported" options, the larger the matrix, the more complex a
user's decision process becomes.

>  > If every component has several competing implementations and
> > none of them are "official" how many more interaction issues are going
> > to trip us up?
> 
> IMO, OpenStack should be about choice. Choice of hypervisor, choice of 
> DB and MQ infrastructure, choice of operating systems, choice of storage 
> vendors, choice of networking vendors.
> 

Err, uh. I think OpenStack should be about users. If having 400 choices
means users are just confused, then OpenStack becomes nothing and
everything all at once. A choice should become part of the whole not when
1% of the market wants it, but when 20%+ of the market _requires_ it.

What we shouldn't do is harm that 1%'s ability to be successful. We should
foster it and help it grow, but we shouldn't just pull it into the program
and say "You're ALSO in OpenStack now!", and we also shouldn't force those
users to make a hard choice because the better solution is not blessed.

> If there are multiple actively-developed projects that address the same 
> problem space, I think it serves our OpenStack users best to let the 
> projects work things out themselves and let the cream rise to the top. 
> If the cream ends up being one of those projects, so be it. If the cream 
> ends up being a mix of both projects, so be it. The production community 
> will end up determining what that cream should be based on what it 
> deploys into its clouds and what input it supplies to the teams working 
> on competing implementations.
> 

I'm really not a fan of making it a competitive market. If a space has a
diverse set of problems, we can expect it will have a diverse set of
solutions that overlap. But that doesn't mean they all need to drive
toward making that overlap all-encompassing. Sometimes that happens and
it is good, and sometimes that happens and it causes horrible bloat.

> And who knows... what works or is recommended by one deployer may not be 
> what is best for another type of deployer and I believe we (the 
> TC/governance) do a disservice to our user community by picking a winner 
> in a space too early (or continuing to pick a winner in a clearly 
> unsettled space).
> 

Right, I think our current situation crowds out diversity, when what we
want to do is enable it, without confusing the users.



Re: [openstack-dev] [all] The future of the integrated release

2014-08-21 Thread Clint Byrum
Excerpts from Duncan Thomas's message of 2014-08-21 09:21:06 -0700:
> On 21 August 2014 14:27, Jay Pipes  wrote:
> 
> > Specifically for Triple-O, by making the Deployment program == Triple-O, the
> > TC has picked the disk-image-based deployment of an undercloud design as The
> > OpenStack Way of Deployment. And as I've said previously in this thread, I
> > believe that the deployment space is similarly unsettled, and that it would
> > be more appropriate to let the Chef cookbooks and Puppet modules currently
> > sitting in the stackforge/ code namespace live in the openstack/ code
> > namespace.
> 
> Totally agree with Jay here, I know people who gave up on trying to
> get any official project around deployment because they were told they
> had to do it under the TripleO umbrella
> 

This was why the _program_ versus _project_ distinction was made. But
I think we ended up being 1:1 anyway.

Perhaps the deployment program's mission statement is too narrow, and
we should iterate on that. That others took their ball and went home,
instead of asking for a review of that ruling, is a bit disconcerting.

That probably strikes to the heart of the current crisis. If we were
being reasonable, alternatives to an official OpenStack program's mission
statement would be debated and considered thoughtfully. I know I made the
mistake early on of pushing the narrow _TripleO_ vision into what should
have been a much broader "Deployment" program. I'm not entirely sure why
that seemed o-k to me at the time, or why it was allowed to continue, but
I think it may be a good exercise to review those events and try to come
up with a few theories or even conclusions as to what we could do better.



Re: [openstack-dev] [all] The future of the integrated release

2014-08-21 Thread Clint Byrum
Excerpts from David Kranz's message of 2014-08-21 12:45:05 -0700:
> On 08/21/2014 02:39 PM, gordon chung wrote:
> > > The point I've been making is
> > > that by the TC continuing to bless only the Ceilometer project as the
> > > OpenStack Way of Metering, I think we do a disservice to our users by
> > > picking a winner in a space that is clearly still unsettled.
> >
> > can we avoid using the word 'blessed' -- it's extremely vague and 
> > seems controversial. from what i know, no one is being told project 
> > x's services are the be all end all and based on experience, companies 
> > (should) know this. i've worked with other alternatives even though i 
> > contribute to ceilometer.
> > > Totally agree with Jay here, I know people who gave up on trying to
> > > get any official project around deployment because they were told they
> > > had to do it under the TripleO umbrella
> > from the pov of a project that seems to be brought up constantly and 
> > maybe it's my naivety, i don't really understand the fascination with 
> > branding and the stigma people have placed on 
> > non-'openstack'/stackforge projects. it can't be a legal thing because 
> > i've gone through that potential mess. also, it's just as easy to 
> > contribute to 'non-openstack' projects as 'openstack' projects (even 
> > easier if we're honest).
> Yes, we should be honest. The "even easier" part is what Sandy cited as 
> the primary motivation for pursuing stacktach instead of ceilometer.
> 
> I think we need to consider the difference between why OpenStack wants 
> to bless a project, and why a project might want to be blessed by 
> OpenStack. Many folks believe that for OpenStack to be successful it 
> needs to present itself as a stack that can be tested and deployed, not 
> a sack of parts that only the most extremely clever people can manage to 
> assemble into an actual cloud. In order to have such a stack, some code 
> (or, alternatively, dare I say API...) needs to be blessed. Reasonable 
> debates will continue about which pieces are essential to this stack, 
> and which should be left to deployers, but metering was seen as such a 
> component and therefore something needed to be blessed. The hope was 
> that every one would jump on that and make it great but it seems that 
> didn't quite happen (at least yet).
> 
> Though Open Source has many advantages over proprietary development, the 
> ability to choose a direction and marshal resources for efficient 
> delivery is the biggest advantage of proprietary development like what 
> AWS does. The TC process of blessing is, IMO, an attempt to compensate 
> for that in an OpenSource project. Of course if the wrong code is 
> blessed, the negative  impact can be significant. Blessing APIs would be 

Hm, I wonder if the only difference there is when AWS blesses the wrong
thing, they evaluate the business impact, and respond by going in a
different direction, all behind closed doors. The shame is limited to
that inner circle.

Here, with full transparency, calling something "the wrong thing" is
pretty much public humiliation for the team involved.

So it stands to reason that we shouldn't call something "the right
thing" if we aren't comfortable with the potential public shaming.



Re: [openstack-dev] [Glance][Heat] Murano split discussion

2014-08-21 Thread Clint Byrum
Excerpts from Georgy Okrokvertskhov's message of 2014-08-20 13:14:28 -0700:
> During last Atlanta summit there were couple discussions about Application
> Catalog and Application space projects in OpenStack. These cross-project
> discussions occurred as a result of Murano incubation request [1] during
> Icehouse cycle.  On the TC meeting devoted to Murano incubation there was
> an idea about splitting the Murano into parts which might belong to
> different programs[2].
> 
> 
> Today, I would like to initiate a discussion about potential splitting of
> Murano between two or three programs.
> 
> 
> *App Catalog API to Catalog Program*
> 
> Application Catalog part can belong to Catalog program, the package
> repository will move to artifacts repository part where Murano team already
> participates. API part of App Catalog will add a thin layer of API methods
> specific to Murano applications and potentially can be implemented as a
> plugin to artifacts repository. Also this API layer will expose other 3rd
> party systems API like CloudFoundry ServiceBroker API which is used by
> CloudFoundry marketplace feature to provide an integration layer between
> OpenStack Application packages and 3rd party PaaS tools.
> 
> 

I thought this was basically already agreed upon, and that Glance was
just growing the ability to store more than just images.

> 
> *Murano Engine to Orchestration Program*
> 
> Murano engine orchestrates the Heat template generation. Complementary to a
> Heat declarative approach, Murano engine uses imperative approach so that
> it is possible to control the whole flow of the template generation. The
> engine uses Heat updates to update Heat templates to reflect changes in
> applications layout. Murano engine has a concept of actions - special flows
> which can be called at any time after application deployment to change
> application parameters or update stacks. The engine is actually
> complementary to Heat engine and adds the following:
> 
> 
>- orchestrate multiple Heat stacks - DR deployments, HA setups, multiple
>datacenters deployment

These sound like features already requested directly in Heat.

>- Initiate and controls stack updates on application specific events

Sounds like workflow. :)

>- Error handling and self-healing - being imperative Murano allows you
>to handle issues and implement additional logic around error handling and
>self-healing.

Also sounds like workflow.

> 


I think we need to re-think what a program is before we consider this.

I really don't know much about Murano. I have no interest in it at
all, and nobody has come to me saying "If we only had Murano in our
orchestration toolbox, we'd solve xxx." But making them part of the
Orchestration program would imply that we'll do design sessions together,
that we'll share the same mission statement, and that we'll have just
one PTL. I fail to see why they're not another, higher level program
that builds on top of the other services.

> 
> 
> *Murano UI to Dashboard Program*
> 
> Application Catalog requires  a UI focused on user experience. Currently
> there is a Horizon plugin for Murano App Catalog which adds Application
> catalog page to browse, search and filter applications. It also adds a
> dynamic UI functionality to render a Horizon forms without writing an
> actual code.
> 
> 

I feel like putting all the UI plugins in Horizon is the same mistake
as putting all of the functional tests in Tempest. It doesn't have the
effect of breaking the gate, but it probably puts a lot of burden on a
single team.



Re: [openstack-dev] [Glance][Heat] Murano split discussion

2014-08-21 Thread Clint Byrum
Excerpts from Angus Salkeld's message of 2014-08-21 20:14:12 -0700:
> On Fri, Aug 22, 2014 at 12:34 PM, Clint Byrum  wrote:
> 
> > Excerpts from Georgy Okrokvertskhov's message of 2014-08-20 13:14:28 -0700:
> > > During last Atlanta summit there were couple discussions about
> > Application
> > > Catalog and Application space projects in OpenStack. These cross-project
> > > discussions occurred as a result of Murano incubation request [1] during
> > > Icehouse cycle.  On the TC meeting devoted to Murano incubation there was
> > > an idea about splitting the Murano into parts which might belong to
> > > different programs[2].
> > >
> > >
> > > Today, I would like to initiate a discussion about potential splitting of
> > > Murano between two or three programs.
> > >
> > >
> > > *App Catalog API to Catalog Program*
> > >
> > > Application Catalog part can belong to Catalog program, the package
> > > repository will move to artifacts repository part where Murano team
> > already
> > > participates. API part of App Catalog will add a thin layer of API
> > methods
> > > specific to Murano applications and potentially can be implemented as a
> > > plugin to artifacts repository. Also this API layer will expose other 3rd
> > > party systems API like CloudFoundry ServiceBroker API which is used by
> > > CloudFoundry marketplace feature to provide an integration layer between
> > > OpenStack Application packages and 3rd party PaaS tools.
> > >
> > >
> >
> > I thought this was basically already agreed upon, and that Glance was
> > just growing the ability to store more than just images.
> >
> > >
> > > *Murano Engine to Orchestration Program*
> > >
> > > Murano engine orchestrates the Heat template generation. Complementary
> > to a
> > > Heat declarative approach, Murano engine uses imperative approach so that
> > > it is possible to control the whole flow of the template generation. The
> > > engine uses Heat updates to update Heat templates to reflect changes in
> > > applications layout. Murano engine has a concept of actions - special
> > flows
> > > which can be called at any time after application deployment to change
> > > application parameters or update stacks. The engine is actually
> > > complementary to Heat engine and adds the following:
> > >
> > >
> > >- orchestrate multiple Heat stacks - DR deployments, HA setups,
> > multiple
> > >datacenters deployment
> >
> > These sound like features already requested directly in Heat.
> >
> > >- Initiate and controls stack updates on application specific events
> >
> > Sounds like workflow. :)
> >
> > >- Error handling and self-healing - being imperative Murano allows you
> > >to handle issues and implement additional logic around error handling
> > and
> > >self-healing.
> >
> > Also sounds like workflow.
> >
> > >
> >
> >
> > I think we need to re-think what a program is before we consider this.
> >
> > I really don't know much about Murano. I have no interest in it at
> >
> 
> "get off my lawn";)
> 

And turn down that music!

Sorry for the fist shaking, but I want to highlight that I'm happy to
consider it, just not with programs working the way they do now.

> http://stackalytics.com/?project_type=all&module=murano-group
> 
> HP seems to be involved, you should check it out.
> 

HP is involved in a lot of OpenStack things. It's a bit hard for me to
keep my eyes on everything we do. Good to know that others have been able
to take some time and buy into it a bit. +1 for distributing the load. :)

> > all, and nobody has come to me saying "If we only had Murano in our
> > orchestration toolbox, we'd solve xxx." But making them part of the
> >
> 
> I thought you were saying that opsworks was neat the other day?
> Murano from what I understand was partly inspired from opsworks, yes
> it's a layer up, but still really the same field.
>

I was saying that OpsWorks is reportedly popular, yes. I did not make
the connection at all from OpsWorks to Murano, and nobody had pointed
that out to me until now.

> > Orchestration program would imply that we'll do design sessions together,
> > that we'll share the same mission statement, and that we'll have just
> >
> 
> This is exactly what I hope will happen.
> 

Which sessions

Re: [openstack-dev] [all] The future of the integrated release

2014-08-22 Thread Clint Byrum
Excerpts from Michael Chapman's message of 2014-08-21 23:30:44 -0700:
> On Fri, Aug 22, 2014 at 2:57 AM, Jay Pipes  wrote:
> 
> > On 08/19/2014 11:28 PM, Robert Collins wrote:
> >
> >> On 20 August 2014 02:37, Jay Pipes  wrote:
> >> ...
> >>
> >>  I'd like to see more unification of implementations in TripleO - but I
>  still believe our basic principle of using OpenStack technologies that
>  already exist in preference to third party ones is still sound, and
>  offers substantial dogfood and virtuous circle benefits.
> 
> >>>
> >>>
> >>> No doubt Triple-O serves a valuable dogfood and virtuous cycle purpose.
> >>> However, I would move that the Deployment Program should welcome the many
> >>> projects currently in the stackforge/ code namespace that do deployment
> >>> of
> >>> OpenStack using traditional configuration management tools like Chef,
> >>> Puppet, and Ansible. It cannot be argued that these configuration
> >>> management
> >>> systems are the de-facto way that OpenStack is deployed outside of HP,
> >>> and
> >>> they belong in the Deployment Program, IMO.
> >>>
> >>
> >> I think you mean it 'can be argued'... ;).
> >>
> >
> > No, I definitely mean "cannot be argued" :) HP is the only company I know
> > of that is deploying OpenStack using Triple-O. The vast majority of
> > deployers I know of are deploying OpenStack using configuration management
> > platforms and various systems or glue code for baremetal provisioning.
> >
> > Note that I am not saying that Triple-O is bad in any way! I'm only saying
> > that it does not represent the way that the majority of real-world
> > deployments are done.
> >
> >
> > > And I'd be happy if folk in
> >
> >> those communities want to join in the deployment program and have code
> >> repositories in openstack/. To date, none have asked.
> >>
> >
> > My point in this thread has been and continues to be that by having the TC
> > "bless" a certain project as The OpenStack Way of X, that we implicitly are
> > saying to other valid alternatives "Sorry, no need to apply here.".
> >
> >
> >  As a TC member, I would welcome someone from the Chef community proposing
> >>> the Chef cookbooks for inclusion in the Deployment program, to live under
> >>> the openstack/ code namespace. Same for the Puppet modules.
> >>>
> >>
> > While you may personally welcome the Chef community to propose joining the
> > deployment Program and living under the openstack/ code namespace, I'm just
> > saying that the impression our governance model and policies create is one
> > of exclusion, not inclusion. Hope that clarifies better what I've been
> > getting at.
> >
> >
> 
> (As one of the core reviewers for the Puppet modules)
> 
> Without a standardised package build process it's quite difficult to test
> trunk Puppet modules vs trunk official projects. This means we cut release
> branches some time after the projects themselves to give people a chance to
> test. Until this changes and the modules can be released with the same
> cadence as the integrated release I believe they should remain on
> Stackforge.
> 

Seems like the distros that build the packages are all doing lots of
daily-build type stuff that could somehow be leveraged to get over that.



Re: [openstack-dev] [all] The future of the integrated release

2014-08-22 Thread Clint Byrum
Excerpts from Sean Dague's message of 2014-08-22 04:51:49 -0700:
> On 08/22/2014 01:30 AM, Michael Chapman wrote:
> > 
> > 
> > 
> > On Fri, Aug 22, 2014 at 2:57 AM, Jay Pipes  > > wrote:
> > 
> > On 08/19/2014 11:28 PM, Robert Collins wrote:
> > 
> > On 20 August 2014 02:37, Jay Pipes  > > wrote:
> > ...
> > 
> > I'd like to see more unification of implementations in
> > TripleO - but I
> > still believe our basic principle of using OpenStack
> > technologies that
> > already exist in preference to third party ones is still
> > sound, and
> > offers substantial dogfood and virtuous circle benefits.
> > 
> > 
> > 
> > No doubt Triple-O serves a valuable dogfood and virtuous
> > cycle purpose.
> > However, I would move that the Deployment Program should
> > welcome the many
> > projects currently in the stackforge/ code namespace that do
> > deployment of
> > OpenStack using traditional configuration management tools
> > like Chef,
> > Puppet, and Ansible. It cannot be argued that these
> > configuration management
> > systems are the de-facto way that OpenStack is deployed
> > outside of HP, and
> > they belong in the Deployment Program, IMO.
> > 
> > 
> > I think you mean it 'can be argued'... ;).
> > 
> > 
> > No, I definitely mean "cannot be argued" :) HP is the only company I
> > know of that is deploying OpenStack using Triple-O. The vast
> > majority of deployers I know of are deploying OpenStack using
> > configuration management platforms and various systems or glue code
> > for baremetal provisioning.
> > 
> > Note that I am not saying that Triple-O is bad in any way! I'm only
> > saying that it does not represent the way that the majority of
> > real-world deployments are done.
> > 
> > 
> > > And I'd be happy if folk in
> > 
> > those communities want to join in the deployment program and
> > have code
> > repositories in openstack/. To date, none have asked.
> > 
> > 
> > My point in this thread has been and continues to be that by having
> > the TC "bless" a certain project as The OpenStack Way of X, that we
> > implicitly are saying to other valid alternatives "Sorry, no need to
> > apply here.".
> > 
> > 
> > As a TC member, I would welcome someone from the Chef
> > community proposing
> > the Chef cookbooks for inclusion in the Deployment program,
> > to live under
> > the openstack/ code namespace. Same for the Puppet modules.
> > 
> > 
> > While you may personally welcome the Chef community to propose
> > joining the deployment Program and living under the openstack/ code
> > namespace, I'm just saying that the impression our governance model
> > and policies create is one of exclusion, not inclusion. Hope that
> > clarifies better what I've been getting at.
> > 
> > 
> > 
> > (As one of the core reviewers for the Puppet modules)
> > 
> > Without a standardised package build process it's quite difficult to
> > test trunk Puppet modules vs trunk official projects. This means we cut
> > release branches some time after the projects themselves to give people
> > a chance to test. Until this changes and the modules can be released
> > with the same cadence as the integrated release I believe they should
> > remain on Stackforge.
> > 
> > In addition and perhaps as a consequence, there isn't any public
> > integration testing at this time for the modules, although I know some
> > parties have developed and maintain their own.
> > 
> > The Chef modules may be in a different state, but it's hard for me to
> > recommend the Puppet modules become part of an official program at this
> > stage.
> 
> Is the focus of the Puppet modules only stable releases with packages?
> Puppet + git based deploys would be honestly a really handy thing
> (especially as lots of people end up having custom fixes for their
> site). The lack of CM tools for git based deploys is I think one of the
> reasons we see people using DevStack as a generic installer.
> 

We have quite a bit of stuff built up in tripleo-image-elements to build
what amounts to a lightweight package inside of an image. I do kind of
wonder if we've abstracted enough to be able to build system packages
out of what we have created.

Probably not, but it might be worth looking at, as we are definitely
built for pointing at git and even pulling in specific refs.



[openstack-dev] [Ironic] [TripleO] How to gracefully quiesce a box?

2014-08-22 Thread Clint Byrum
It has been brought to my attention that Ironic uses the biggest hammer
in the IPMI toolbox to control chassis power:

https://git.openstack.org/cgit/openstack/ironic/tree/ironic/drivers/modules/ipminative.py#n142

Which is

ret = ipmicmd.set_power('off', wait)

This is the most abrupt form, where the system power should be flipped
off at a hardware level. The "short press" on the power button would be
'shutdown' instead of 'off'.

I also understand that this has been brought up before, and that the
answer given was "SSH in and shut it down yourself." I can respect that
position, but I have run into a bit of a pickle using it. Observe:

- ssh box.ip "poweroff"
- poll ironic until power state is off.
  - This is a race. Ironic is asserting the power. As soon as it sees
that the power is off, it will turn it back on.

- ssh box.ip "halt"
  - NO way to know that this has worked. Once SSH is off and the network
stack is gone, I cannot actually verify that the disks were
unmounted properly, which is the primary area of concern that I
have.

This is particularly important if I'm issuing a rebuild + preserve
ephemeral, as it is likely I will have lots of I/O going on, and I want
to make sure that it is all quiesced before I reboot to replace the
software.

Perhaps I missed something. If so, please do educate me on how I can
achieve this without hacking around it. Currently my workaround is to
manually unmount the state partition, which is something system shutdown
is supposed to do and may become problematic if system processes are
holding it open.

It seems to me that Ironic should at least try to use the graceful
shutdown. There can be a timeout, but it would need to be something a user
can disable, so that if graceful shutdown never works we don't just cut
the power on the box. Even a journaled filesystem will take quite a while
to do a full fsck.
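
Something like this is roughly what I have in mind (an untested sketch
using pyghmi the way the ipminative driver already does; the timeout and
the option name are made up):

    import time

    def soft_power_off(ipmicmd, timeout=300, hard_off_allowed=True):
        # Ask for an orderly shutdown first (the ACPI "short press").
        ipmicmd.set_power('shutdown', False)
        deadline = time.time() + timeout
        while time.time() < deadline:
            if ipmicmd.get_power().get('powerstate') == 'off':
                return {'powerstate': 'off'}
            time.sleep(5)
        if hard_off_allowed:
            # Operator has allowed the big hammer as a fallback.
            return ipmicmd.set_power('off', True)
        # Otherwise surface an error state rather than yanking the power.
        raise RuntimeError('node did not power off within %ds' % timeout)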

The inability to gracefully shutdown in a reasonable amount of time
is an error state really, and I need to go to the box and inspect it,
which is precisely the reason we have ERROR states.

Thanks for your time. :)



Re: [openstack-dev] [Ironic] [TripleO] How to gracefully quiesce a box?

2014-08-22 Thread Clint Byrum
Excerpts from Jay Pipes's message of 2014-08-22 11:16:05 -0700:
> On 08/22/2014 01:48 PM, Clint Byrum wrote:
> > It has been brought to my attention that Ironic uses the biggest hammer
> > in the IPMI toolbox to control chassis power:
> >
> > https://git.openstack.org/cgit/openstack/ironic/tree/ironic/drivers/modules/ipminative.py#n142
> >
> > Which is
> >
> >  ret = ipmicmd.set_power('off', wait)
> >
> > This is the most abrupt form, where the system power should be flipped
> > off at a hardware level. The "short press" on the power button would be
> > 'shutdown' instead of 'off'.
> >
> > I also understand that this has been brought up before, and that the
> > answer given was "SSH in and shut it down yourself." I can respect that
> > position, but I have run into a bit of a pickle using it. Observe:
> >
> > - ssh box.ip "poweroff"
> > - poll ironic until power state is off.
> >- This is a race. Ironic is asserting the power. As soon as it sees
> >  that the power is off, it will turn it back on.
> >
> > - ssh box.ip "halt"
> >- NO way to know that this has worked. Once SSH is off and the network
> >  stack is gone, I cannot actually verify that the disks were
> >  unmounted properly, which is the primary area of concern that I
> >  have.
> >
> > This is particularly important if I'm issuing a rebuild + preserve
> > ephemeral, as it is likely I will have lots of I/O going on, and I want
> > to make sure that it is all quiesced before I reboot to replace the
> > software.
> >
> > Perhaps I missed something. If so, please do educate me on how I can
> > achieve this without hacking around it. Currently my workaround is to
> > manually unmount the state partition, which is something system shutdown
> > is supposed to do and may become problematic if system processes are
> > holding it open.
> >
> > It seems to me that Ironic should at least try to use the graceful
> > shutdown. There can be a timeout, but it would need to be something a user
> > can disable, so that if graceful shutdown never works we don't just cut
> > the power on the box. Even a journaled filesystem will take quite a while
> > to do a full fsck.
> >
> > The inability to gracefully shutdown in a reasonable amount of time
> > is an error state really, and I need to go to the box and inspect it,
> > which is precisely the reason we have ERROR states.
> 
> What about placing a runlevel script in /etc/init.d/ and symlinking it 
> to run on shutdown -- i.e. /etc/rc0.d/? You could run fsync or unmount 
> the state partition in that script which would ensure disk state was 
> quiesced, no?

That's already what OSes do in their rc0.d.

My point is, I don't have any way to know that process happened, without
the box turning itself off after it succeeded.



Re: [openstack-dev] [Keystone][Marconi][Heat] Creating accounts in Keystone

2014-08-22 Thread Clint Byrum
I don't know how Zaqar does its magic, but I'd love to see simple signed
URLs rather than users/passwords. This would work for Heat as well. That
way we only have to pass in a single predictably formatted string.
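
For example, the Swift tempurl scheme boils down to something like this
(just a sketch; the key and the object path are made up):

    import hmac
    from hashlib import sha1
    from time import time

    key = 'per-stack-secret'                       # made-up shared secret
    method = 'GET'
    expires = int(time() + 3600)
    path = '/v1/AUTH_tenant/metadata/server-1234'  # made-up object path
    sig = hmac.new(key.encode(),
                   ('%s\n%s\n%s' % (method, expires, path)).encode(),
                   sha1).hexdigest()
    url = 'https://swift.example.com%s?temp_url_sig=%s&temp_url_expires=%s' % (
        path, sig, expires)

Hand the instance that one URL and it can fetch or post its data until
the signature expires, with no token handling at all.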

Excerpts from Zane Bitter's message of 2014-08-22 14:35:38 -0700:
> Here's an interesting fact about Zaqar (the project formerly known as 
> Marconi) that I hadn't thought about before this week: it's probably the 
> first OpenStack project where a major part of the API primarily faces 
> software running in the cloud rather than facing the user.
> 
> That is to say, nobody is going to be sending themselves messages on 
> their laptop, from their laptop, via a cloud. At least one end of any 
> given queue is likely to be on a VM in the cloud.
> 
> That makes me wonder: how does Zaqar authenticate users who are sending 
> and receiving messages (as opposed to setting up the queues in the first 
> place)? Presumably using Keystone, in which case it will run into a 
> problem we've been struggling with in Heat since the very early days.
> 
> Keystone is generally a front end for an identity store with a 1:1 
> correspondence between users and actual natural persons. Only the 
> operator can add or remove accounts. This breaks down as soon as you 
> need to authenticate automated services running in the cloud - in 
> particular, you never ever want to store the credentials belonging to an 
> actual natural person in a server in the cloud.
> 
> Heat has managed to work around this to some extent (for those running 
> the Keystone v3 API) by creating users in a separate domain and more or 
> less doing our own authorisation for them. However, this requires action 
> on the part of the operator, and isn't an option for the end user. I 
> guess Zaqar could do something similar and pass out sets of credentials 
> good only for reading and writing to queues (respectively), but it seems 
> like it would be better if the user could create the keystone accounts 
> and set their own access control rules on the queues.
> 
> On AWS the very first thing a user does is create a bunch of IAM 
> accounts so that they virtually never have to use the credentials 
> associated with their natural person ever again. There are both user 
> accounts and service accounts - the latter IIUC have 
> automatically-rotating keys. Is there anything like this planned in 
> Keystone? Zaqar is likely only the first (I guess second, if you count 
> Heat) of many services that will need it.
> 
> I have this irrational fear that somebody is going to tell me that this 
> issue is the reason for the hierarchical-multitenancy idea - fear 
> because that both sounds like it requires intrusive changes in every 
> OpenStack project and fails to solve the problem. I hope somebody will 
> disabuse me of that notion in 3... 2... 1...
> 
> cheers,
> Zane.
> 



Re: [openstack-dev] [all] [ptls] The Czar system, or how to scale PTLs

2014-08-23 Thread Clint Byrum
Excerpts from Dolph Mathews's message of 2014-08-22 09:45:37 -0700:
> On Fri, Aug 22, 2014 at 11:32 AM, Zane Bitter  wrote:
> 
> > On 22/08/14 11:19, Thierry Carrez wrote:
> >
> >> Zane Bitter wrote:
> >>
> >>> On 22/08/14 08:33, Thierry Carrez wrote:
> >>>
>  We also
>  still need someone to have the final say in case of deadlocked issues.
> 
> >>>
> >>> -1 we really don't.
> >>>
> >>
> >> I know we disagree on that :)
> >>
> >
> > No problem, you and I work in different programs so we can both get our
> > way ;)
> >
> >
> >  People say we don't have that many deadlocks in OpenStack for which the
>  PTL ultimate power is needed, so we could get rid of them. I'd argue
>  that the main reason we don't have that many deadlocks in OpenStack is
>  precisely *because* we have a system to break them if they arise.
> 
> >>>
> >>> s/that many/any/ IME and I think that threatening to break a deadlock by
> >>> fiat is just as bad as actually doing it. And by 'bad' I mean
> >>> community-poisoningly, trust-destroyingly bad.
> >>>
> >>
> >> I guess I've been active in too many dysfunctional free and open source
> >> software projects -- I put a very high value on the ability to make a
> >> final decision. Not being able to make a decision is about as
> >> community-poisoning, and also results in inability to make any
> >> significant change or decision.
> >>
> >
> > I'm all for getting a final decision, but a 'final' decision that has been
> > imposed from outside rather than internalised by the participants is...
> > rarely final.
> >
> 
> The expectation of a PTL isn't to stomp around and make "final" decisions,
> it's to step in when necessary and help both sides find the best solution.
> To moderate.
> 

Have we had many instances where a project's community divided into
two camps and dug in to the point where they actually needed active
moderation? And in those cases, was the PTL not already on one side of
said argument? I'd prefer specific examples here.

> >
> > I have yet to see a deadlock in Heat that wasn't resolved by better
> > communication.
> 
> 
> Moderation == bettering communication. I'm under the impression that you
> and Thierry are agreeing here, just from opposite ends of the same spectrum.
> 

I agree as well. PTL is a servant of the community, as any good leader
is. If the PTL feels they have to drop the hammer, or if an impasse is
reached where they are asked to, it is because they have failed to get
everyone communicating effectively, not because "that's their job."



Re: [openstack-dev] [heat] "heat.conf.sample is not up to date"

2014-08-24 Thread Clint Byrum
Guessing this is due to the new tox feature which randomizes Python's
hash seed.
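
For example (a rough demonstration, nothing to do with the actual
generator code), the iteration order of a set of option names depends on
the seed, which is enough to shuffle the generated sample between runs:

    import subprocess
    import sys

    snippet = "print(list({'qpid_hostname', 'rabbit_host', 'rpc_backend'}))"
    for seed in ('1', '2', '3'):
        # Each child process hashes strings with a different seed, so the
        # set typically comes back in a different order.
        out = subprocess.check_output([sys.executable, '-c', snippet],
                                      env={'PYTHONHASHSEED': seed})
        print('seed', seed, '->', out.decode().strip())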

Excerpts from Mike Spreitzer's message of 2014-08-24 00:10:42 -0700:
> What is going on with this?  If I do a fresh clone of heat and run `tox 
> -epep8` then I get that complaint.  If I then run the recommended command 
> to fix it, and then `tox -epep8` again, I get the same complaint again --- 
> and with different differences exhibited!  The email below carries a 
> typescript showing this.
> 
> What I really need to know is what to do when committing a change that 
> really does require a change in the sample configuration file.  Of course 
> I tried running generate_sample.sh, but `tox -epep8` still complains. What 
> is the right procedure to get a correct sample committed?  BTW, I am doing 
> the following admittedly risky thing: I run DevStack, and make my changes 
> in /opt/stack/heat/.
> 
> Thanks,
> Mike
> 
> - Forwarded by Mike Spreitzer/Watson/IBM on 08/24/2014 03:03 AM -
> 
> From:   ubuntu@mjs-dstk-821a (Ubuntu)
> To: Mike Spreitzer/Watson/IBM@IBMUS, 
> Date:   08/24/2014 02:55 AM
> Subject: fresh flake fail
> 
> 
> 
> ubuntu@mjs-dstk-821a:~/code$ git clone 
> git://git.openstack.org/openstack/heat.git
> Cloning into 'heat'...
> remote: Counting objects: 49690, done.
> remote: Compressing objects: 100% (19765/19765), done.
> remote: Total 49690 (delta 36660), reused 39014 (delta 26526)
> Receiving objects: 100% (49690/49690), 7.92 MiB | 7.29 MiB/s, done.
> Resolving deltas: 100% (36660/36660), done.
> Checking connectivity... done.
> ubuntu@mjs-dstk-821a:~/code$ cd heat
> ubuntu@mjs-dstk-821a:~/code/heat$ tox -epep8
> pep8 create: /home/ubuntu/code/heat/.tox/pep8
> pep8 installdeps: -r/home/ubuntu/code/heat/requirements.txt, 
> -r/home/ubuntu/code/heat/test-requirements.txt
> pep8 develop-inst: /home/ubuntu/code/heat
> pep8 runtests: PYTHONHASHSEED='0'
> pep8 runtests: commands[0] | flake8 heat bin/heat-api bin/heat-api-cfn 
> bin/heat-api-cloudwatch bin/heat-engine bin/heat-manage contrib
> pep8 runtests: commands[1] | 
> /home/ubuntu/code/heat/tools/config/check_uptodate.sh
> --- /tmp/heat.ep2CBe/heat.conf.sample    2014-08-24 06:52:54.16484 +
> +++ etc/heat/heat.conf.sample    2014-08-24 06:48:13.66484 +
> @@ -164,7 +164,7 @@
>  
> #allowed_rpc_exception_modules=oslo.messaging.exceptions,nova.exception,cinder.exception,exceptions
>  
>  # Qpid broker hostname. (string value)
> -#qpid_hostname=heat
> +#qpid_hostname=localhost
>  
>  # Qpid broker port. (integer value)
>  #qpid_port=5672
> @@ -221,7 +221,7 @@
>  
>  # The RabbitMQ broker address where a single node is used.
>  # (string value)
> -#rabbit_host=heat
> +#rabbit_host=localhost
>  
>  # The RabbitMQ broker port where a single node is used.
>  # (integer value)
> check_uptodate.sh: heat.conf.sample is not up to date.
> check_uptodate.sh: Please run 
> /home/ubuntu/code/heat/tools/config/generate_sample.sh.
> ERROR: InvocationError: 
> '/home/ubuntu/code/heat/tools/config/check_uptodate.sh'
> pep8 runtests: commands[2] | 
> /home/ubuntu/code/heat/tools/requirements_style_check.sh requirements.txt 
> test-requirements.txt
> pep8 runtests: commands[3] | bash -c find heat -type f -regex '.*\.pot?' 
> -print0|xargs -0 -n 1 msgfmt --check-format -o /dev/null
> ___ summary 
> 
> ERROR:   pep8: commands failed
> ubuntu@mjs-dstk-821a:~/code/heat$ 
> ubuntu@mjs-dstk-821a:~/code/heat$ 
> ubuntu@mjs-dstk-821a:~/code/heat$ tools/config/generate_sample.sh
> ubuntu@mjs-dstk-821a:~/code/heat$ 
> ubuntu@mjs-dstk-821a:~/code/heat$ 
> ubuntu@mjs-dstk-821a:~/code/heat$ 
> ubuntu@mjs-dstk-821a:~/code/heat$ tox -epep8
> pep8 develop-inst-noop: /home/ubuntu/code/heat
> pep8 runtests: PYTHONHASHSEED='0'
> pep8 runtests: commands[0] | flake8 heat bin/heat-api bin/heat-api-cfn 
> bin/heat-api-cloudwatch bin/heat-engine bin/heat-manage contrib
> pep8 runtests: commands[1] | 
> /home/ubuntu/code/heat/tools/config/check_uptodate.sh
> --- /tmp/heat.DqIhK5/heat.conf.sample    2014-08-24 06:54:34.62884 +
> +++ etc/heat/heat.conf.sample    2014-08-24 06:53:51.54084 +
> @@ -159,10 +159,6 @@
>  # Size of RPC connection pool. (integer value)
>  #rpc_conn_pool_size=30
>  
> -# Modules of exceptions that are permitted to be recreated
> -# upon receiving exception data from an rpc call. (list value)
> -#allowed_rpc_exception_modules=oslo.messaging.exceptions,nova.exception,cinder.exception,exceptions
> -
>  # Qpid broker hostname. (string value)
>  #qpid_hostname=heat
>  
> @@ -301,15 +297,6 @@
>  # Heartbeat time-to-live. (integer value)
>  #matchmaker_heartbeat_ttl=600
>  
> -# Host to locate redis. (string value)
> -#host=127.0.0.1
> -
> -# Use this port to connect to redis host. (integer value)
> -#port=6379
> -
> -# Password for Redis server (optional). (string value)
> -#password=
> -
>  # Size of RPC greenthread pool. (integer value)
>  #rpc_thread_pool_size=64
>  
> @@ -1229,6 

Re: [openstack-dev] [qa][all][Heat] Packaging of functional tests

2014-08-26 Thread Clint Byrum
Excerpts from Steve Baker's message of 2014-08-26 14:25:46 -0700:
> On 27/08/14 03:18, David Kranz wrote:
> > On 08/26/2014 10:14 AM, Zane Bitter wrote:
> >> Steve Baker has started the process of moving Heat tests out of the
> >> Tempest repository and into the Heat repository, and we're looking
> >> for some guidance on how they should be packaged in a consistent way.
> >> Apparently there are a few projects already packaging functional
> >> tests in the package <project>.tests.functional (alongside
> >> <project>.tests.unit for the unit tests).
> >>
> >> That strikes me as odd in our context, because while the unit tests
> >> run against the code in the package in which they are embedded, the
> >> functional tests run against some entirely different code - whatever
> >> OpenStack cloud you give it the auth URL and credentials for. So
> >> these tests run from the outside, just like their ancestors in
> >> Tempest do.
> >>
> >> There's all kinds of potential confusion here for users and
> >> packagers. None of it is fatal and all of it can be worked around,
> >> but if we refrain from doing the thing that makes zero conceptual
> >> sense then there will be no problem to work around :)
> > Thanks, Zane. The point of moving functional tests to projects is to
> > be able to run more of them
> > in gate jobs for those projects, and allow tempest to survive being
> > stretched-to-breaking horizontally as we scale to more projects. At
> > the same time, there are benefits to the
> > tempest-as-all-in-one-functional-and-integration-suite that we should
> > try not to lose:
> >
> > 1. Strong integration testing without thinking too hard about the
> > actual dependencies
> > 2. Protection from mistaken or unwise api changes (tempest two-step
> > required)
> > 3. Exportability as a complete blackbox functional test suite that can
> > be used by Rally, RefStack, deployment validation, etc.
> >
> > I think (1) may be the most challenging because tests that are moved
> > out of tempest might be testing some integration that is not being
> > covered
> > by a scenario. We will need to make sure that tempest actually has a
> > complete enough set of tests to validate integration. Even if this is
> > all implemented in a way where tempest can see in-project tests as
> > "plugins", there will still not be time to run them all as part of
> > tempest on every commit to every project, so a selection will have to
> > be made.
> >
> > (2) is quite difficult. In Atlanta we talked about taking a copy of
> > functional tests into tempest for stable apis. I don't know how
> > workable that is but don't see any other real options except vigilance
> > in reviews of patches that change functional tests.
> >
> > (3) is what Zane was addressing. The in-project functional tests need
> > to be written in a way that they can, at least in some configuration,
> > run against a real cloud.
> >
> >
> >>
> >> I suspect from reading the previous thread about "In-tree functional
> >> test vision" that we may actually be dealing with three categories of
> >> test here rather than two:
> >>
> >> * Unit tests that run against the package they are embedded in
> >> * Functional tests that run against the package they are embedded in
> >> * Integration tests that run against a specified cloud
> >>
> >> i.e. the tests we are now trying to add to Heat might be
> >> qualitatively different from the <project>.tests.functional
> >> suites that already exist in a few projects. Perhaps someone from
> >> Neutron and/or Swift can confirm?
> > That seems right, except that I would call the third "functional
> > tests" and not "integration tests", because the purpose is not really
> > integration but deep testing of a particular service. Tempest would
> > continue to focus on integration testing. Is there some controversy
> > about that?
> > The second category could include whitebox tests.
> >
> > I don't know about swift, but in neutron the intent was to have these
> > tests be configurable to run against a real cloud, or not. Maru Newby
> > would have details.
> >>
> >> I'd like to propose that tests of the third type get their own
> >> top-level package with a name of the form
> >> <project>-integrationtests (second choice: <project>-tempest
> >> on the principle that they're essentially plugins for Tempest). How
> >> would people feel about standardising that across OpenStack?
> > +1 But I would not call it "integrationtests" for the reason given above.
> >
> Because all heat does is interact with other services, what we call
> functional tests are actually integration tests. Sure, we could mock at
> the REST API level, but integration coverage is what we need most. This

I'd call that "faking", not mocking, but both could apply.

> lets us verify things like:
> - how heat handles races in other services leading to resources going
> into ERROR

A fake that predictably fails (and thus tests failure handling) will
result in better coverage than a real service that only fails when that
real service is broken.
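
As a sketch of what I mean (names made up, this is not Heat's actual test
code):

    class FakeNovaServers(object):
        """A fake 'servers' API that breaks on a predictable call."""

        def __init__(self, fail_on_call=2):
            self.calls = 0
            self.fail_on_call = fail_on_call

        def create(self, name, image, flavor):
            self.calls += 1
            if self.calls == self.fail_on_call:
                # Deterministically simulate the resource going to ERROR,
                # so the error-handling path is exercised on every run.
                raise Exception('server went to ERROR')
            return {'id': 'fake-%d' % self.calls, 'status': 'ACTIVE'}

With a real nova you only hit that path when nova itself is having a bad
day.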

Re: [openstack-dev] [Heat] Heat Juno Mid-cycle Meetup report

2014-08-27 Thread Clint Byrum
Excerpts from Zane Bitter's message of 2014-08-27 08:41:29 -0700:
> On 27/08/14 11:04, Steven Hardy wrote:
> > On Wed, Aug 27, 2014 at 07:54:41PM +0530, Jyoti Ranjan wrote:
> >> I am little bit skeptical about using Swift for this use case because 
> >> of
> >> its eventual consistency issue. I am not sure Swift cluster is good to 
> >> be
> >> used for this kind of problem. Please note that Swift cluster may give 
> >> you
> >> old data at some point of time.
> >
> > This is probably not a major problem, but it's certainly worth considering.
> >
> > My assumption is that the latency of making the replicas consistent will be
> > small relative to the timeout for things like SoftwareDeployments, so all
> > we need is to ensure that instances  eventually get the new data, act on
> 
> That part is fine, but if they get the new data and then later get the 
> old data back again... that would not be so good.
> 

Agreed, and I had not considered that this can happen.

There is a not-so-simple answer though:

* Heat inserts this as initial metadata:

{"metadata": {}, "update-url": "xx", "version": 0}

* Polling goes to update-url and ignores metadata <= 0

* Polling finds new metadata in same format, and continues the loop
without talking to Heat
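
In rough python, the polling side would be something like this (sketch;
fetch_metadata() and apply_metadata() are stand-ins for the real transport
and the os-refresh-config run):

    import time

    def poll_loop(fetch_metadata, apply_metadata, interval=30):
        current_version = 0
        while True:
            doc = fetch_metadata()         # follows the current update-url
            if doc['version'] > current_version:
                apply_metadata(doc['metadata'])
                current_version = doc['version']
                # keep following whatever update-url the new doc points at
            time.sleep(interval)

Stale reads from swift just get ignored, because their version is not
newer than what we already acted on.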

However, this makes me rethink why we are having performance problems.
MOST of the performance problems have two root causes:

* We parse the entire stack to show metadata, because we have to see if
  there are custom access controls defined in any of the resources used.
  I actually worked on a patch set to deprecate this part of the resource
  plugin API because it is impossible to scale this way.
* We rely on the engine to respond because of the parsing issue.

If however we could just push metadata into the db fully resolved
whenever things in the stack change, and cache the response in the API
using Last-Modified/Etag headers, I think we'd be less inclined to care
so much about swift for polling. However we are still left with the many
thousands of keystone users being created vs. thousands of swift tempurls.
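
The polling side of that cache is cheap too; something like this (sketch
using requests, URL made up):

    import requests

    def poll_once(url, etag=None):
        # Conditional GET: the API only does real work when the fully
        # resolved metadata in the DB has actually changed.
        headers = {'If-None-Match': etag} if etag else {}
        resp = requests.get(url, headers=headers)
        if resp.status_code == 304:
            return None, etag              # nothing changed
        return resp.json(), resp.headers.get('ETag')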

That would also set us up nicely for very easy integration with Zaqar,
as metadata changes would flow naturally into the message queue for the
server through the same mechanism as they flow into the database.



Re: [openstack-dev] [Heat] Heat Juno Mid-cycle Meetup report

2014-08-27 Thread Clint Byrum
Excerpts from Steven Hardy's message of 2014-08-27 10:08:36 -0700:
> On Wed, Aug 27, 2014 at 09:40:31AM -0700, Clint Byrum wrote:
> > Excerpts from Zane Bitter's message of 2014-08-27 08:41:29 -0700:
> > > On 27/08/14 11:04, Steven Hardy wrote:
> > > > On Wed, Aug 27, 2014 at 07:54:41PM +0530, Jyoti Ranjan wrote:
> > > >> I am little bit skeptical about using Swift for this use case 
> > > >> because of
> > > >> its eventual consistency issue. I am not sure Swift cluster is 
> > > >> good to be
> > > >> used for this kind of problem. Please note that Swift cluster may 
> > > >> give you
> > > >> old data at some point of time.
> > > >
> > > > This is probably not a major problem, but it's certainly worth 
> > > > considering.
> > > >
> > > > My assumption is that the latency of making the replicas consistent 
> > > > will be
> > > > small relative to the timeout for things like SoftwareDeployments, so 
> > > > all
> > > > we need is to ensure that instances  eventually get the new data, act on
> > > 
> > > That part is fine, but if they get the new data and then later get the 
> > > old data back again... that would not be so good.
> > > 
> > 
> > Agreed, and I had not considered that this can happen.
> > 
> > There is a not-so-simple answer though:
> > 
> > * Heat inserts this as initial metadata:
> > 
> > {"metadata": {}, "update-url": "xx", "version": 0}
> > 
> > * Polling goes to update-url and ignores metadata <= 0
> > 
> > * Polling finds new metadata in same format, and continues the loop
> > without talking to Heat
> > 
> > However, this makes me rethink why we are having performance problems.
> > MOST of the performance problems have two root causes:
> > 
> > * We parse the entire stack to show metadata, because we have to see if
> >   there are custom access controls defined in any of the resources used.
> >   I actually worked on a patch set to deprecate this part of the resource
> >   plugin API because it is impossible to scale this way.
> > * We rely on the engine to respond because of the parsing issue.
> > 
> > If however we could just push metadata into the db fully resolved
> > whenever things in the stack change, and cache the response in the API
> > using Last-Modified/Etag headers, I think we'd be less inclined to care
> > so much about swift for polling. However we are still left with the many
> > thousands of keystone users being created vs. thousands of swift tempurls.
> 
> There's probably a few relatively simple optimisations we can do if the
> keystone user thing becomes the bottleneck:
> - Make the user an attribute of the stack and only create one per
>   stack/tree-of-stacks
> - Make the user an attribute of each server resource (probably more secure
>   but less optimal if your optimal is less keystone users).
> 
> I don't think the many keystone users thing is actually a problem right now
> though, or is it?

1000 servers means 1000 keystone users to manage, and all of the tokens
and backend churn that implies.

It's not "a problem", but it is quite a bit heavier than tempurls.



Re: [openstack-dev] [all] Design Summit reloaded

2014-08-27 Thread Clint Byrum
Excerpts from Thierry Carrez's message of 2014-08-27 05:51:55 -0700:
> Hi everyone,
> 
> I've been thinking about what changes we can bring to the Design Summit
> format to make it more productive. I've heard the feedback from the
> mid-cycle meetups and would like to apply some of those ideas for Paris,
> within the constraints we have (already booked space and time). Here is
> something we could do:
> 
> Day 1. Cross-project sessions / incubated projects / other projects
> 
> I think that worked well last time. 3 parallel rooms where we can
> address top cross-project questions, discuss the results of the various
> experiments we conducted during juno. Don't hesitate to schedule 2 slots
> for discussions, so that we have time to come to the bottom of those
> issues. Incubated projects (and maybe "other" projects, if space allows)
> occupy the remaining space on day 1, and could occupy "pods" on the
> other days.
> 

I like it. The only thing I would add is that it would be quite useful if
the use of pods were at least partially enhanced by an unconference style
interest list.  What I mean is, on day 1 have people suggest topics and
vote on suggested topics to discuss at the pods, and from then on the pods
can host these topics. This is for the "other" things that aren't well
defined until the summit and don't have their own rooms for days 2 and 3.

This is driven by the fact that the pods in Atlanta were almost always
busy doing something other than whatever the track that owned them
wanted. A few projects' pods grew to 30-40 people a few times, eating up
all the chairs for the surrounding pods. TripleO often sat at the Heat
pod because of this for instance.

I don't think they should be fully scheduled. They're also just great
places to gather and have a good discussion, but it would be useful to
plan for topic flexibility and help coalesce interested parties, rather
than have them be silos that get taken over randomly. Especially since
there is a temptation to push the "other" topics to them already.

> Day 2 and Day 3. Scheduled sessions for various programs
> 
> That's our traditional scheduled space. We'll have a 33% less slots
> available. So, rather than trying to cover all the scope, the idea would
> be to focus those sessions on specific issues which really require
> face-to-face discussion (which can't be solved on the ML or using spec
> discussion) *or* require a lot of user feedback. That way, appearing in
> the general schedule is very helpful. This will require us to be a lot
> stricter on what we accept there and what we don't -- we won't have
> space for courtesy sessions anymore, and traditional/unnecessary
> sessions (like my traditional "release schedule" one) should just move
> to the mailing-list.
> 
> Day 4. Contributors meetups
> 
> On the last day, we could try to split the space so that we can conduct
> parallel midcycle-meetup-like contributors gatherings, with no time
> boundaries and an open agenda. Large projects could get a full day,
> smaller projects would get half a day (but could continue the discussion
> in a local bar). Ideally that meetup would end with some alignment on
> release goals, but the idea is to make the best of that time together to
> solve the issues you have. Friday would finish with the design summit
> feedback session, for those who are still around.
> 

Love this. Please, if we can also fully enclose these meetups and the
session rooms in dry erase boards, that would be ideal.



Re: [openstack-dev] [all] Design Summit reloaded

2014-08-27 Thread Clint Byrum
Excerpts from Sean Dague's message of 2014-08-27 06:26:38 -0700:
> On 08/27/2014 08:51 AM, Thierry Carrez wrote:
> > Hi everyone,
> > 
> > I've been thinking about what changes we can bring to the Design Summit
> > format to make it more productive. I've heard the feedback from the
> > mid-cycle meetups and would like to apply some of those ideas for Paris,
> > within the constraints we have (already booked space and time). Here is
> > something we could do:
> > 
> > Day 1. Cross-project sessions / incubated projects / other projects
> > 
> > I think that worked well last time. 3 parallel rooms where we can
> > address top cross-project questions, discuss the results of the various
> > experiments we conducted during juno. Don't hesitate to schedule 2 slots
> > for discussions, so that we have time to come to the bottom of those
> > issues. Incubated projects (and maybe "other" projects, if space allows)
> > occupy the remaining space on day 1, and could occupy "pods" on the
> > other days.
> > 
> > Day 2 and Day 3. Scheduled sessions for various programs
> > 
> > That's our traditional scheduled space. We'll have a 33% less slots
> > available. So, rather than trying to cover all the scope, the idea would
> > be to focus those sessions on specific issues which really require
> > face-to-face discussion (which can't be solved on the ML or using spec
> > discussion) *or* require a lot of user feedback. That way, appearing in
> > the general schedule is very helpful. This will require us to be a lot
> > stricter on what we accept there and what we don't -- we won't have
> > space for courtesy sessions anymore, and traditional/unnecessary
> > sessions (like my traditional "release schedule" one) should just move
> > to the mailing-list.
> > 
> > Day 4. Contributors meetups
> > 
> > On the last day, we could try to split the space so that we can conduct
> > parallel midcycle-meetup-like contributors gatherings, with no time
> > boundaries and an open agenda. Large projects could get a full day,
> > smaller projects would get half a day (but could continue the discussion
> > in a local bar). Ideally that meetup would end with some alignment on
> > release goals, but the idea is to make the best of that time together to
> > solve the issues you have. Friday would finish with the design summit
> > feedback session, for those who are still around.
> > 
> > 
> > I think this proposal makes the best use of our setup: discuss clear
> > cross-project issues, address key specific topics which need
> > face-to-face time and broader attendance, then try to replicate the
> > success of midcycle meetup-like open unscheduled time to discuss
> > whatever is hot at this point.
> > 
> > There are still details to work out (is it possible split the space,
> > should we use the usual design summit CFP website to organize the
> > "scheduled" time...), but I would first like to have your feedback on
> > this format. Also if you have alternative proposals that would make a
> > better use of our 4 days, let me know.
> 
> I definitely like this approach. I think it will be really interesting
> to collect feedback from people about the value they got from days 2 & 3
> vs. Day 4.
> 
> I also wonder if we should lose a slot from days 1 - 3 and expand the
> hallway time. Hallway track is always pretty interesting, and honestly
> at a lot of interesting ideas spring up. The 10 minute transitions often
> seem to feel like you are rushing between places too quickly some times.

Yes please. I'd also be fine with just giving back 5 minutes from each
session to facilitate this.



Re: [openstack-dev] [all] Design Summit reloaded

2014-08-27 Thread Clint Byrum
Excerpts from Anita Kuno's message of 2014-08-27 13:48:25 -0700:
> On 08/27/2014 02:46 PM, John Griffith wrote:
> > On Wed, Aug 27, 2014 at 9:25 AM, Flavio Percoco  wrote:
> > 
> >> On 08/27/2014 03:26 PM, Sean Dague wrote:
> >>> On 08/27/2014 08:51 AM, Thierry Carrez wrote:
>  Hi everyone,
> 
>  I've been thinking about what changes we can bring to the Design Summit
>  format to make it more productive. I've heard the feedback from the
>  mid-cycle meetups and would like to apply some of those ideas for Paris,
>  within the constraints we have (already booked space and time). Here is
>  something we could do:
> 
>  Day 1. Cross-project sessions / incubated projects / other projects
> 
>  I think that worked well last time. 3 parallel rooms where we can
>  address top cross-project questions, discuss the results of the various
>  experiments we conducted during juno. Don't hesitate to schedule 2 slots
>  for discussions, so that we have time to come to the bottom of those
>  issues. Incubated projects (and maybe "other" projects, if space allows)
>  occupy the remaining space on day 1, and could occupy "pods" on the
>  other days.
> 
>  Day 2 and Day 3. Scheduled sessions for various programs
> 
>  That's our traditional scheduled space. We'll have a 33% less slots
>  available. So, rather than trying to cover all the scope, the idea would
>  be to focus those sessions on specific issues which really require
>  face-to-face discussion (which can't be solved on the ML or using spec
>  discussion) *or* require a lot of user feedback. That way, appearing in
>  the general schedule is very helpful. This will require us to be a lot
>  stricter on what we accept there and what we don't -- we won't have
>  space for courtesy sessions anymore, and traditional/unnecessary
>  sessions (like my traditional "release schedule" one) should just move
>  to the mailing-list.
> 
>  Day 4. Contributors meetups
> 
>  On the last day, we could try to split the space so that we can conduct
>  parallel midcycle-meetup-like contributors gatherings, with no time
>  boundaries and an open agenda. Large projects could get a full day,
>  smaller projects would get half a day (but could continue the discussion
>  in a local bar). Ideally that meetup would end with some alignment on
>  release goals, but the idea is to make the best of that time together to
>  solve the issues you have. Friday would finish with the design summit
>  feedback session, for those who are still around.
> 
> 
>  I think this proposal makes the best use of our setup: discuss clear
>  cross-project issues, address key specific topics which need
>  face-to-face time and broader attendance, then try to replicate the
>  success of midcycle meetup-like open unscheduled time to discuss
>  whatever is hot at this point.
> 
>  There are still details to work out (is it possible split the space,
>  should we use the usual design summit CFP website to organize the
>  "scheduled" time...), but I would first like to have your feedback on
>  this format. Also if you have alternative proposals that would make a
>  better use of our 4 days, let me know.
> >>>
> >>> I definitely like this approach. I think it will be really interesting
> >>> to collect feedback from people about the value they got from days 2 & 3
> >>> vs. Day 4.
> >>>
> >>> I also wonder if we should lose a slot from days 1 - 3 and expand the
> >>> hallway time. Hallway track is always pretty interesting, and honestly
> >>> a lot of interesting ideas spring up. The 10 minute transitions often
> >>> seem to feel like you are rushing between places too quickly some times.
> >>
> >> +1
> >>
> >> Last summit, it was basically impossible to do any hallway talking and
> >> even meet some folks face-2-face.
> >>
> >> Other than that, I think the proposal is great and makes sense to me.
> >>
> >> Flavio
> >>
> >> --
> >> @flaper87
> >> Flavio Percoco
> >>
> >>
> > Sounds like a great idea to me:
> > +1
> > 
> > 
> > 
> > 
> I think this is a great direction.
> 
> Here is my dilemma and it might just affect me. I attended 3 mid-cycles
> this release: one of Neutron's (there were 2), QA/Infra and Cinder. The
> Neutron and Cinder ones were mostly in pursuit of figuring out third
> party and exchanging information surrounding that (which I feel was
> successful). The QA/Infra one was, well even though I feel like I have
> 

Re: [openstack-dev] [Keystone][Marconi][Heat] Creating accounts in Keystone

2014-08-27 Thread Clint Byrum
Excerpts from Adam Young's message of 2014-08-24 20:17:34 -0700:
> On 08/23/2014 02:01 AM, Clint Byrum wrote:
> > I don't know how Zaqar does its magic, but I'd love to see simple signed
> > URLs rather than users/passwords. This would work for Heat as well. That
> > way we only have to pass in a single predictably formatted string.
> >
> > Excerpts from Zane Bitter's message of 2014-08-22 14:35:38 -0700:
> >> Here's an interesting fact about Zaqar (the project formerly known as
> >> Marconi) that I hadn't thought about before this week: it's probably the
> >> first OpenStack project where a major part of the API primarily faces
> 
> 
> 
> Nah, this is the direction we are headed. Service users (out of LDAP!) are
> going to be the norm with a recent feature added to Keystone:
> 
> 
> http://adam.younglogic.com/2014/08/getting-service-users-out-of-ldap/
> 

This complicates the case by requiring me to get tokens and present
them, to cache them, etc. I just want to fetch and/or send messages.



Re: [openstack-dev] [Spam] Re: [Openstack][TripleO] [Ironic] What if undercloud machines down, can we reboot overcloud machines?

2014-08-28 Thread Clint Byrum
Excerpts from Jyoti Ranjan's message of 2014-08-27 21:20:19 -0700:
> I do agree, but it creates an extra requirement for the undercloud if high
> availability is an important criterion. Because of this, the undercloud has
> to be there 24x7, 365 days a year, and to make it available we need to have
> HA for it as well. So you indirectly mean that the undercloud should also be
> designed with high availability in mind.

I'm worried that you may be overstating the needs of a typical cloud.

The undercloud needs to be able to reach a state of availability when
you need to boot boxes. Even if you are doing CD and _constantly_
rebooting boxes, you can take your undercloud down for an hour, as long
as it can be brought back up for emergencies.

However, Ironic has already been designed this way. I believe that
Ironic has a nice dynamic hash ring of server ownership, and if you
mark a conductor down, the other conductors will assume ownership of
the machines that it was holding. So the path to making this HA is
basically "add one more undercloud server."

Ironic experts, please tell me this is true, and not just something I
inserted into my own distorted version of reality to help me sleep at
night.



Re: [openstack-dev] [TripleO] Heat AWS WaitCondition's count

2014-08-29 Thread Clint Byrum
There are still a few lingering wait conditions. They should probably be
cleaned up from tripleo-heat-templates.

Excerpts from Pavlo Shchelokovskyy's message of 2014-08-28 02:26:16 -0700:
> Hi all,
> 
> the AWS::CloudFormation::WaitCondition resource in Heat allows to update
> the 'count' property, although in real AWS this is prohibited (
> https://bugs.launchpad.net/heat/+bug/1340100).
> 
> My question is: does TripleO still depend on this behavior of AWS
> WaitCondition in any way? I want to be sure that fixing the mentioned bug
> will not break TripleO.
> 
> Best regards,
> Pavlo Shchelokovskyy.



Re: [openstack-dev] [Zaqar] Comments on the concerns arose during the TC meeting

2014-09-04 Thread Clint Byrum
Excerpts from Flavio Percoco's message of 2014-09-04 00:08:47 -0700:
> Greetings,
> 
> Last Tuesday the TC held the first graduation review for Zaqar. During
> the meeting some concerns arose. I've listed those concerns below with
> some comments hoping that it will help starting a discussion before the
> next meeting. In addition, I've added some comments about the project
> stability at the bottom and an etherpad link pointing to a list of use
> cases for Zaqar.
> 

Hi Flavio. This was an interesting read. As somebody whose attention has
recently been drawn to Zaqar, I am quite interested in seeing it
graduate.

> # Concerns
> 
> - Concern on operational burden of requiring NoSQL deploy expertise to
> the mix of openstack operational skills
> 
> For those of you not familiar with Zaqar, it currently supports 2 nosql
> drivers - MongoDB and Redis - and those are the only 2 drivers it
> supports for now. This will require operators willing to use Zaqar to
> maintain a new (?) NoSQL technology in their system. Before expressing
> our thoughts on this matter, let me say that:
> 
> 1. By removing the SQLAlchemy driver, we basically removed the chance
> for operators to use an already deployed "OpenStack-technology"
> 2. Zaqar won't be backed by any AMQP based messaging technology for
> now. Here's[0] a summary of the research the team (mostly done by
> Victoria) did during Juno
> 3. We (OpenStack) used to require Redis for the zmq matchmaker
> 4. We (OpenStack) also use memcached for caching and as the oslo
> caching lib becomes available - or a wrapper on top of dogpile.cache -
> Redis may be used in place of memcached in more and more deployments.
> 5. Ceilometer's recommended storage driver is still MongoDB, although
> Ceilometer has now support for sqlalchemy. (Please correct me if I'm wrong).
> 
> That being said, it's obvious we already, to some extent, promote some
> NoSQL technologies. However, for the sake of the discussion, lets assume
> we don't.
> 
> I truly believe, with my OpenStack (not Zaqar's) hat on, that we can't
> keep avoiding these technologies. NoSQL technologies have been around
> for years and we should be prepared - including OpenStack operators - to
> support these technologies. Not every tool is good for all tasks - one
> of the reasons we removed the sqlalchemy driver in the first place -
> therefore it's impossible to keep an homogeneous environment for all
> services.
> 

I wholeheartedly agree that non-traditional storage technologies that
are becoming mainstream are good candidates for use cases where SQL-based
storage gets in the way. I wish there wasn't so much FUD
(warranted or not) about MongoDB, but that is the reality we live in.

> With this, I'm not suggesting to ignore the risks and the extra burden
> this adds but, instead of attempting to avoid it completely by not
> evolving the stack of services we provide, we should probably work on
> defining a reasonable subset of NoSQL services we are OK with
> supporting. This will help making the burden smaller and it'll give
> operators the option to choose.
> 
> [0] http://blog.flaper87.com/post/marconi-amqp-see-you-later/
> 
> 
> - Concern on should we really reinvent a queue system rather than
> piggyback on one
> 
> As mentioned in the meeting on Tuesday, Zaqar is not reinventing message
> brokers. Zaqar provides a service akin to SQS from AWS with an OpenStack
> flavor on top. [0]
> 

I think Zaqar is more like SMTP and IMAP than AMQP. You're not really
trying to connect two processes in real time. You're trying to do fully
asynchronous messaging with fully randomized access to any message.

Perhaps somebody should explore whether the approaches taken by large
scale IMAP providers could be applied to Zaqar.

Anyway, I can't imagine writing a system to intentionally use the
semantics of IMAP and SMTP. I'd be very interested in seeing actual use
cases for it, apologies if those have been posted before.

> Some things that differentiate Zaqar from SQS are its capability for
> supporting different protocols without sacrificing multi-tenancy and
> other intrinsic features it provides. Some protocols you may consider
> for Zaqar are: STOMP, MQTT.
> 
> As far as the backend goes, Zaqar is not re-inventing it either. It sits
> on top of existing storage technologies that have proven to be fast and
> reliable for this task. The choice of using NoSQL technologies has a lot
> to do with this particular thing and the fact that Zaqar needs a storage
> capable of scaling, replicating and good support for failover.
> 

What's odd to me is that other systems like Cassandra and Riak are not
being discussed. There are well documented large scale message storage
systems on both, and neither is encumbered by the same licensing FUD
as MongoDB.

Anyway, again, if we look at this as a place to store and retrieve
messages, and not as a queue, then talking about databases, instead of
message brokers, makes a lot more sense.

Re: [openstack-dev] [Zaqar] Comments on the concerns arose during the TC meeting

2014-09-04 Thread Clint Byrum
Excerpts from Flavio Percoco's message of 2014-09-04 06:01:45 -0700:
> On 09/04/2014 02:14 PM, Sean Dague wrote:
> > On 09/04/2014 03:08 AM, Flavio Percoco wrote:
> >> Greetings,
> >>
> >> Last Tuesday the TC held the first graduation review for Zaqar. During
> >> the meeting some concerns arose. I've listed those concerns below with
> >> some comments hoping that it will help starting a discussion before the
> >> next meeting. In addition, I've added some comments about the project
> >> stability at the bottom and an etherpad link pointing to a list of use
> >> cases for Zaqar.
> >>
> >> # Concerns
> >>
> >> - Concern on operational burden of requiring NoSQL deploy expertise to
> >> the mix of openstack operational skills
> >>
> >> For those of you not familiar with Zaqar, it currently supports 2 nosql
> >> drivers - MongoDB and Redis - and those are the only 2 drivers it
> >> supports for now. This will require operators willing to use Zaqar to
> >> maintain a new (?) NoSQL technology in their system. Before expressing
> >> our thoughts on this matter, let me say that:
> >>
> >> 1. By removing the SQLAlchemy driver, we basically removed the chance
> >> for operators to use an already deployed "OpenStack-technology"
> >> 2. Zaqar won't be backed by any AMQP based messaging technology for
> >> now. Here's[0] a summary of the research the team (mostly done by
> >> Victoria) did during Juno
> >> 3. We (OpenStack) used to require Redis for the zmq matchmaker
> >> 4. We (OpenStack) also use memcached for caching and as the oslo
> >> caching lib becomes available - or a wrapper on top of dogpile.cache -
> >> Redis may be used in place of memcached in more and more deployments.
> >> 5. Ceilometer's recommended storage driver is still MongoDB, although
> >> Ceilometer has now support for sqlalchemy. (Please correct me if I'm 
> >> wrong).
> >>
> >> That being said, it's obvious we already, to some extent, promote some
> >> NoSQL technologies. However, for the sake of the discussion, lets assume
> >> we don't.
> >>
> >> I truly believe, with my OpenStack (not Zaqar's) hat on, that we can't
> >> keep avoiding these technologies. NoSQL technologies have been around
> >> for years and we should be prepared - including OpenStack operators - to
> >> support these technologies. Not every tool is good for all tasks - one
> >> of the reasons we removed the sqlalchemy driver in the first place -
> >> therefore it's impossible to keep an homogeneous environment for all
> >> services.
> >>
> >> With this, I'm not suggesting to ignore the risks and the extra burden
> >> this adds but, instead of attempting to avoid it completely by not
> >> evolving the stack of services we provide, we should probably work on
> >> defining a reasonable subset of NoSQL services we are OK with
> >> supporting. This will help making the burden smaller and it'll give
> >> operators the option to choose.
> >>
> >> [0] http://blog.flaper87.com/post/marconi-amqp-see-you-later/
> > 
> > I've been one of the consistent voices concerned about a hard
> > requirement on adding NoSQL into the mix. So I'll explain that thinking
> > a bit more.
> > 
> > I feel like when the TC makes an integration decision previously this
> > has been about evaluating the project applying for integration, and if
> > they met some specific criteria they were told about some time in the
> > past. I think that's the wrong approach. It's a locally optimized
> > approach that fails to ask the more interesting question.
> > 
> > Is OpenStack better as a whole if this is a mandatory component of
> > OpenStack? Better being defined as technically better (more features,
> > less janky code work arounds, less unexpected behavior from the stack).
> > Better from the sense of easier or harder to run an actual cloud by our
> > Operators (taking into account what kinds of moving parts they are now
> > expected to manage). Better from the sense of a better user experience
> > in interacting with OpenStack as whole. Better from a sense that the
> > OpenStack release will experience less bugs, less unexpected cross
> > project interactions, an a greater overall feel of consistency so that
> > the OpenStack API feels like one thing.
> > 
> > https://dague.net/2014/08/26/openstack-as-layers/
> > 
> > One of the interesting qualities of Layers 1 & 2 is they all follow an
> > AMQP + RDBMS pattern (excepting swift). You can have a very effective
> > IaaS out of that stack. They are the things that you can provide pretty
> > solid integration testing on (and if you look at where everything stood
> > before the new TC mandates on testing / upgrade that was basically what
> > was getting integration tested). (Also note, I'll accept Barbican is
> > probably in the wrong layer, and should be a Layer 2 service.)
> > 
> > While large shops can afford to have a dedicated team to figure out how
> > to make mongo or redis HA, provide monitoring, have a DR plan for when a
> > huricane r

Re: [openstack-dev] [Zaqar] Comments on the concerns arose during the TC meeting

2014-09-04 Thread Clint Byrum
Excerpts from Flavio Percoco's message of 2014-09-04 02:11:15 -0700:
> Hey Clint,
> 
> Thanks for reading, some comments in-line:
> 
> On 09/04/2014 10:30 AM, Clint Byrum wrote:
> > Excerpts from Flavio Percoco's message of 2014-09-04 00:08:47 -0700:
> 
> [snip]
> 
> >> - Concern on should we really reinvent a queue system rather than
> >> piggyback on one
> >>
> >> As mentioned in the meeting on Tuesday, Zaqar is not reinventing message
> >> brokers. Zaqar provides a service akin to SQS from AWS with an OpenStack
> >> flavor on top. [0]
> >>
> > 
> > I think Zaqar is more like SMTP and IMAP than AMQP. You're not really
> > trying to connect two processes in real time. You're trying to do fully
> > asynchronous messaging with fully randomized access to any message.
> > 
> > Perhaps somebody should explore whether the approaches taken by large
> > scale IMAP providers could be applied to Zaqar.
> > 
> > Anyway, I can't imagine writing a system to intentionally use the
> > semantics of IMAP and SMTP. I'd be very interested in seeing actual use
> > cases for it, apologies if those have been posted before.
> > 
> >> Some things that differentiate Zaqar from SQS are its capability for
> >> supporting different protocols without sacrificing multi-tenancy and
> >> other intrinsic features it provides. Some protocols you may consider
> >> for Zaqar are: STOMP, MQTT.
> >>
> >> As far as the backend goes, Zaqar is not re-inventing it either. It sits
> >> on top of existing storage technologies that have proven to be fast and
> >> reliable for this task. The choice of using NoSQL technologies has a lot
> >> to do with this particular thing and the fact that Zaqar needs a storage
> >> capable of scaling, replicating and good support for failover.
> >>
> > 
> > What's odd to me is that other systems like Cassandra and Riak are not
> > being discussed. There are well documented large scale message storage
> > systems on both, and neither is encumbered by the same licensing FUD
> > as MongoDB.
> 
> FWIW, they both have been discussed. As far as Cassandra goes, we raised
> the red flag after reading this post[0]. The post itself may be
> obsolete already but I don't think I have enough knowledge about
> Cassandra to actually figure this out. Some folks have come to us asking
> for a Cassandra driver and they were interested in contributing/working
> on one. I really hope that will happen someday, although it'll certainly
> happen as an external driver. Riak, on the other hand, was certainly a
> good candidate. What made us go with MongoDB and Redis is they're both
> good for the job, they are both likely already deployed in OpenStack
> clouds and we have enough knowledge to provide support and maintenance
> for both drivers.

It seems like Cassandra is good for when you're going to be writing all
the time but only reading once. I would agree that this makes it less
attractive for a generalized messaging platform, since you won't know
how users will consume the messages, and if they are constantly
reading then you'll have terrible performance.

> >> # Use Cases
> >>
> >> In addition to the aforementioned concerns and comments, I also would
> >> like to share an etherpad that contains some use cases that other
> >> integrated projects have for Zaqar[0]. The list is not exhaustive and
> >> it'll contain more information before the next meeting.
> >>
> >> [0] https://etherpad.openstack.org/p/zaqar-integrated-projects-use-cases
> >>
> > 
> > Just taking a look, there are two basic applications needed:
> > 
> > 1) An inbox. Horizon wants to know when snapshots are done. Heat wants
> > to know what happened during a stack action. Etc.
> > 
> > 2) A user-focused message queue. Heat wants to push data to agents.
> > Swift wants to synchronize processes when things happen.
> > 
> > To me, #1 is Zaqar as it is today. #2 is the one that I worry may not
> > be served best by bending #1 onto it.
> 
> Push semantics are being developed. We've had enough discussions that
> have helped preparing the ground for it. However, I believe both use
> cases could be covered by Zaqar as-is.
> 
> Could you elaborate a bit more on #2? Especially on why you think Zaqar
> as is can't serve this specific case?

The difference between 1 and 2 is that 2 is a true queueing problem. The
message should go away when it has been consumed, and the volume may be
rather high. With 1, you have a storage problem, and a database makes a
lot more sense. If users can stick to type 2 problems, they'll be able
to stay much more lightweight because they won't need a large data store
that supports random access.
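
To make the distinction concrete, here is a rough sketch of the two access
patterns against a Zaqar-like HTTP API. The endpoint paths, headers and
payloads are illustrative guesses, not the actual Zaqar API:

import requests

API = 'http://zaqar.example.org:8888/v1'    # hypothetical endpoint
HDRS = {'Client-ID': 'demo-client', 'X-Auth-Token': 'TOKEN'}  # placeholders


def consume_as_queue(queue):
    """Type 2: true queueing. Claim a batch, do the work, delete (ack).

    Once a message is acked it is gone; nothing ever addresses it again.
    """
    resp = requests.post('%s/queues/%s/claims' % (API, queue),
                         json={'ttl': 300, 'grace': 60}, headers=HDRS)
    for msg in (resp.json() if resp.status_code == 201 else []):
        do_work(msg['body'])
        # Deleting the claimed message is the ack.
        requests.delete('http://zaqar.example.org:8888' + msg['href'],
                        headers=HDRS)


def read_from_inbox(queue, message_id):
    """Type 1: an inbox. Messages are durable and randomly addressable."""
    resp = requests.get('%s/queues/%s/messages/%s' % (API, queue, message_id),
                        headers=HDRS)
    return resp.json()['body'] if resp.ok else None


def do_work(body):
    print(body)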



Re: [openstack-dev] memory usage in devstack-gate (the oom-killer strikes again)

2014-09-08 Thread Clint Byrum
Excerpts from Joe Gordon's message of 2014-09-08 15:24:29 -0700:
> Hi All,
> 
> We have recently started seeing assorted memory issues in the gate
> including the oom-killer [0] and libvirt throwing memory errors [1].
> Luckily we run ps and dstat on every devstack run so we have some insight
> into why we are running out of memory. Based on the output from a job taken
> at random [2][3] a typical run consists of:
> 
> * 68 openstack api processes alone
> * the following services are running 8 processes (number of CPUs on test
> nodes)
>   * nova-api (we actually run 24 of these, 8 compute, 8 EC2, 8 metadata)
>   * nova-conductor
>   * cinder-api
>   * glance-api
>   * trove-api
>   * glance-registry
>   * trove-conductor
> * together nova-api, nova-conductor, cinder-api alone take over 45 %MEM
> (note: some of that memory usage is counted multiple times as RSS
> includes shared libraries)
> * based on dstat numbers, it looks like we don't use that much memory
> before tempest runs, and after tempest runs we use a lot of memory.
> 
> Based on this information I have two categories of questions:
> 
> 1) Should we explicitly set the number of workers that services use in
> devstack? Why have so many workers in a small all-in-one environment? What
> is the right balance here?

I'm kind of wondering why we aren't pushing everything to go the same
direction keystone did with apache. I may be crazy but apache gives us
all kinds of tools to tune around process forking that we'll have to
reinvent in our own daemon bits (like MaxRequestsPerChild to prevent
leaky or slow GC from eating all our memory over time).

Meanwhile, the idea behind running API processes with ncpu is that we don't
want to block an API request if there is a CPU available to it. Of
course if we have enough cinder, nova, keystone, trove, etc. requests
all at one time that we do need to block, we defer to the CPU scheduler
of the box to do it, rather than queue things up at the event level.
This can lead to quite ugly CPU starvation issues, and that is a lot
easier to tune for if you have one tuning knob for apache + mod_wsgi
instead of nservices.

In production systems I'd hope that memory would be quite a bit more
available than on the bazillions of cloud instances that run tests. So,
while process-per-cpu-per-service is a large percentage of 8G, it is
a very small percentage of 24G+, which is a pretty normal amount of
memory to have on an all-in-one type of server that one might choose
as a baremetal controller. For VMs that are handling production loads,
it's a pretty easy trade-off to give them a little more RAM so they can
take advantage of all the CPU's as needed.

All this to say, since devstack is always expected to be run in a dev
context, and not production, I think it would make sense to dial it
back to 4 from ncpu.

> 
> 2) Should we be worried that some OpenStack services such as nova-api,
> nova-conductor and cinder-api take up so much memory? Does there memory
> usage keep growing over time, does anyone have any numbers to answer this?
> Why do these processes take up so much memory?

Yes I do think we should be worried that they grow quite a bit. I've
experienced this problem a few times in a few scripting languages, and
almost every time it turned out to be too much data being read from
the database or MQ. Moving to tighter messages, and tighter database
interaction, nearly always results in less wasted RAM.

I like the other suggestion to start graphing this. Since we have all
that dstat data, I wonder if we can just process that directly into
graphite.
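
As a rough sketch of what that could look like, assuming dstat's --output
CSV format and carbon's plaintext protocol on port 2003 (the host, file
path and metric prefix are placeholders):

import csv
import socket
import time

CARBON = ('graphite.example.org', 2003)    # placeholder carbon endpoint


def push_dstat_csv(path, prefix='devstack.dstat'):
    """Replay a dstat --output CSV into graphite's plaintext protocol."""
    with open(path) as f:
        rows = list(csv.reader(f))
    # dstat CSVs start with a few banner lines; assume the column header row
    # is the first one containing the cpu columns ("usr", "sys", ...). Note
    # that column names can repeat across categories, so a real version
    # would fold the category row into the metric name.
    header_idx = next(i for i, row in enumerate(rows) if 'usr' in row)
    headers = [h.strip().replace(' ', '_') or ('col%d' % i)
               for i, h in enumerate(rows[header_idx])]
    sock = socket.create_connection(CARBON)
    ts = int(time.time())
    for row in rows[header_idx + 1:]:
        for name, value in zip(headers, row):
            try:
                line = '%s.%s %f %d\n' % (prefix, name, float(value), ts)
            except ValueError:
                continue    # non-numeric column, skip it
            sock.sendall(line.encode())
        ts += 1             # dstat samples once per second by default
    sock.close()


if __name__ == '__main__':
    push_dstat_csv('dstat.csv')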



Re: [openstack-dev] [All] Maintenance mode in OpenStack during patching/upgrades

2014-09-09 Thread Clint Byrum
Excerpts from Mike Scherbakov's message of 2014-09-09 00:35:09 -0700:
> Hi all,
> please see the original email below from Dmitry. I've modified the
> subject to bring larger audience to the issue.
> 
> I'd like to split the issue into two parts:
> 
>1. Maintenance mode for OpenStack controllers in HA mode (HA-ed
>Keystone, Glance, etc.)
>2. Maintenance mode for OpenStack computes/storage nodes (no HA)
> 
> For the first category, we might not need to have maintenance mode at all. For
> example, if we apply patching/upgrades node by node to a 3-node HA cluster,
> 2 nodes will serve requests normally. Is that possible for our HA solutions
> in Fuel, TripleO, other frameworks?

You may have a broken cloud if you are pushing out an update that
requires a new schema. Some services are better than others about
handling old schemas, and can be upgraded before doing schema upgrades.
But most of the time you have to take at least a brief downtime:

 * turn off DB accessing services
 * update code
 * run db migration
 * turn on DB accessing services

It is for this very reason, I believe, that Turbo Hipster was added to
the gate, so that deployers running against the upstream master branches
can have a chance at performing these upgrades in a reasonable amount of
time.

> 
> For second category, can not we simply do "nova-manage service disable...",
> so scheduler will simply stop scheduling new workloads on particular host
> which we want to do maintenance on?
> 

You probably would want 'nova host-servers-migrate <host>' at that
point, assuming you have migration set up.

http://docs.openstack.org/user-guide/content/novaclient_commands.html
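
For the archives, a rough sketch of that flow with python-novaclient; the
credentials and host name are placeholders, and live migration is assumed
to be configured:

from novaclient import client

# Placeholders: real admin credentials and your keystone endpoint go here.
nova = client.Client('2', 'admin', 'PASSWORD', 'admin',
                     'http://keystone.example.org:5000/v2.0')

host = 'compute-01'

# Stop scheduling new workloads onto the host...
nova.services.disable(host, 'nova-compute')

# ...then move what is already running there somewhere else.
for server in nova.servers.list(search_opts={'host': host, 'all_tenants': 1}):
    server.live_migrate()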

> On Thu, Aug 28, 2014 at 6:44 PM, Dmitry Pyzhov  wrote:
> 
> > All,
> >
> > I'm not sure if it deserves to be mentioned in our documentation, this
> > seems to be a common practice. If an administrator wants to patch his
> > environment, he should be prepared for a temporary downtime of OpenStack
> > services. And he should plan to perform patching in advance: choose a time
> > with minimal load and warn users about possible interruptions of service
> > availability.
> >
> > Our current implementation of patching does not protect from downtime
> > during the patching procedure. HA deployments seems to be more or less
> > stable. But it looks like it is possible to schedule an action on a compute
> > node and get an error because of service restart. Deployments with one
> > controller... well, you won’t be able to use your cluster until the
> > patching is finished. There is no way to get rid of downtime here.
> >
> > As I understand, we can get rid of possible issues with computes in HA.
> > But it will require migration of instances and stopping of nova-compute
> > service before patching. And it will make the overall patching procedure
> > much longer. Do we want to investigate this process?
> >
> >
> >
> 



Re: [openstack-dev] [Zaqar] Comments on the concerns arose during the TC meeting

2014-09-09 Thread Clint Byrum
Excerpts from Samuel Merritt's message of 2014-09-09 16:12:09 -0700:
> On 9/9/14, 12:03 PM, Monty Taylor wrote:
> > On 09/04/2014 01:30 AM, Clint Byrum wrote:
> >> Excerpts from Flavio Percoco's message of 2014-09-04 00:08:47 -0700:
> >>> Greetings,
> >>>
> >>> Last Tuesday the TC held the first graduation review for Zaqar. During
> >>> the meeting some concerns arose. I've listed those concerns below with
> >>> some comments hoping that it will help starting a discussion before the
> >>> next meeting. In addition, I've added some comments about the project
> >>> stability at the bottom and an etherpad link pointing to a list of use
> >>> cases for Zaqar.
> >>>
> >>
> >> Hi Flavio. This was an interesting read. As somebody whose attention has
> >> recently been drawn to Zaqar, I am quite interested in seeing it
> >> graduate.
> >>
> >>> # Concerns
> >>>
> >>> - Concern on operational burden of requiring NoSQL deploy expertise to
> >>> the mix of openstack operational skills
> >>>
> >>> For those of you not familiar with Zaqar, it currently supports 2 nosql
> >>> drivers - MongoDB and Redis - and those are the only 2 drivers it
> >>> supports for now. This will require operators willing to use Zaqar to
> >>> maintain a new (?) NoSQL technology in their system. Before expressing
> >>> our thoughts on this matter, let me say that:
> >>>
> >>>  1. By removing the SQLAlchemy driver, we basically removed the
> >>> chance
> >>> for operators to use an already deployed "OpenStack-technology"
> >>>  2. Zaqar won't be backed by any AMQP based messaging technology for
> >>> now. Here's[0] a summary of the research the team (mostly done by
> >>> Victoria) did during Juno
> >>>  3. We (OpenStack) used to require Redis for the zmq matchmaker
> >>>  4. We (OpenStack) also use memcached for caching and as the oslo
> >>> caching lib becomes available - or a wrapper on top of dogpile.cache -
> >>> Redis may be used in place of memcached in more and more deployments.
> >>>  5. Ceilometer's recommended storage driver is still MongoDB,
> >>> although
> >>> Ceilometer has now support for sqlalchemy. (Please correct me if I'm
> >>> wrong).
> >>>
> >>> That being said, it's obvious we already, to some extent, promote some
> >>> NoSQL technologies. However, for the sake of the discussion, lets assume
> >>> we don't.
> >>>
> >>> I truly believe, with my OpenStack (not Zaqar's) hat on, that we can't
> >>> keep avoiding these technologies. NoSQL technologies have been around
> >>> for years and we should be prepared - including OpenStack operators - to
> >>> support these technologies. Not every tool is good for all tasks - one
> >>> of the reasons we removed the sqlalchemy driver in the first place -
> >>> therefore it's impossible to keep an homogeneous environment for all
> >>> services.
> >>>
> >>
> >> I whole heartedly agree that non traditional storage technologies that
> >> are becoming mainstream are good candidates for use cases where SQL
> >> based storage gets in the way. I wish there wasn't so much FUD
> >> (warranted or not) about MongoDB, but that is the reality we live in.
> >>
> >>> With this, I'm not suggesting to ignore the risks and the extra burden
> >>> this adds but, instead of attempting to avoid it completely by not
> >>> evolving the stack of services we provide, we should probably work on
> >>> defining a reasonable subset of NoSQL services we are OK with
> >>> supporting. This will help making the burden smaller and it'll give
> >>> operators the option to choose.
> >>>
> >>> [0] http://blog.flaper87.com/post/marconi-amqp-see-you-later/
> >>>
> >>>
> >>> - Concern on should we really reinvent a queue system rather than
> >>> piggyback on one
> >>>
> >>> As mentioned in the meeting on Tuesday, Zaqar is not reinventing message
> >>> brokers. Zaqar provides a service akin to SQS from AWS with an OpenStack
> >>> flavor on top. [0]
> >>>
> >>
> >> I think Zaqar is more like SMTP and IMAP than AMQP. You're not 

Re: [openstack-dev] [Zaqar] Comments on the concerns arose during the TC meeting

2014-09-09 Thread Clint Byrum
Excerpts from Devananda van der Veen's message of 2014-09-09 16:47:27 -0700:
> On Tue, Sep 9, 2014 at 4:12 PM, Samuel Merritt  wrote:
> > On 9/9/14, 12:03 PM, Monty Taylor wrote:
> [snip]
> >> So which is it? Because it sounds like to me it's a thing that actually
> >> does NOT need to diverge in technology in any way, but that I've been
> >> told that it needs to diverge because it's delivering a different set of
> >> features - and I'm pretty sure if it _is_ the thing that needs to
> >> diverge in technology because of its feature set, then it's a thing I
> >> don't think we should be implementing in python in OpenStack because it
> >> already exists and it's called AMQP.
> >
> >
> > Whether Zaqar is more like AMQP or more like email is a really strange
> > metric to use for considering its inclusion.
> >
> 
> I don't find this strange at all -- I had been judging the technical
> merits of Zaqar (ex-Marconi) for the last ~18 months based on the
> understanding that it aimed to provide Queueing-as-a-Service, and
> found its delivery of that to be lacking on technical grounds. The
> implementation did not meet my view of what a queue service should
> provide; it is based on some serious antipatterns (storing a queue in
> an RDBMS is probably the most obvious); and in fact, it isn't even
> queue-like in the access patterns enabled by the REST API (random
> access to a set != a queue). That was the basis for a large part of my
> objections to the project over time, and a source of frustration for
> me as the developers justified many of their positions rather than
> accepted feedback and changed course during the incubation period. The
> reason for this seems clear now...
> 
> As was pointed out in the TC meeting today, Zaqar is (was?) actually
> aiming to provide Messaging-as-a-Service -- not queueing as a service!
> This is another way of saying "it's more like email and less like
> AMQP", which means my but-its-not-a-queue objection to the project's
> graduation is irrelevant, and I need to rethink about all my previous
> assessments of the project.

Well said.



Re: [openstack-dev] [Zaqar] Comments on the concerns arose during the TC meeting

2014-09-09 Thread Clint Byrum
Excerpts from Samuel Merritt's message of 2014-09-09 19:04:58 -0700:
> On 9/9/14, 4:47 PM, Devananda van der Veen wrote:
> > On Tue, Sep 9, 2014 at 4:12 PM, Samuel Merritt  wrote:
> >> On 9/9/14, 12:03 PM, Monty Taylor wrote:
> > [snip]
> >>> So which is it? Because it sounds like to me it's a thing that actually
> >>> does NOT need to diverge in technology in any way, but that I've been
> >>> told that it needs to diverge because it's delivering a different set of
> >>> features - and I'm pretty sure if it _is_ the thing that needs to
> >>> diverge in technology because of its feature set, then it's a thing I
> >>> don't think we should be implementing in python in OpenStack because it
> >>> already exists and it's called AMQP.
> >>
> >>
> >> Whether Zaqar is more like AMQP or more like email is a really strange
> >> metric to use for considering its inclusion.
> >>
> >
> > I don't find this strange at all -- I had been judging the technical
> > merits of Zaqar (ex-Marconi) for the last ~18 months based on the
> > understanding that it aimed to provide Queueing-as-a-Service, and
> > found its delivery of that to be lacking on technical grounds. The
> > implementation did not meet my view of what a queue service should
> > provide; it is based on some serious antipatterns (storing a queue in
> > an RDBMS is probably the most obvious); and in fact, it isn't even
> > queue-like in the access patterns enabled by the REST API (random
> > access to a set != a queue). That was the basis for a large part of my
> > objections to the project over time, and a source of frustration for
> > me as the developers justified many of their positions rather than
> > accepted feedback and changed course during the incubation period. The
> > reason for this seems clear now...
> >
> > As was pointed out in the TC meeting today, Zaqar is (was?) actually
> > aiming to provide Messaging-as-a-Service -- not queueing as a service!
> > This is another way of saying "it's more like email and less like
> > AMQP", which means my but-its-not-a-queue objection to the project's
> > graduation is irrelevant, and I need to rethink about all my previous
> > assessments of the project.
> >
> > The questions now before us are:
> > - should OpenStack include, in the integrated release, a
> > messaging-as-a-service component?
> 
> I certainly think so. I've worked on a few reasonable-scale web 
> applications, and they all followed the same pattern: HTTP app servers 
> serving requests quickly, background workers for long-running tasks, and 
> some sort of durable message-broker/queue-server thing for conveying 
> work from the first to the second.
> 
> A quick straw poll of my nearby coworkers shows that every non-trivial 
> web application that they've worked on in the last decade follows the 
> same pattern.
> 
> While not *every* application needs such a thing, web apps are quite 
> common these days, and Zaqar satisfies one of their big requirements. 
> Not only that, it does so in a way that requires much less babysitting 
> than run-your-own-broker does.
> 

I think you missed the distinction.

What you describe is _message queueing_. Not messaging. The difference
being the durability and addressability of each message.

As Devananda pointed out, a queue doesn't allow addressing the items in
the queue directly. You can generally only send, receive, ACK, or NACK.



Re: [openstack-dev] [Heat] convergence flow diagrams

2014-09-09 Thread Clint Byrum
Excerpts from Angus Salkeld's message of 2014-09-08 17:15:04 -0700:
> On Mon, Sep 8, 2014 at 11:22 PM, Tyagi, Ishant  wrote:
> 
> >  Hi All,
> >
> >
> >
> > As per the heat mid cycle meetup whiteboard, we have created the
> > flowchart and sequence diagram for the convergence. Can you please review
> > these diagrams and provide your feedback?
> >
> >
> >
> > https://www.dropbox.com/sh/i8qbjtgfdxn4zx4/AAC6J-Nps8J12TzfuCut49ioa?dl=0
> >
> >
> Great! Good to see something.
> 
> 
> I was expecting something like:
> engine ~= like nova-conductor (it's the only process that talks to the db -
> make upgrading easier)

This complicates things immensely. The engine can just be the workers
too, we're just not going to do the observing and converging in the same
greenthread.

> observer - purely gets the actual state/properties and writes then to the
> db (via engine)

If you look closely at the diagrams, that's what it does.

> worker - has a "job" queue and grinds away at running those (resource
> actions)
> 

The convergence worker is just another set of RPC API calls that split
out work into isolated chunks.

> Then engine then "triggers" on differences on goal vs. actual state and
> create a job and sends it to the job queue.

Remember, we're not targeting continuous convergence yet. Just
convergence when we ask for things.

> - so, on create it sees there is no actual state so it sends a create job
> for the first resource to the worker queue

The diagram shows that, but confusingly says "is difference = 1". In
the original whiteboard this is 'if diff = DNE'. DNE stands for Does
Not Exist.

> - when the observer writes the new state for that resource it triggers the
> next resource create in the dependency tree.

Not the next resource create, but the next resource convergence. And not
just one either. I think one of the graphs was forgotten, it goes like
this:

https://www.dropbox.com/s/1h2ee151iriv4i1/resolve_graph.svg?dl=0

That is what we called "return happy" because we were at hour 9 or so of
talking and we got a bit punchy. I've renamed it 'resolve_graph'.

> - like any system that relies on notifications we need timeouts and each
> stack needs a periodic "notification" to make sure


This is, again, the continuous observer model.

https://review.openstack.org/#/c/100012/

>   that progress is been made or notify the user that no progress is being
> made.
> 
> One question about the observer (in either my setup or the one in the
> diagram).
> - If we are relying on rpc notifications all the observer processes will
> receive a copy of the same notification

Please read that spec. We talk about a filter.



Re: [openstack-dev] [Zaqar] Zaqar graduation (round 2) [was: Comments on the concerns arose during the TC meeting]

2014-09-10 Thread Clint Byrum
Excerpts from Gordon Sim's message of 2014-09-10 06:18:52 -0700:
> On 09/10/2014 09:58 AM, Flavio Percoco wrote:
> > To clarify the doubts of what Zaqar is or it's not, let me quote what's
> > written in the project's overview section[0]:
> >
> > "Zaqar is a multi-tenant cloud messaging service for web developers.
> 
> How are different tenants isolated from each other? Can different 
> tenants access the same queue? If so, what does Zaqar do to prevent one 
> tenant from negatively affecting the other? If not, how is communication 
> with other tenants achieved.
> 
> Most messaging systems allow authorisation to be used to restrict what a 
> particular user can access and quotas to restrict their resource 
> consumption. What does Zaqar do differently?
> 
> > It
> > combines the ideas pioneered by Amazon's SQS product with additional
> > semantics to support event broadcasting.
> >
> > The service features a fully RESTful API, which developers can use to
> > send messages between various components of their SaaS and mobile
> > applications, by using a variety of communication patterns. Underlying
> > this API is an efficient messaging engine designed with scalability and
> > security in mind.
> >
> > Other OpenStack components can integrate with Zaqar to surface events
> > to end users and to communicate with guest agents that run in the
> > "over-cloud" layer.
> 
> I may be misunderstanding the last sentence, but I think *direct* 
> integration of other OpenStack services with Zaqar would be a bad idea.
> 
> Wouldn't this be better done through olso.messaging's notifications in 
> some way? and/or through some standard protocol (and there's more than 
> one to choose from)?
> 

It's not direct, nobody is suggesting that.

What people are suggesting is that a user would be able to tell Nova
to put any messages it wants to deliver into a _user_ focused
queue/inbox.

This has nothing to do with oslo.messaging. Users don't want many options
for backends. They want a simple message-passing interface so they don't
have to choose and babysit one.

Certainly the "undercloud" Zaqar API could be based on the existing
oslo.messaging notifications. A simple daemon that sits between the oslo
notifications firehose and Zaqar's user queues would be quite efficient.
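
A minimal sketch of such a daemon, assuming the oslo.messaging notification
listener API and a Zaqar-like HTTP API; queue names, endpoints and
credentials are all illustrative:

import requests
from oslo.config import cfg
from oslo import messaging

ZAQAR = 'http://zaqar.example.org:8888/v1'    # hypothetical endpoint


class ZaqarForwarder(object):
    """Re-post each notification into the owning tenant's queue."""

    def info(self, ctxt, publisher_id, event_type, payload, metadata):
        tenant = payload.get('tenant_id', 'unknown')
        requests.post(
            '%s/queues/notify-%s/messages' % (ZAQAR, tenant),
            json=[{'ttl': 3600,
                   'body': {'event': event_type, 'data': payload}}],
            headers={'Client-ID': 'notification-bridge',
                     'X-Auth-Token': 'TOKEN'})    # placeholder token


def main():
    transport = messaging.get_transport(cfg.CONF)
    targets = [messaging.Target(topic='notifications')]
    listener = messaging.get_notification_listener(
        transport, targets, [ZaqarForwarder()])
    listener.start()
    listener.wait()


if __name__ == '__main__':
    main()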

However, putting the whole burden of talking directly to a notification
bus on the users is unnecessarily complex... especially if they use Java
and have no idea what oslo is.

> Communicating through a specific, fixed messaging system, with its own 
> unique protocol is actually a step backwards in my opinion, especially 
> for things that you want to keep as loosely coupled as possible. This is 
> exactly why various standard protocols emerged.
> 

You're thinking like an operator. Think like an application developer.
They're asking you "how do I subscribe to notifications about _just my
instances_ from Nova?", not "how do I pump 40,000 messages per second
through a message bus that I fully control?"



Re: [openstack-dev] [Zaqar] Zaqar graduation (round 2) [was: Comments on the concerns arose during the TC meeting]

2014-09-11 Thread Clint Byrum
Excerpts from Flavio Percoco's message of 2014-09-11 04:14:30 -0700:
> On 09/10/2014 03:45 PM, Gordon Sim wrote:
> > On 09/10/2014 01:51 PM, Thierry Carrez wrote:
> >> I think we do need, as Samuel puts it, "some sort of durable
> >> message-broker/queue-server thing". It's a basic application building
> >> block. Some claim it's THE basic application building block, more useful
> >> than database provisioning. It's definitely a layer above pure IaaS, so
> >> if we end up splitting OpenStack into layers this clearly won't be in
> >> the inner one. But I think "IaaS+" basic application building blocks
> >> belong in OpenStack one way or another. That's the reason I supported
> >> Designate ("everyone needs DNS") and Trove ("everyone needs DBs").
> >>
> >> With that said, I think yesterday there was a concern that Zaqar might
> >> not fill the "some sort of durable message-broker/queue-server thing"
> >> role well. The argument goes something like: if it was a queue-server
> >> then it should actually be built on top of Rabbit; if it was a
> >> message-broker it should be built on top of postfix/dovecot; the current
> >> architecture is only justified because it's something in between, so
> >> it's broken.
> > 
> > What is the distinction between a message broker and a queue server? To
> > me those terms both imply something broadly similar (message broker
> > perhaps being a little bit more generic). I could see Zaqar perhaps as
> > somewhere between messaging and data-storage.
> 
> I agree with Gordon here. I really don't know how to say this without
> creating more confusion. Zaqar is a messaging service. Messages are the
> most important entity in Zaqar. This, however, does not forbid anyone to
> use Zaqar as a queue. It has the required semantics, it guarantees FIFO
> and other queuing specific patterns. This doesn't mean Zaqar is trying
> to do something outside its scope, it comes for free.
> 

It comes with a huge cost actually, so saying it comes for free is a
misrepresentation. It is a side effect of developing a superset of
queueing. But that superset is only useful to a small number of your
stated use cases. Many of your use cases (including the one I've been
involved with, Heat pushing metadata to servers) are entirely served by
the much simpler, much lighter weight, pure queueing service.

> Is Zaqar being optimized as a *queuing* service? I'd say no. Our goal is
> to optimize Zaqar for delivering messages and supporting different
> messaging patterns.
> 

Awesome! Just please don't expect people to get excited about it for
the lighter weight queueing workloads that you've claimed as use cases.

I totally see Horizon using it to keep events for users. I see Heat
using it for stack events as well. I would bet that Trove would benefit
from being able to communicate messages to users.

But I think in between Zaqar and the backends will likely be a lighter
weight queue-only service that the users can just subscribe to when they
don't want an inbox. And I think that lighter weight queue service is
far more important for OpenStack than the full blown random access
inbox.

I think the reason such a thing has not appeared is because we were all
sort of running into "but Zaqar is already incubated". Now that we've
fleshed out the difference, I think those of us that need a lightweight
multi-tenant queue service should add it to OpenStack.  Separately. I hope
that doesn't offend you and the rest of the excellent Zaqar developers. It
is just a different thing.

> Should we remove all the semantics that allow people to use Zaqar as a
> queue service? I don't think so either. Again, the semantics are there
> because Zaqar is using them to do its job. Whether other folks may/may
> not use Zaqar as a queue service is out of our control.
> 
> This doesn't mean the project is broken.
> 

No, definitely not broken. It just isn't actually necessary for many of
the stated use cases.



Re: [openstack-dev] [Zaqar] Comments on the concerns arose during the TC meeting

2014-09-11 Thread Clint Byrum
Excerpts from Zane Bitter's message of 2014-09-11 15:21:26 -0700:
> On 09/09/14 19:56, Clint Byrum wrote:
> > Excerpts from Samuel Merritt's message of 2014-09-09 16:12:09 -0700:
> >> On 9/9/14, 12:03 PM, Monty Taylor wrote:
> >>> On 09/04/2014 01:30 AM, Clint Byrum wrote:
> >>>> Excerpts from Flavio Percoco's message of 2014-09-04 00:08:47 -0700:
> >>>>> Greetings,
> >>>>>
> >>>>> Last Tuesday the TC held the first graduation review for Zaqar. During
> >>>>> the meeting some concerns arose. I've listed those concerns below with
> >>>>> some comments hoping that it will help starting a discussion before the
> >>>>> next meeting. In addition, I've added some comments about the project
> >>>>> stability at the bottom and an etherpad link pointing to a list of use
> >>>>> cases for Zaqar.
> >>>>>
> >>>>
> >>>> Hi Flavio. This was an interesting read. As somebody whose attention has
> >>>> recently been drawn to Zaqar, I am quite interested in seeing it
> >>>> graduate.
> >>>>
> >>>>> # Concerns
> >>>>>
> >>>>> - Concern on operational burden of requiring NoSQL deploy expertise to
> >>>>> the mix of openstack operational skills
> >>>>>
> >>>>> For those of you not familiar with Zaqar, it currently supports 2 nosql
> >>>>> drivers - MongoDB and Redis - and those are the only 2 drivers it
> >>>>> supports for now. This will require operators willing to use Zaqar to
> >>>>> maintain a new (?) NoSQL technology in their system. Before expressing
> >>>>> our thoughts on this matter, let me say that:
> >>>>>
> >>>>>   1. By removing the SQLAlchemy driver, we basically removed the
> >>>>> chance
> >>>>> for operators to use an already deployed "OpenStack-technology"
> >>>>>   2. Zaqar won't be backed by any AMQP based messaging technology 
> >>>>> for
> >>>>> now. Here's[0] a summary of the research the team (mostly done by
> >>>>> Victoria) did during Juno
> >>>>>   3. We (OpenStack) used to require Redis for the zmq matchmaker
> >>>>>   4. We (OpenStack) also use memcached for caching and as the oslo
> >>>>> caching lib becomes available - or a wrapper on top of dogpile.cache -
> >>>>> Redis may be used in place of memcached in more and more deployments.
> >>>>>   5. Ceilometer's recommended storage driver is still MongoDB,
> >>>>> although
> >>>>> Ceilometer has now support for sqlalchemy. (Please correct me if I'm
> >>>>> wrong).
> >>>>>
> >>>>> That being said, it's obvious we already, to some extent, promote some
> >>>>> NoSQL technologies. However, for the sake of the discussion, lets assume
> >>>>> we don't.
> >>>>>
> >>>>> I truly believe, with my OpenStack (not Zaqar's) hat on, that we can't
> >>>>> keep avoiding these technologies. NoSQL technologies have been around
> >>>>> for years and we should be prepared - including OpenStack operators - to
> >>>>> support these technologies. Not every tool is good for all tasks - one
> >>>>> of the reasons we removed the sqlalchemy driver in the first place -
> >>>>> therefore it's impossible to keep an homogeneous environment for all
> >>>>> services.
> >>>>>
> >>>>
> >>>> I whole heartedly agree that non traditional storage technologies that
> >>>> are becoming mainstream are good candidates for use cases where SQL
> >>>> based storage gets in the way. I wish there wasn't so much FUD
> >>>> (warranted or not) about MongoDB, but that is the reality we live in.
> >>>>
> >>>>> With this, I'm not suggesting to ignore the risks and the extra burden
> >>>>> this adds but, instead of attempting to avoid it completely by not
> >>>>> evolving the stack of services we provide, we should probably work on
> >>>>> defining a reasonable subset of NoSQL services we are OK with
> >>>>> supporting. This will help making the burden 

Re: [openstack-dev] [Zaqar] Zaqar graduation (round 2) [was: Comments on the concerns arose during the TC meeting]

2014-09-12 Thread Clint Byrum
Excerpts from Thierry Carrez's message of 2014-09-12 02:16:42 -0700:
> Clint Byrum wrote:
> > Excerpts from Flavio Percoco's message of 2014-09-11 04:14:30 -0700:
> >> Is Zaqar being optimized as a *queuing* service? I'd say no. Our goal is
> >> to optimize Zaqar for delivering messages and supporting different
> >> messaging patterns.
> > 
> > Awesome! Just please don't expect people to get excited about it for
> > the lighter weight queueing workloads that you've claimed as use cases.
> > 
> > I totally see Horizon using it to keep events for users. I see Heat
> > using it for stack events as well. I would bet that Trove would benefit
> > from being able to communicate messages to users.
> > 
> > But I think in between Zaqar and the backends will likely be a lighter
> > weight queue-only service that the users can just subscribe to when they
> > don't want an inbox. And I think that lighter weight queue service is
> > far more important for OpenStack than the full blown random access
> > inbox.
> > 
> > I think the reason such a thing has not appeared is because we were all
> > sort of running into "but Zaqar is already incubated". Now that we've
> > fleshed out the difference, I think those of us that need a lightweight
> > multi-tenant queue service should add it to OpenStack.  Separately. I hope
> > that doesn't offend you and the rest of the excellent Zaqar developers. It
> > is just a different thing.
> > 
> >> Should we remove all the semantics that allow people to use Zaqar as a
> >> queue service? I don't think so either. Again, the semantics are there
> >> because Zaqar is using them to do its job. Whether other folks may/may
> >> not use Zaqar as a queue service is out of our control.
> >>
> >> This doesn't mean the project is broken.
> > 
> > No, definitely not broken. It just isn't actually necessary for many of
> > the stated use cases.
> 
> Clint,
> 
> If I read you correctly, you're basically saying the Zaqar is overkill
> for a lot of people who only want a multi-tenant queue service. It's
> doing A+B. Why does that prevent people who only need A from using it ?
> 
> Is it that it's actually not doing A well, from a user perspective ?
> Like the performance sucks, or it's missing a key primitive ?
> 
> Is it that it's unnecessarily complex to deploy, from a deployer
> perspective, and that something only doing A would be simpler, while
> covering most of the use cases?
> 
> Is it something else ?
> 
> I want to make sure I understand your objection. In the "user
> perspective" it might make sense to pursue both options as separate
> projects. In the "deployer perspective" case, having a project doing A+B
> and a project doing A doesn't solve anything. So this affects the
> decision we have to take next Tuesday...

I believe that Zaqar does two things: inbox semantics and queue
semantics. I believe the queueing is a side-effect of needing some kind
of queue to enable users to store and subscribe to messages in the
inbox.

What I'd rather see is an API for queueing, and an API for inboxes
which integrates well with the queueing API. For instance, if a user
says "give me an inbox" I think Zaqar should return a queue handle for
sending into the inbox the same way Nova gives you a Neutron port if
you don't give it one. You might also ask for a queue to receive push
messages from the inbox. Point being, the queues are not the inbox,
and the inbox is not the queues.

However, if I just want a queue, just give me a queue. Don't store my
messages in a randomly addressable space, and don't saddle the deployer
with the burden of such storage. Put the queue API in front of a scalable
message queue and give me a nice simple HTTP API. Users would likely be
thrilled. Heat, Nova, Ceilometer, probably Trove and Sahara, could all
make use of just this. Only Horizon seems to need a place to keep the
messages around while users inspect them.
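
Neither of these APIs exists; this is purely a sketch to make the proposed
split concrete, and every name and endpoint in it is invented:

import requests

API = 'http://messaging.example.org/v1'    # invented endpoint
HDRS = {'X-Auth-Token': 'TOKEN'}           # placeholder

# Queue API: post, claim, ack. Nothing is kept once the consumer acks, and
# nothing is addressable by ID.
requests.post(API + '/queues/jobs/messages',
              json=[{'body': {'task': 'resize', 'instance': 42}}],
              headers=HDRS)

# Inbox API: asking for an inbox hands back a queue for feeding it, the same
# way asking Nova for a server hands you a Neutron port. The inbox is not
# the queue, and the queue is not the inbox.
inbox = requests.post(API + '/inboxes',
                      json={'name': 'user-events'}, headers=HDRS).json()
feed = inbox['feed_queue']    # write side: plain queue semantics

# Read side: durable, randomly addressable messages.
msg = requests.get(API + '/inboxes/user-events/messages/1234',
                   headers=HDRS).json()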

Whether that is two projects or one, separating the two APIs, and thus
two very different types of backends, is something I think will lead to
more deployers wanting to deploy both, so that they can bill usage
appropriately and so that their users can choose wisely.



Re: [openstack-dev] [Zaqar] Zaqar graduation (round 2) [was: Comments on the concerns arose during the TC meeting]

2014-09-12 Thread Clint Byrum
Excerpts from Mark McLoughlin's message of 2014-09-12 03:27:42 -0700:
> On Wed, 2014-09-10 at 14:51 +0200, Thierry Carrez wrote:
> > Flavio Percoco wrote:
> > > [...]
> > > Based on the feedback from the meeting[3], the current main concern is:
> > > 
> > > - Do we need a messaging service with a feature-set akin to SQS+SNS?
> > > [...]
> > 
> > I think we do need, as Samuel puts it, "some sort of durable
> > message-broker/queue-server thing". It's a basic application building
> > block. Some claim it's THE basic application building block, more useful
> > than database provisioning. It's definitely a layer above pure IaaS, so
> > if we end up splitting OpenStack into layers this clearly won't be in
> > the inner one. But I think "IaaS+" basic application building blocks
> > belong in OpenStack one way or another. That's the reason I supported
> > Designate ("everyone needs DNS") and Trove ("everyone needs DBs").
> > 
> > With that said, I think yesterday there was a concern that Zaqar might
> > not fill the "some sort of durable message-broker/queue-server thing"
> > role well. The argument goes something like: if it was a queue-server
> > then it should actually be built on top of Rabbit; if it was a
> > message-broker it should be built on top of postfix/dovecot; the current
> > architecture is only justified because it's something in between, so
> > it's broken.
> > 
> > I guess I don't mind that much zaqar being "something in between":
> > unless I misunderstood, exposing extra primitives doesn't prevent the
> > "queue-server" use case from being filled. Even considering the
> > message-broker case, I'm also not convinced building it on top of
> > postfix/dovecot would be a net win compared to building it on top of
> > Redis, to be honest.
> 
> AFAICT, this part of the debate boils down to the following argument:
> 
>   If Zaqar implemented messaging-as-a-service with only queuing 
>   semantics (and no random access semantics), it's design would 
>   naturally be dramatically different and "simply" implement a 
>   multi-tenant REST API in front of AMQP queues like this:
> 
> https://www.dropbox.com/s/yonloa9ytlf8fdh/ZaqarQueueOnly.png?dl=0
> 
>   and that this architecture would allow for dramatically improved 
>   throughput for end-users while not making the cost of providing the 
>   service prohibitive to operators.
> 
> You can't dismiss that argument out-of-hand, but I wonder (a) whether
> the claimed performance improvement is going to make a dramatic
> difference to the SQS-like use case and (b) whether backing this thing
> with an RDBMS and multiple highly available, durable AMQP broker
> clusters is going to be too much of a burden on operators for whatever
> performance improvements it does gain.

Having had experience taking queue-only data out of RDBMSes and even SMTP
solutions, and putting them into queues, I can say that it was generally
quite a bit more reliable and cheaper to maintain.

However, as I've been thinking about this more, I am concerned about the
complexity of trying to use a stateless protocol like HTTP for reliable
delivery, given that these queues all use a session model that relies
on connection persistence. That may very well invalidate my hypothesis.

> 
> But the troubling part of this debate is where we repeatedly batter the
> Zaqar team with hypotheses like these and appear to only barely
> entertain their carefully considered justification for their design
> decisions like:
> 
>   
> https://wiki.openstack.org/wiki/Frequently_asked_questions_%28Zaqar%29#Is_Zaqar_a_provisioning_service_or_a_data_API.3F
>   
> https://wiki.openstack.org/wiki/Frequently_asked_questions_%28Zaqar%29#What_messaging_patterns_does_Zaqar_support.3F
> 
> I would like to see an SQS-like API provided by OpenStack, I accept the
> reasons for Zaqar's design decisions to date, I respect that those
> decisions were made carefully by highly competent members of our
> community and I expect Zaqar to evolve (like all projects) in the years
> ahead based on more real-world feedback, new hypotheses or ideas, and
> lessons learned from trying things out.

I have read those, and I truly believe that the Zaqar team, who are
already a valuable part of the OpenStack family, are doing good work.
Seriously, I believe it is valuable as is and I trust them to do what
they have stated they will do.

Let me explain my position again. Heat is in dire need of an efficient
way to communicate with instances. It has no need for a full messaging
stack, just a way for users to have things pushed from Heat
to their instances efficiently.

So, to reiterate why I keep going on about this: If a messaging service
is to become an integrated part of OpenStack's release, we should think
carefully about the ramifications for operators _and_ users of not
having a light weight queue-only option, when that seems to fit _most_
of the use cases.


Re: [openstack-dev] [Zaqar] Comments on the concerns arose during the TC meeting

2014-09-12 Thread Clint Byrum
Excerpts from Flavio Percoco's message of 2014-09-12 00:22:35 -0700:
> On 09/12/2014 03:29 AM, Clint Byrum wrote:
> > Excerpts from Zane Bitter's message of 2014-09-11 15:21:26 -0700:
> >> On 09/09/14 19:56, Clint Byrum wrote:
> >>> Excerpts from Samuel Merritt's message of 2014-09-09 16:12:09 -0700:
> >>>> On 9/9/14, 12:03 PM, Monty Taylor wrote:
> >>>>> On 09/04/2014 01:30 AM, Clint Byrum wrote:
> >>>>>> Excerpts from Flavio Percoco's message of 2014-09-04 00:08:47 -0700:
> >>>>>>> Greetings,
> >>>>>>>
> >>>>>>> Last Tuesday the TC held the first graduation review for Zaqar. During
> >>>>>>> the meeting some concerns arose. I've listed those concerns below with
> >>>>>>> some comments hoping that it will help starting a discussion before 
> >>>>>>> the
> >>>>>>> next meeting. In addition, I've added some comments about the project
> >>>>>>> stability at the bottom and an etherpad link pointing to a list of use
> >>>>>>> cases for Zaqar.
> >>>>>>>
> >>>>>>
> >>>>>> Hi Flavio. This was an interesting read. As somebody whose attention 
> >>>>>> has
> >>>>>> recently been drawn to Zaqar, I am quite interested in seeing it
> >>>>>> graduate.
> >>>>>>
> >>>>>>> # Concerns
> >>>>>>>
> >>>>>>> - Concern on operational burden of requiring NoSQL deploy expertise to
> >>>>>>> the mix of openstack operational skills
> >>>>>>>
> >>>>>>> For those of you not familiar with Zaqar, it currently supports 2 
> >>>>>>> nosql
> >>>>>>> drivers - MongoDB and Redis - and those are the only 2 drivers it
> >>>>>>> supports for now. This will require operators willing to use Zaqar to
> >>>>>>> maintain a new (?) NoSQL technology in their system. Before expressing
> >>>>>>> our thoughts on this matter, let me say that:
> >>>>>>>
> >>>>>>>   1. By removing the SQLAlchemy driver, we basically removed the
> >>>>>>> chance
> >>>>>>> for operators to use an already deployed "OpenStack-technology"
> >>>>>>>   2. Zaqar won't be backed by any AMQP based messaging technology 
> >>>>>>> for
> >>>>>>> now. Here's[0] a summary of the research the team (mostly done by
> >>>>>>> Victoria) did during Juno
> >>>>>>>   3. We (OpenStack) used to require Redis for the zmq matchmaker
> >>>>>>>   4. We (OpenStack) also use memcached for caching and as the oslo
> >>>>>>> caching lib becomes available - or a wrapper on top of dogpile.cache -
> >>>>>>> Redis may be used in place of memcached in more and more deployments.
> >>>>>>>   5. Ceilometer's recommended storage driver is still MongoDB,
> >>>>>>> although
> >>>>>>> Ceilometer has now support for sqlalchemy. (Please correct me if I'm
> >>>>>>> wrong).
> >>>>>>>
> >>>>>>> That being said, it's obvious we already, to some extent, promote some
> >>>>>>> NoSQL technologies. However, for the sake of the discussion, lets 
> >>>>>>> assume
> >>>>>>> we don't.
> >>>>>>>
> >>>>>>> I truly believe, with my OpenStack (not Zaqar's) hat on, that we can't
> >>>>>>> keep avoiding these technologies. NoSQL technologies have been around
> >>>>>>> for years and we should be prepared - including OpenStack operators - 
> >>>>>>> to
> >>>>>>> support these technologies. Not every tool is good for all tasks - one
> >>>>>>> of the reasons we removed the sqlalchemy driver in the first place -
> >>>>>>> therefore it's impossible to keep an homogeneous environment for all
> >>>>>>> services.
> >>>>>>>
> >>>>>>
> >>>>>> I whole heartedly agree tha

Re: [openstack-dev] [Heat] Defining what is a SupportStatus version

2014-09-14 Thread Clint Byrum
Excerpts from Gauvain Pocentek's message of 2014-09-04 22:29:05 -0700:
> Hi,
> 
> A bit of background: I'm working on the publication of the HOT 
> resources reference on docs.openstack.org. This book is mostly 
> autogenerated from the heat source code, using the sphinx XML output. To 
> avoid publishing several references (one per released version, as is 
> done for the OpenStack config-reference), I'd like to add information 
> about the support status of each resource (when they appeared, when 
> they've been deprecated, and so on).
> 
> So the plan is to use the SupportStatus class and its `version` 
> attribute (see https://review.openstack.org/#/c/116443/ ). And the 
> question is, what information should the version attribute hold? 
> Possibilities include the release code name (Icehouse, Juno), or the 
> release version (2014.1, 2014.2). But this wouldn't be useful for users 
> of clouds continuously deployed.
> 
>  From my documenter point of view, using the code name seems the right 
> option, because it fits with the rest of the documentation.
> 
> What do you think would be the best choice from the heat devs POV?

What we ship in-tree is the standard library for Heat. I think Heat
should not tie things to the release of OpenStack, but only to itself.

The idea is to simply version the standard library of resources separately
even from the language. Added resources and properties would be minor
bumps, deprecating or removing anything would be a major bump. Users then
just need an API call that allows querying the standard library version.

With this scheme, we can provide a gate test that prevents breaking the
rules, and automatically generate the docs still. Doing this would sync
better with continuous deployers who will be running "Juno" well before
there is a "2014.2".
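
As a purely hypothetical sketch of what that could look like in a resource
plugin, assuming the SupportStatus 'version' attribute from the review
above; the version string names the resource library, not an OpenStack
release:

from heat.engine import resource
from heat.engine import support


class ExampleResource(resource.Resource):
    """A resource that appeared in standard library 1.2.0."""

    # Under the scheme described here, adding resources or properties is a
    # minor bump and removing or deprecating anything is a major bump.
    support_status = support.SupportStatus(version='1.2.0')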

Anyway, Heat largely exists to support portability of apps between
OpenStack clouds. Many many OpenStack clouds don't run one release,
and we don't require them to do so. So tying to the release is, IMO,
a poor coice. We do the same thing with HOT's internals, so why not also
do the standard library this way?



Re: [openstack-dev] [tripleo][heat][ironic] Heat Ironic resources and "ready state" orchestration

2014-09-15 Thread Clint Byrum
Excerpts from Steven Hardy's message of 2014-09-15 04:44:24 -0700:
> All,
> 
> Starting this thread as a follow-up to a strongly negative reaction by the
> Ironic PTL to my patches[1] adding initial Heat->Ironic integration, and
> subsequent very detailed justification and discussion of why they may be
> useful in this spec[2].
> 
> Back in Atlanta, I had some discussions with folks interesting in making
> "ready state"[3] preparation of bare-metal resources possible when
> deploying bare-metal nodes via TripleO/Heat/Ironic.
> 
> The initial assumption is that there is some discovery step (either
> automatic or static generation of a manifest of nodes), that can be input
> to either Ironic or Heat.
> 
> Following discovery, but before an undercloud deploying OpenStack onto the
> nodes, there are a few steps which may be desired, to get the hardware into
> a state where it's ready and fully optimized for the subsequent deployment:
> 
> - Updating and aligning firmware to meet requirements of qualification or
>   site policy
> - Optimization of BIOS configuration to match workloads the node is
>   expected to run
> - Management of machine-local storage, e.g configuring local RAID for
>   optimal resilience or performance.
> 
> Interfaces to Ironic are landing (or have landed)[4][5][6] which make many
> of these steps possible, but there's no easy way to either encapsulate the
> (currently mostly vendor specific) data associated with each step, or to
> coordinate sequencing of the steps.
> 

First, Ironic is hidden under Nova as far as TripleO is concerned. So
mucking with the servers underneath Nova during deployment is a difficult
proposition. Would I look up the Ironic node ID of the nova server,
and then optimize it for the workload after the workload arrived? Why
wouldn't I just do that optimization before the deployment?

> What is required is some tool to take a text definition of the required
> configuration, turn it into a correctly sequenced series of API calls to
> Ironic, expose any data associated with those API calls, and declare
> success or failure on completion.  This is what Heat does.
> 

I'd rather see Ironic define or adopt a narrow scope document format
that it can consume for bulk loading. Heat is extremely generic, and thus
carries a ton of complexity for what is probably doable with a CSV file.
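
For the sake of argument, a rough sketch of bulk loading from a CSV with
python-ironicclient; the CSV columns, driver and credentials are
assumptions for illustration, not a proposal for the actual format:

import csv

from ironicclient import client

ironic = client.get_client(
    1,
    os_username='admin',
    os_password='PASSWORD',    # placeholders
    os_tenant_name='admin',
    os_auth_url='http://keystone.example.org:5000/v2.0')

# nodes.csv: mac,ipmi_address,ipmi_username,ipmi_password,cpus,memory_mb,local_gb
with open('nodes.csv') as f:
    for row in csv.DictReader(f):
        node = ironic.node.create(
            driver='pxe_ipmitool',
            driver_info={'ipmi_address': row['ipmi_address'],
                         'ipmi_username': row['ipmi_username'],
                         'ipmi_password': row['ipmi_password']},
            properties={'cpus': row['cpus'],
                        'memory_mb': row['memory_mb'],
                        'local_gb': row['local_gb']})
        ironic.port.create(node_uuid=node.uuid, address=row['mac'])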

> So the idea is to create some basic (contrib, disabled by default) Ironic
> heat resources, then explore the idea of orchestrating ready-state
> configuration via Heat.
> 
> Given that Devananda and I have been banging heads over this for some time
> now, I'd like to get broader feedback of the idea, my interpretation of
> "ready state" applied to the tripleo undercloud, and any alternative
> implementation ideas.
> 

I think there may be value in being able to tie Ironic calls to other
OpenStack API calls. I'm dubious that this is an important idea, but
I think if somebody wants to step forward with their use case for it,
then the resources might make sense. However, I realy don't see the
_enrollment_ phase as capturing any value that isn't entirely offset by
added complexity.



Re: [openstack-dev] [tripleo][heat][ironic] Heat Ironic resources and "ready state" orchestration

2014-09-15 Thread Clint Byrum
Excerpts from James Slagle's message of 2014-09-15 08:15:21 -0700:
> On Mon, Sep 15, 2014 at 7:44 AM, Steven Hardy  wrote:
> > All,
> >
> > Starting this thread as a follow-up to a strongly negative reaction by the
> > Ironic PTL to my patches[1] adding initial Heat->Ironic integration, and
> > subsequent very detailed justification and discussion of why they may be
> > useful in this spec[2].
> >
> > Back in Atlanta, I had some discussions with folks interesting in making
> > "ready state"[3] preparation of bare-metal resources possible when
> > deploying bare-metal nodes via TripleO/Heat/Ironic.
> 
> After a cursory reading of the references, it seems there's a couple of 
> issues:
> - are the features to move hardware to a "ready-state" even going to
> be in Ironic proper, whether that means in ironic at all or just in
> contrib.
> - assuming some of the features are there, should Heat have any Ironic
> resources given that Ironic's API is admin-only.
> 
> >
> > The initial assumption is that there is some discovery step (either
> > automatic or static generation of a manifest of nodes), that can be input
> > to either Ironic or Heat.
> 
> I think it makes a lot of sense to use Heat to do the bulk
> registration of nodes via Ironic. I understand the argument that the
> Ironic API should be "admin-only" a little bit for the non-TripleO
> case, but for TripleO, we only have admins interfacing with the
> Undercloud. The user of a TripleO undercloud is the deployer/operator
> and in some scenarios this may not be the undercloud admin. So,
> talking about TripleO, I don't really buy that the Ironic API is
> admin-only.
> 
> Therefore, why not have some declarative Heat resources for things
> like Ironic nodes, that the deployer can make use of in a Heat
> template to do bulk node registration?
> 
> The alternative listed in the spec:
> 
> "Don’t implement the resources and rely on scripts which directly
> interact with the Ironic API, prior to any orchestration via Heat."
> 
> would just be a bit silly IMO. That goes against one of the main
> drivers of TripleO, which is to use OpenStack wherever possible. Why
> go off and write some other thing that is going to parse a
> json/yaml/csv of nodes and orchestrate a bunch of Ironic api calls?
> Why would it be ok for that other thing to use Ironic's "admin-only"
> API yet claim it's not ok for Heat on the undercloud to do so?
> 

An alternative that was missed is to just define a bulk loading format
for hardware, or adopt an existing one (I find it hard to believe there
isn't already an open format for this), and make use of it in Ironic.

The analogy I'd use is shipping dry goods in a refrigerated truck.
It's heavier, has a bit less capacity, and carries unnecessary features. If all
you have is the refrigerated truck, ok. But we're talking about _building_
a special dry-goods add-on to our refrigerated truck (Heat) to avoid
building the same thing into the regular trucks we already have (Ironic).

> > Following discovery, but before an undercloud deploying OpenStack onto the
> > nodes, there are a few steps which may be desired, to get the hardware into
> > a state where it's ready and fully optimized for the subsequent deployment:
> >
> > - Updating and aligning firmware to meet requirements of qualification or
> >   site policy
> > - Optimization of BIOS configuration to match workloads the node is
> >   expected to run
> > - Management of machine-local storage, e.g configuring local RAID for
> >   optimal resilience or performance.
> >
> > Interfaces to Ironic are landing (or have landed)[4][5][6] which make many
> > of these steps possible, but there's no easy way to either encapsulate the
> > (currently mostly vendor specific) data associated with each step, or to
> > coordinate sequencing of the steps.
> >
> > What is required is some tool to take a text definition of the required
> > configuration, turn it into a correctly sequenced series of API calls to
> > Ironic, expose any data associated with those API calls, and declare
> > success or failure on completion.  This is what Heat does.
> >
> > So the idea is to create some basic (contrib, disabled by default) Ironic
> > heat resources, then explore the idea of orchestrating ready-state
> > configuration via Heat.
> >
> > Given that Devananda and I have been banging heads over this for some time
> > now, I'd like to get broader feedback of the idea, my interpretation of
> > "ready state" applied to the tripleo undercloud, and any alternative
> > implementation ideas.
> 
> My opinion is that if the features are in Ironic, they should be
> exposed via Heat resources for orchestration. If the TripleO case is
> too much of a one-off (which I don't really think it is), then sure,
> keep it all in contrib so that no one gets confused about why the
> resources are there.
> 

And I think if this is a common thing that Ironic users need to do,
then Ironic should do it, not Heat.


Re: [openstack-dev] [Zaqar] Zaqar graduation (round 2) [was: Comments on the concerns arose during the TC meeting]

2014-09-15 Thread Clint Byrum
Excerpts from Flavio Percoco's message of 2014-09-15 00:57:05 -0700:
> On 09/12/2014 07:13 PM, Clint Byrum wrote:
> > Excerpts from Thierry Carrez's message of 2014-09-12 02:16:42 -0700:
> >> Clint Byrum wrote:
> >>> Excerpts from Flavio Percoco's message of 2014-09-11 04:14:30 -0700:
> >>>> Is Zaqar being optimized as a *queuing* service? I'd say no. Our goal is
> >>>> to optimize Zaqar for delivering messages and supporting different
> >>>> messaging patterns.
> >>>
> >>> Awesome! Just please don't expect people to get excited about it for
> >>> the lighter weight queueing workloads that you've claimed as use cases.
> >>>
> >>> I totally see Horizon using it to keep events for users. I see Heat
> >>> using it for stack events as well. I would bet that Trove would benefit
> >>> from being able to communicate messages to users.
> >>>
> >>> But I think in between Zaqar and the backends will likely be a lighter
> >>> weight queue-only service that the users can just subscribe to when they
> >>> don't want an inbox. And I think that lighter weight queue service is
> >>> far more important for OpenStack than the full blown random access
> >>> inbox.
> >>>
> >>> I think the reason such a thing has not appeared is because we were all
> >>> sort of running into "but Zaqar is already incubated". Now that we've
> >>> fleshed out the difference, I think those of us that need a lightweight
> >>> multi-tenant queue service should add it to OpenStack.  Separately. I hope
> >>> that doesn't offend you and the rest of the excellent Zaqar developers. It
> >>> is just a different thing.
> >>>
> >>>> Should we remove all the semantics that allow people to use Zaqar as a
> >>>> queue service? I don't think so either. Again, the semantics are there
> >>>> because Zaqar is using them to do its job. Whether other folks may/may
> >>>> not use Zaqar as a queue service is out of our control.
> >>>>
> >>>> This doesn't mean the project is broken.
> >>>
> >>> No, definitely not broken. It just isn't actually necessary for many of
> >>> the stated use cases.
> >>
> >> Clint,
> >>
> >> If I read you correctly, you're basically saying the Zaqar is overkill
> >> for a lot of people who only want a multi-tenant queue service. It's
> >> doing A+B. Why does that prevent people who only need A from using it ?
> >>
> >> Is it that it's actually not doing A well, from a user perspective ?
> >> Like the performance sucks, or it's missing a key primitive ?
> >>
> >> Is it that it's unnecessarily complex to deploy, from a deployer
> >> perspective, and that something only doing A would be simpler, while
> >> covering most of the use cases?
> >>
> >> Is it something else ?
> >>
> >> I want to make sure I understand your objection. In the "user
> >> perspective" it might make sense to pursue both options as separate
> >> projects. In the "deployer perspective" case, having a project doing A+B
> >> and a project doing A doesn't solve anything. So this affects the
> >> decision we have to take next Tuesday...
> > 
> > I believe that Zaqar does two things, inbox semantics, and queue
> > semantics. I believe the queueing is a side-effect of needing some kind
> > of queue to enable users to store and subscribe to messages in the
> > inbox.
> > 
> > What I'd rather see is an API for queueing, and an API for inboxes
> > which integrates well with the queueing API. For instance, if a user
> > says "give me an inbox" I think Zaqar should return a queue handle for
> > sending into the inbox the same way Nova gives you a Neutron port if
> > you don't give it one. You might also ask for a queue to receive push
> > messages from the inbox. Point being, the queues are not the inbox,
> > and the inbox is not the queues.
> > 
> > However, if I just want a queue, just give me a queue. Don't store my
> > messages in a randomly addressable space, and don't saddle the deployer
> > with the burden of such storage. Put the queue API in front of a scalable
> > message queue and give me a nice simple HTTP API. Users would likely be
> > thrilled. Heat, Nova, Ceilometer, probab

Re: [openstack-dev] [tripleo][heat][ironic] Heat Ironic resources and "ready state" orchestration

2014-09-15 Thread Clint Byrum
Excerpts from Steven Hardy's message of 2014-09-15 10:10:05 -0700:
> On Mon, Sep 15, 2014 at 09:50:24AM -0700, Clint Byrum wrote:
> > Excerpts from Steven Hardy's message of 2014-09-15 04:44:24 -0700:
> > > All,
> > > 
> > > Starting this thread as a follow-up to a strongly negative reaction by the
> > > Ironic PTL to my patches[1] adding initial Heat->Ironic integration, and
> > > subsequent very detailed justification and discussion of why they may be
> > > useful in this spec[2].
> > > 
> > > Back in Atlanta, I had some discussions with folks interested in making
> > > "ready state"[3] preparation of bare-metal resources possible when
> > > deploying bare-metal nodes via TripleO/Heat/Ironic.
> > > 
> > > The initial assumption is that there is some discovery step (either
> > > automatic or static generation of a manifest of nodes), that can be input
> > > to either Ironic or Heat.
> > > 
> > > Following discovery, but before an undercloud deploying OpenStack onto the
> > > nodes, there are a few steps which may be desired, to get the hardware into
> > > a state where it's ready and fully optimized for the subsequent deployment:
> > > 
> > > - Updating and aligning firmware to meet requirements of qualification or
> > >   site policy
> > > - Optimization of BIOS configuration to match workloads the node is
> > >   expected to run
> > > - Management of machine-local storage, e.g configuring local RAID for
> > >   optimal resilience or performance.
> > > 
> > > Interfaces to Ironic are landing (or have landed)[4][5][6] which make many
> > > of these steps possible, but there's no easy way to either encapsulate the
> > > (currently mostly vendor specific) data associated with each step, or to
> > > coordinate sequencing of the steps.
> > > 
> > 
> > First, Ironic is hidden under Nova as far as TripleO is concerned. So
> > mucking with the servers underneath Nova during deployment is a difficult
> > proposition. Would I look up the Ironic node ID of the nova server,
> > and then optimize it for the workload after the workload arrived? Why
> > wouldn't I just do that optimization before the deployment?
> 
> That's exactly what I'm proposing - a series of preparatory steps performed
> before the node is visible to nova, before the deployment.
> 

Ok good, so I didn't misunderstand. I'm having trouble seeing where Heat
is a good fit there.

> The whole point is that Ironic is hidden under nova, and provides no way to
> perform these pre-deploy steps via interaction with nova.
> 
> > 
> > > What is required is some tool to take a text definition of the required
> > > configuration, turn it into a correctly sequenced series of API calls to
> > > Ironic, expose any data associated with those API calls, and declare
> > > success or failure on completion.  This is what Heat does.
> > > 
> > 
> > I'd rather see Ironic define or adopt a narrow scope document format
> > that it can consume for bulk loading. Heat is extremely generic, and thus
> > carries a ton of complexity for what is probably doable with a CSV file.
> 
> Perhaps you can read the spec - it's not really about the bulk-load part,
> it's about orchestrating the steps to prepare the node, after it's
> registered with Ironic, but before it's ready to have the stuff deployed to
> it.
> 

Sounds like workflow to me. :-P

> What tool do you think will "just do that optimization before the
> deployment"? (snark not intended, I genuinely want to know, is it scripts
> in TripleO, some sysadmin pre-deploy steps, magic in Ironic?)
>

If it can all be done by calls to the ironic client with the node ID and
parameters from the user, I'd suggest that this is a simple workflow
and can be done in the step prior to 'heat stack-create'. I don't see
any reason to keep a bunch of records around in Heat to describe what
happened, identically, for Ironic nodes. It is an ephemeral step in the
evolution of the system, not something we need to edit on a regular basis.
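
To sketch what I mean by a simple workflow (the prepare_* helpers below
are hypothetical placeholders for whatever vendor-specific Ironic
interactions apply, not real ironicclient calls):

import sys

def prepare_firmware(node_id, params):
    # placeholder: vendor-specific firmware alignment for this node
    print('aligning firmware on %s' % node_id)

def prepare_bios(node_id, params):
    # placeholder: BIOS settings matched to the expected workload
    print('applying BIOS config on %s' % node_id)

def prepare_raid(node_id, params):
    # placeholder: local RAID layout for resilience/performance
    print('configuring RAID on %s' % node_id)

STEPS = [prepare_firmware, prepare_bios, prepare_raid]

def ready_state(node_ids, params):
    # Run the steps in order on each node; declare success or failure.
    for node_id in node_ids:
        for step in STEPS:
            try:
                step(node_id, params)
            except Exception as e:
                print('node %s failed at %s: %s'
                      % (node_id, step.__name__, e))
                return 1
    return 0

if __name__ == '__main__':
    sys.exit(ready_state(sys.argv[1:], {}))

That is the scale of tool I have in mind, run once before the stack
exists.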

My new bar for whether something is a good fit for Heat is what happens
to my workload when I update it. If I go into my Ironic pre-registration
stack and change things around, the likely case is that my box reboots
to re-apply BIOS updates with the new parameters. And there is a missing
dependency expression when using the orchestration tool to do the
workflow job. It may actually be necessary to always do these things to
the 

Re: [openstack-dev] [Zaqar] Zaqar graduation (round 2) [was: Comments on the concerns arose during the TC meeting]

2014-09-15 Thread Clint Byrum
Excerpts from Zane Bitter's message of 2014-09-15 12:05:09 -0700:
> On 15/09/14 13:28, Clint Byrum wrote:
> > Excerpts from Flavio Percoco's message of 2014-09-15 00:57:05 -0700:
> >> On 09/12/2014 07:13 PM, Clint Byrum wrote:
> >>> Excerpts from Thierry Carrez's message of 2014-09-12 02:16:42 -0700:
> >>>> Clint Byrum wrote:
> >>>>> Excerpts from Flavio Percoco's message of 2014-09-11 04:14:30 -0700:
> >>>>>> Is Zaqar being optimized as a *queuing* service? I'd say no. Our goal is
> >>>>>> to optimize Zaqar for delivering messages and supporting different
> >>>>>> messaging patterns.
> >>>>>
> >>>>> Awesome! Just please don't expect people to get excited about it for
> >>>>> the lighter weight queueing workloads that you've claimed as use cases.
> >>>>>
> >>>>> I totally see Horizon using it to keep events for users. I see Heat
> >>>>> using it for stack events as well. I would bet that Trove would benefit
> >>>>> from being able to communicate messages to users.
> >>>>>
> >>>>> But I think in between Zaqar and the backends will likely be a lighter
> >>>>> weight queue-only service that the users can just subscribe to when they
> >>>>> don't want an inbox. And I think that lighter weight queue service is
> >>>>> far more important for OpenStack than the full blown random access
> >>>>> inbox.
> >>>>>
> >>>>> I think the reason such a thing has not appeared is because we were all
> >>>>> sort of running into "but Zaqar is already incubated". Now that we've
> >>>>> fleshed out the difference, I think those of us that need a lightweight
> >>>>> multi-tenant queue service should add it to OpenStack.  Separately. I hope
> >>>>> that doesn't offend you and the rest of the excellent Zaqar developers. It
> >>>>> is just a different thing.
> >>>>>
> >>>>>> Should we remove all the semantics that allow people to use Zaqar as a
> >>>>>> queue service? I don't think so either. Again, the semantics are there
> >>>>>> because Zaqar is using them to do its job. Whether other folks may/may
> >>>>>> not use Zaqar as a queue service is out of our control.
> >>>>>>
> >>>>>> This doesn't mean the project is broken.
> >>>>>
> >>>>> No, definitely not broken. It just isn't actually necessary for many of
> >>>>> the stated use cases.
> >>>>
> >>>> Clint,
> >>>>
> >>>> If I read you correctly, you're basically saying the Zaqar is overkill
> >>>> for a lot of people who only want a multi-tenant queue service. It's
> >>>> doing A+B. Why does that prevent people who only need A from using it ?
> >>>>
> >>>> Is it that it's actually not doing A well, from a user perspective ?
> >>>> Like the performance sucks, or it's missing a key primitive ?
> >>>>
> >>>> Is it that it's unnecessarily complex to deploy, from a deployer
> >>>> perspective, and that something only doing A would be simpler, while
> >>>> covering most of the use cases?
> >>>>
> >>>> Is it something else ?
> >>>>
> >>>> I want to make sure I understand your objection. In the "user
> >>>> perspective" it might make sense to pursue both options as separate
> >>>> projects. In the "deployer perspective" case, having a project doing A+B
> >>>> and a project doing A doesn't solve anything. So this affects the
> >>>> decision we have to take next Tuesday...
> >>>
> >>> I believe that Zaqar does two things, inbox semantics, and queue
> >>> semantics. I believe the queueing is a side-effect of needing some kind
> >>> of queue to enable users to store and subscribe to messages in the
> >>> inbox.
> >>>
> >>> What I'd rather see is an API for queueing, and an API for inboxes
> >>> which integrates well with the queueing API. For instance, if a user
> >>> says "give me an

Re: [openstack-dev] [Heat] Defining what is a SupportStatus version

2014-09-15 Thread Clint Byrum
Excerpts from Zane Bitter's message of 2014-09-15 09:31:33 -0700:
> On 14/09/14 11:09, Clint Byrum wrote:
> > Excerpts from Gauvain Pocentek's message of 2014-09-04 22:29:05 -0700:
> >> Hi,
> >>
> >> A bit of background: I'm working on the publication of the HOT
> >> resources reference on docs.openstack.org. This book is mostly
> >> autogenerated from the heat source code, using the sphinx XML output. To
> >> avoid publishing several references (one per released version, as is
> >> done for the OpenStack config-reference), I'd like to add information
> >> about the support status of each resource (when they appeared, when
> >> they've been deprecated, and so on).
> >>
> >> So the plan is to use the SupportStatus class and its `version`
> >> attribute (see https://review.openstack.org/#/c/116443/ ). And the
> >> question is, what information should the version attribute hold?
> >> Possibilities include the release code name (Icehouse, Juno), or the
> >> release version (2014.1, 2014.2). But this wouldn't be useful for users
> >> of clouds continuously deployed.
> >>
> >>   From my documenter point of view, using the code name seems the right
> >> option, because it fits with the rest of the documentation.
> >>
> >> What do you think would be the best choice from the heat devs POV?
> >
> > What we ship in-tree is the standard library for Heat. I think Heat
> > should not tie things to the release of OpenStack, but only to itself.
> 
> "Standard Library" implies that everyone has it available, but in 
> reality operators can (and will, and do) deploy any combination of 
> resource types that they want.
> 

Mmk, I guess I was being too optimistic about how homogeneous OpenStack
clouds might be.
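
(For reference, the annotation in question would sit on each resource
plugin, roughly like this -- a sketch based on the review Gauvain
linked; the exact class lives in heat.engine.support and the details
may differ:

from heat.engine import resource
from heat.engine import support


class ExampleResource(resource.Resource):
    # The docs build would read this to render "available since 2014.2"
    # (or "Juno", whichever convention wins this thread).
    support_status = support.SupportStatus(version='2014.2')

)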

> > The idea is to simply version the standard library of resources separately
> > even from the language. Added resources and properties would be minor
> > bumps, deprecating or removing anything would be a major bump. Users then
> > just need an API call that allows querying the standard library version.
> 
> We already have API calls to actually inspect resource types. I don't 
> think a semantic version number is helpful here, since the different 
> existing combinations of resources types are not expressible linearly.
> 
> There's no really good answer here, but the only real answer is making 
> sure it's easy for people to generate the docs themselves for their 
> actual deployment.
> 

That's an interesting idea. By any chance do we have something that
publishes the docs directly from source tree into swift? Might make it
easier if we could just do that as part of code pushes for those who run
clouds from source.

> > With this scheme, we can provide a gate test that prevents breaking the
> > rules, and automatically generate the docs still. Doing this would sync
> > better with continuous deployers who will be running "Juno" well before
> > there is a "2014.2".
> 
> Maybe continuous deployers should continuously deploy their own docs? 
> For any given cloud the only thing that matters is what it supports 
> right now.
>

That's an interesting idea, but I think what the user wants is to see how
this cloud is different from the other clouds.

> > Anyway, Heat largely exists to support portability of apps between
> > OpenStack clouds. Many many OpenStack clouds don't run one release,
> > and we don't require them to do so. So tying to the release is, IMO,
> > a poor choice.
> 
> The original question was about docs.openstack.org, and in that context 
> I think tying it to the release version is a good choice, because 
> that's... how OpenStack is released. Individual clouds, however, really 
> need to deploy their own docs that document what they actually support.
> 

Yeah I hadn't thought of that before. I like the idea but I wonder how
practical it is for CD private clouds.

> The flip side of this, of course, is that whatever we use for the 
> version strings on docs.openstack.org will all make its way into all the 
> other documentation that gets built, and I do understand your point in 
> that context. But versioning the "standard library" of plugins as if it 
> were a monolithic, always-available thing seems wrong to me.
>

Yeah I think it is too optimistic in retrospect.

> > We do the same thing with HOT's internals, so why not also
> > do the standard library this way?
> 
> The current process for HOT is for every OpenStack development cycle 
> (Juno is the first to use this) to give it a 'ver

Re: [openstack-dev] [glance][all] Help with interpreting the log level guidelines

2014-09-15 Thread Clint Byrum
Excerpts from Sean Dague's message of 2014-09-15 16:02:04 -0700:
> On 09/15/2014 07:00 PM, Mark Washenberger wrote:
> > Hi there logging experts,
> > 
> > We've recently had a little disagreement in the glance team about the
> > appropriate log levels for http requests that end up failing due to user
> > errors. An example would be a request to get an image that does not
> > exist, which results in a 404 Not Found request.
> > 
> > On one hand, this event is an error, so DEBUG or INFO seem a little too
> > low. On the other hand, this error doesn't generally require any kind of
> > operator investigation or indicate any actual failure of the service, so
> > perhaps it is excessive to log it at WARN or ERROR.
> > 
> > Please provide feedback to help us resolve this dispute if you feel you can!
> 
> My feeling is this is an INFO level. There is really nothing the admin
> should care about here.

Agree with Sean. INFO is useful for investigations. WARN and ERROR are
cause for alarm.
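
In other words, roughly this (a sketch, not glance's actual handler):

import logging

LOG = logging.getLogger(__name__)

def get_image(image_id, store):
    image = store.get(image_id)
    if image is None:
        # User error: useful for investigations, nothing for the
        # operator to act on.
        LOG.info("Image %s not found, returning 404", image_id)
        return None
    return image

def report_backend_failure(exc):
    # Actual service failure: this is what WARN/ERROR are for.
    LOG.error("Image store backend unavailable: %s", exc)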



Re: [openstack-dev] Please do *NOT* use "vendorized" versions of anything (here: glanceclient using requests.packages.urllib3)

2014-09-17 Thread Clint Byrum
This is where Debian's "one urllib3 to rule them all" model fails in
a modern fast paced world. Debian is arguably doing the right thing by
pushing everyone to use one API, and one library, so that when that one
library is found to be vulnerable to security problems, one update covers
everyone. Also, this is an HTTP/HTTPS library.. so nobody can make the
argument that security isn't paramount in this context.

But we all know that the "app store" model has started to bleed down into
backend applications, and now you just ship the virtualenv or docker
container that has your app as you tested it, and if that means you're
20 versions behind on urllib3, that's your problem, not the OS vendor's.

I think it is _completely_ irresponsible of requests, a library, to
embed another library. But I don't know if we can avoid making use of
it if we are going to be exposed to objects that are attached to it.
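
If we do keep consuming it, the least-bad pattern in our own code is
probably a conditional import, so both the vendored layout and the
de-vendored (Debian) layout work -- a sketch:

# Works with requests' vendored copy and with distros that strip it.
try:
    from requests.packages.urllib3 import poolmanager
except ImportError:
    from urllib3 import poolmanager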

Anyway, Thomas, if you're going to send the mob with pitchforks and
torches somewhere, I'd say send them to wherever requests makes its
home. OpenStack is just buying their mutated product.

Excerpts from Donald Stufft's message of 2014-09-17 08:22:48 -0700:
> Looking at the code on my phone it looks completely correct to use the 
> vendored copy here and it wouldn't actually work otherwise. 
> 
> > On Sep 17, 2014, at 11:17 AM, Donald Stufft  wrote:
> > 
> > I don't know the specific situation but it's appropriate to do this if 
> > you're using requests and wish to interact with the urllib3 that requests 
> > is using.
> > 
> >> On Sep 17, 2014, at 11:15 AM, Thomas Goirand  wrote:
> >> 
> >> Hi,
> >> 
> >> I'm horrified by what I just found. I have just found out this in
> >> glanceclient:
> >> 
> >> File "/tests/test_ssl.py", line 19, in <module>
> >>   from requests.packages.urllib3 import poolmanager
> >> ImportError: No module named packages.urllib3
> >> 
> >> Please *DO NOT* do this. Instead, please use urllib3 from ... urllib3.
> >> Not from requests. The fact that requests is embedding its own version
> >> of urllib3 is an heresy. In Debian, the embedded version of urllib3 is
> >> removed from requests.
> >> 
> >> In Debian, we spend a lot of time to "un-vendorize" stuff, because
> >> that's a security nightmare. I don't want to have to patch all of
> >> OpenStack to do it there as well.
> >> 
> >> And no, there's no good excuse here...
> >> 
> >> Thomas Goirand (zigo)
> >> 
> 



Re: [openstack-dev] Please do *NOT* use "vendorized" versions of anything (here: glanceclient using requests.packages.urllib3)

2014-09-17 Thread Clint Byrum
Excerpts from Davanum Srinivas's message of 2014-09-17 10:15:29 -0700:
> I was trying request-ifying oslo.vmware and ran into this as well:
> https://review.openstack.org/#/c/121956/
> 
> And we don't seem to have urllib3 in global-requirements either.
> Should we do that first?

Honestly, after reading this:

https://github.com/kennethreitz/requests/pull/1812

I think we might want to consider requests a poor option. Its author
clearly doesn't understand the role a _library_ plays in software
development and considers requests an application, not a library.

For instance, why is requests exposing internal implementation details
at all?  It should be wrapping any exceptions or objects to avoid
forcing users to make this choice at all.
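
Concretely, as a consumer I want to only ever have to catch requests'
own exception types, never anything from the bundled urllib3 -- along
these lines:

import requests

def fetch(url):
    try:
        return requests.get(url, timeout=5)
    except requests.exceptions.RequestException as e:
        # Covers ConnectionError, Timeout, etc. without ever having
        # to import urllib3, vendored or otherwise.
        raise RuntimeError('GET %s failed: %s' % (url, e))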



Re: [openstack-dev] [heat][nova] VM restarting on host failure in convergence

2014-09-17 Thread Clint Byrum
Excerpts from Jastrzebski, Michal's message of 2014-09-17 06:03:06 -0700:
> All,
> 
> Currently OpenStack does not have a built-in HA mechanism for tenant
> instances which could restore virtual machines in case of a host
> failure. Openstack assumes every app is designed for failure and can
> handle instance failure and will self-remediate, but that is rarely
> the case for the very large Enterprise application ecosystem.
> Many existing enterprise applications are stateful, and assume that
> the physical infrastructure is always on.
> 

There is a fundamental debate that OpenStack's vendors need to work out
here. Existing applications are well served by existing virtualization
platforms. Turning OpenStack into a work-alike to oVirt is not the end
goal here. It's a happy accident that traditional apps can sometimes be
bent onto the cloud without much modification.

The thing that clouds do is they give development teams a _limited_
infrastructure that lets IT do what they're good at (keep the
infrastructure up) and lets development teams do what they're good at (run
their app). By putting HA into the _app_, and not the _infrastructure_,
the dev teams get agility and scalability. No more waiting weeks for
allocating specialized servers with hardware fencing setups and fibre
channel controllers to house a shared disk system so the super reliable
virtualization can hide HA from the user.

Spin up vms. Spin up volumes.  Run some replication between regions,
and be resilient.

So, as long as it is understood that whatever is being proposed should
be an application centric feature, and not an infrastructure centric
feature, this argument remains interesting in the "cloud" context.
Otherwise, it is just an invitation for OpenStack to open up direct
competition with behemoths like vCenter.

> Even the OpenStack controller services themselves do not gracefully
> handle failure.
> 

Which ones?

> When these applications were virtualized, they were virtualized on
> platforms that enabled very high SLAs for each virtual machine,
> allowing the application to not be rewritten as the IT team moved them
> from physical to virtual. Now while these apps cannot benefit from
> methods like automatic scaleout, the application owners will greatly
> benefit from the self-service capabilities they will recieve as they
> utilize the OpenStack control plane.
> 

These apps were virtualized for IT's benefit. But the application authors
and users are now stuck in high-cost virtualization. The cloud is best
utilized when IT can control that cost and shift the burden of uptime
to the users by offering them more overall capacity and flexibility with
the caveat that the individual resources will not be as reliable.

So what I'm most interested in is helping authors change their apps to
be resilient on their own, not in putting more burden on IT.

> I'd like to suggest to expand heat convergence mechanism to enable
> self-remediation of virtual machines and other heat resources.
> 

Convergence is still nascent. I don't know if I'd pile on to what might
take another 12 - 18 months to get done anyway. We're just now figuring
out how to get started where we thought we might already be 1/3 of the
way through. Just something to consider.



Re: [openstack-dev] [tripleo] Adding hp1 back running tripleo CI

2014-09-17 Thread Clint Byrum
Excerpts from Derek Higgins's message of 2014-09-17 06:53:25 -0700:
> On 15/09/14 22:37, Gregory Haynes wrote:
> > This is a total shot in the dark, but a couple of us ran into issues
> > with the Ubuntu Trusty kernel (I know I hit it on HP hardware) that was
> > causing severely degraded performance for TripleO. This fixed with a
> > recently released kernel in Trusty... maybe you could be running into
> > this?
> 
> thanks Greg,
> 
> To try this out, I've redeployed the new testenv image and ran 35
> overcloud jobs on it(32 passed), the average time for these was 130
> minutes so unfortunately no major difference.
> 
> The old kernel was
> 3.13.0-33-generic #58-Ubuntu SMP Tue Jul 29 16:45:05 UTC 2014 x86_64

This kernel definitely had the KVM bugs Greg and I experienced in the
past.

> the new one is
> 3.13.0-35-generic #62-Ubuntu SMP Fri Aug 15 01:58:42 UTC 2014 x86_64
> 

Darn. This one does not. Is it possible the hardware is just less
powerful?



Re: [openstack-dev] Please do *NOT* use "vendorized" versions of anything (here: glanceclient using requests.packages.urllib3)

2014-09-17 Thread Clint Byrum
Excerpts from Ian Cordasco's message of 2014-09-17 12:42:57 -0700:
> On 9/17/14, 1:46 PM, "Clint Byrum"  wrote:
> 
> >Excerpts from Davanum Srinivas's message of 2014-09-17 10:15:29 -0700:
> >> I was trying request-ifying oslo.vmware and ran into this as well:
> >> https://review.openstack.org/#/c/121956/
> >> 
> >> And we don't seem to have urllib3 in global-requirements either.
> >> Should we do that first?
> >
> >Honestly, after reading this:
> >
> >https://github.com/kennethreitz/requests/pull/1812
> >
> >I think we might want to consider requests a poor option. Its author
> >clearly doesn't understand the role a _library_ plays in software
> >development and considers requests an application, not a library.
> 
> Yes that is Kenneth’s opinion. That is not the opinion of the core
> developers though. We see it as a library but this is something we aren’t
> going to currently change any time soon.
> 

Good to know but troubling to hear that there's no change in sight.

> >For instance, why is requests exposing internal implementation details
> >at all?
> 
> Where exactly are we exposing internal implementation details? A normal
> user (even advanced users) can use requests without ever digging into
> requests.packages. What implementation details are we exposing and where?
> 
> >It should be wrapping any exceptions or objects to avoid
> >forcing users to make this choice at all.
> 
> We do. Occasionally (like in 2.4.0) urllib3 adds an exception that we
> missed notice of and it slips through. We released 2.4.1 a couple days
> later with the fix for that. Pretty much every error we’ve seen or know
> about is caught and rewrapped as a requests exception. I’m not sure what
> you’re arguing here, unless of course you have not used requests.
> 

I had seen handling of urllib3 exceptions in some requests code and was
seeing a few issues about that. I understand now that it is not
intentional, but a side effect of misunderstanding.

> That aside, I’ve been mulling over how effectively the clients use
> requests. I haven’t investigated all of them, but many seem to reach into
> implementation details on their own. If I remember nova client has
> something it has commented as “connection pooling” while requests and
> urllib3 do that automatically. I haven’t started to investigate exactly
> why they do this. Likewise, glance client has custom certificate
> verification in glanceclient.common.https. Why? I’m not exactly certain
> yet. It seems for the most part from what little I’ve seen that requests
> is too high-level a library for OpenStack’s needs at best, and actively
> obscures details OpenStack developers need (or don’t realize requests
> provides in most cases).
> 

Indeed, it sounds like there has been some misuse of requests that led
to the situation.

> Circling back to the issue of vendoring though: it’s a conscious decision
> to do this, and in the last two years there have been 2 CVEs reported for
> requests. There have been none for urllib3 and none for chardet. (Frankly
> I don’t think either urllib3 or chardet have had any CVEs reported against
> them, but let’s ignore that for now.) While security is typically the
> chief concern with vendoring, none of the libraries we use have had
> security issues rendering it a moot point in my opinion. The benefits of
> vendoring for us as a team have been numerous and we will likely continue
> to do it until it stops benefiting us and our users.
> 

On a micro scale, you are absolutely correct. The point is not that
there is a tactical reason for urllib3 to be de-vendored, but rather
that there is a strategic reason to not use vendored libraries. When
updates are necessary, they must be complete, and rapid. Please refer
to the zlib travesty to see why everyone, not just distribution vendors,
should be extremely concerned about embedded libraries.



Re: [openstack-dev] Please do *NOT* use "vendorized" versions of anything (here: glanceclient using requests.packages.urllib3)

2014-09-18 Thread Clint Byrum
Excerpts from Donald Stufft's message of 2014-09-18 04:58:06 -0700:
> 
> > On Sep 18, 2014, at 7:54 AM, Thomas Goirand  wrote:
> > 
> >> 
> >> Linux distributions are not the end be all of distribution models and
> >> they don’t get to dictate to upstream.
> > 
> > Well, distributions is where the final user is, and where software gets
> > consumed. Our priority should be the end users.
> 
> 
> Distributions are not the only place that people get their software from,
> unless you think that the ~3 million downloads requests has received
> on PyPI in the last 30 days are distributions downloading requests to
> package in their OSs.
> 

Do pypi users not also need to be able to detect and fix any versions
of libraries they might have? If one has some virtualenvs with various
libraries and apps installed and no --system-site-packages, one would
probably still want to run 'pip freeze' in all of them and find out what
libraries are there and need to be fixed.

Anyway, generally security updates require a comprehensive strategy.
One common comprehensive strategy is version assertion.

Vendoring complicates that immensely.
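
The sort of audit I mean, sketched below (the paths and the "fixed"
version are made up), is exactly the check that a copy vendored inside
requests silently escapes:

import glob
import subprocess

FIXED = (1, 9)  # pretend this is the first safe urllib3

def audit(pattern='/opt/venvs/*'):
    for venv in glob.glob(pattern):
        out = subprocess.check_output([venv + '/bin/pip', 'freeze'])
        for line in out.decode().splitlines():
            if line.lower().startswith('urllib3=='):
                ver = tuple(int(x) for x
                            in line.split('==')[1].split('.')[:2])
                if ver < FIXED:
                    print('%s has vulnerable urllib3: %s' % (venv, line))

if __name__ == '__main__':
    audit()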



Re: [openstack-dev] [Zaqar] Zaqar and SQS Properties of Distributed Queues

2014-09-18 Thread Clint Byrum
Great job highlighting what our friends over at Amazon are doing.

It's clear from these snippets, and a few other pieces of documentation
for SQS I've read, that the Amazon team approached SQS from a _massive_
scaling perspective. I think what may be forcing a lot of this frustration
with Zaqar is that it was designed with a much smaller scale in mind.

I think as long as that is the case, the design will remain in question.
I'd be comfortable saying that the use cases I've been thinking about
are entirely fine with the limitations SQS has.

Excerpts from Joe Gordon's message of 2014-09-17 13:36:18 -0700:
> Hi All,
> 
> My understanding of Zaqar is that it's like SQS. SQS uses distributed
> queues, which have a few unusual properties [0]:
> Message Order
> 
> Amazon SQS makes a best effort to preserve order in messages, but due to
> the distributed nature of the queue, we cannot guarantee you will receive
> messages in the exact order you sent them. If your system requires that
> order be preserved, we recommend you place sequencing information in each
> message so you can reorder the messages upon receipt.
> At-Least-Once Delivery
> 
> Amazon SQS stores copies of your messages on multiple servers for
> redundancy and high availability. On rare occasions, one of the servers
> storing a copy of a message might be unavailable when you receive or delete
> the message. If that occurs, the copy of the message will not be deleted on
> that unavailable server, and you might get that message copy again when you
> receive messages. Because of this, you must design your application to be
> idempotent (i.e., it must not be adversely affected if it processes the
> same message more than once).
> Message Sample
> 
> The behavior of retrieving messages from the queue depends whether you are
> using short (standard) polling, the default behavior, or long polling. For
> more information about long polling, see Amazon SQS Long Polling
> 
> .
> 
> With short polling, when you retrieve messages from the queue, Amazon SQS
> samples a subset of the servers (based on a weighted random distribution)
> and returns messages from just those servers. This means that a particular
> receive request might not return all your messages. Or, if you have a small
> number of messages in your queue (less than 1000), it means a particular
> request might not return any of your messages, whereas a subsequent request
> will. If you keep retrieving from your queues, Amazon SQS will sample all
> of the servers, and you will receive all of your messages.
> 
> The following figure shows short polling behavior of messages being
> returned after one of your system components makes a receive request.
> Amazon SQS samples several of the servers (in gray) and returns the
> messages from those servers (Message A, C, D, and B). Message E is not
> returned to this particular request, but it would be returned to a
> subsequent request.
> 
> 
> 
> Presumably SQS has these properties because it makes the system scalable,
> if so does Zaqar have the same properties (not just making these same
> guarantees in the API, but actually having these properties in the
> backends)? And if not, why? I looked on the wiki [1] for information on
> this, but couldn't find anything.
> 
> 
> 
> 
> 
> [0]
> http://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/DistributedQueues.html
> [1] https://wiki.openstack.org/wiki/Zaqar



Re: [openstack-dev] Please do *NOT* use "vendorized" versions of anything (here: glanceclient using requests.packages.urllib3)

2014-09-18 Thread Clint Byrum
Excerpts from Donald Stufft's message of 2014-09-18 07:30:27 -0700:
> 
> > On Sep 18, 2014, at 10:18 AM, Clint Byrum  wrote:
> > 
> > Excerpts from Donald Stufft's message of 2014-09-18 04:58:06 -0700:
> >> 
> >>> On Sep 18, 2014, at 7:54 AM, Thomas Goirand  wrote:
> >>> 
> >>>> 
> >>>> Linux distributions are not the end be all of distribution models and
> >>>> they don’t get to dictate to upstream.
> >>> 
> >>> Well, distributions is where the final user is, and where software gets
> >>> consumed. Our priority should be the end users.
> >> 
> >> 
> >> Distributions are not the only place that people get their software from,
> >> unless you think that the ~3 million downloads requests has received
> >> on PyPI in the last 30 days are distributions downloading requests to
> >> package in their OSs.
> >> 
> > 
> > Do pypi users not also need to be able to detect and fix any versions
> > of libraries they might have? If one has some virtualenvs with various
> > libraries and apps installed and no --system-site-packages, one would
> > probably still want to run 'pip freeze' in all of them and find out what
> > libraries are there and need to be fixed.
> > 
> > Anyway, generally security updates require a comprehensive strategy.
> > One common comprehensive strategy is version assertion.
> > 
> > Vendoring complicates that immensely.
> 
> It doesn’t really matter. PyPI doesn’t dictate to projects who host there what
> that project is allowed to do except in some very broad circumstances. Whether
> or not requests *should* do this doesn't really have any bearing on what
> Openstack should do to cope with it. The facts are that requests does it, and
> that people pulling things from PyPI is an actual platform that needs thought
> about.
> 
> This leaves Openstack with a few reasonable/sane options:
> 
> 1) Decide that vendoring in requests is unacceptable to what Openstack as a
>project is willing to support, and cease the use of requests.
> 2) Decide that what requests offers is good enough that it outweighs the fact
>that it vendors urllib3 and continue using it.
> 

There's also 3) fork requests, which is the democratic way to vote out
an upstream that isn't supporting the needs of the masses.

I don't think we're anywhere near there, but I wanted to make it clear
there _is_ a more extreme option.

> If the 2nd option is chosen, then doing anything but supporting the fact that
> requests vendors urllib3 within the code that openstack writes is hurting the
> users who fetch these projects from PyPI because you don't agree with one of
> the choices that requests makes. By all means do conditional imports to lessen
> the impact that the choice requests has made (and the one that Openstack has
> made to use requests) on downstream distributors, but unconditionally importing
> from the top level urllib3 for use within requests is flat out wrong.
> 
> Obviously neither of these options excludes the choice to lean on requests to
> reverse this decision as well. However that is best done elsewhere as the
> person making that decision isn't a member of these mailing lists as far as
> I am aware.
> 

To be clear, I think we should keep using requests. But we should lend
our influence upstream and explain that our users are required to deal
with this in a way that perhaps hasn't been considered or given the
appropriate priority.



Re: [openstack-dev] Please do *NOT* use "vendorized" versions of anything (here: glanceclient using requests.packages.urllib3)

2014-09-18 Thread Clint Byrum
Excerpts from Ian Cordasco's message of 2014-09-18 07:35:10 -0700:
> On 9/18/14, 9:18 AM, "Clint Byrum"  wrote:
> 
> >Excerpts from Donald Stufft's message of 2014-09-18 04:58:06 -0700:
> >> 
> >> > On Sep 18, 2014, at 7:54 AM, Thomas Goirand  wrote:
> >> > 
> >> >> 
> >> >> Linux distributions are not the end be all of distribution models and
> >> >> they don’t get to dictate to upstream.
> >> > 
> >> > Well, distributions is where the final user is, and where software
> >>gets
> >> > consumed. Our priority should be the end users.
> >> 
> >> 
> >> Distributions are not the only place that people get their software
> >>from,
> >> unless you think that the ~3 million downloads requests has received
> >> on PyPI in the last 30 days are distributions downloading requests to
> >> package in their OSs.
> >> 
> >
> >Do pypi users not also need to be able to detect and fix any versions
> >of libraries they might have? If one has some virtualenvs with various
> >libraries and apps installed and no --system-site-packages, one would
> >probably still want to run 'pip freeze' in all of them and find out what
> >libraries are there and need to be fixed.
> >
> >Anyway, generally security updates require a comprehensive strategy.
> >One common comprehensive strategy is version assertion.
> >
> >Vendoring complicates that immensely.
> 
> Except that even OpenStack doesn’t pin requests because of how
> extraordinarily stable our API is. While you can argue that Kenneth has
> non-standard opinions about his library, Cory and I take backwards
> compatibility and stability very seriously. This means anyone can upgrade
> to a newer version of requests without worrying that it will be backwards
> incompatible. 
> 

All of your hard work is very much appreciated. I don't understand what
your assertion means though. We don't pin things. However, our users end
up "pinning" when they install via pip, and our distros end up "pinning"
when they deliver a version. Without any indication that urllib3 is in
the system, they will fail at any cursory version audit that looks for it.

I'm not saying either way is right or wrong. I'm suggesting that this
is a valid, proven method for large-scale risk assessment, and it is
complicated quite a bit by vendored libraries.



Re: [openstack-dev] [nova] Expand resource name allowed characters

2014-09-18 Thread Clint Byrum
Excerpts from Christopher Yeoh's message of 2014-09-18 16:57:12 -0700:
> On Thu, 18 Sep 2014 12:12:28 -0400
> Sean Dague  wrote:
> > > When we can return the json-schema to user in the future, can we say
> > > that means API accepting utf8 or utf8mb4 is discoverable? If it is
> > > discoverable, then we needn't limit anything in our python code.
> > 
> > Honestly, we should accept utf8 (no weird mysqlism not quite utf8). We
> > should make the default scheme for our dbs support that on names (but
> > only for the name columns). The failure of a backend to do utf8 for
> > real should return an error to the user. Let's not make this more
> > complicated than it needs to be.
> 
> I agree that discoverability for this is not the way to go - I think its
> too complicated for end users. I don't know enough about mysql to know
> if utf8mb4 is going to a performance issue but if its not then we
> should just support utf-8 properly. 
> 
> We can we can catch the db errors. However whilst converting db
> errors causing 500s is fairly straightforward when an error occurs that
> deep in Nova it also means a lot of potential unwinding work in the db
> and compute layers which is complicated and error prone. So i'd prefer
> to avoid the situation with input validation in the first place. 

Just to add a reference into the discussion:

http://dev.mysql.com/doc/refman/5.5/en/charset-unicode-utf8mb4.html

It does have the same limitation of fixed-width keys and CHAR()
columns: it goes from 3 bytes per CHAR position to 4, so it should not
be a database-wide default, but something we use sparingly.

Note that the right answer for things that are not utf-8 (like UUIDs)
is not to set a charset of latin1, but to use BINARY/VARBINARY. Last
time I tried, I had a difficult time coercing SQLAlchemy to model the
difference, but maybe I just didn't look in the right part of the manual.
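
Roughly the shape I was going for, for the record (a sketch, not an
actual Nova model; the table and collation names are just examples):

from sqlalchemy import Column, Integer, String, VARBINARY
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()


class Instance(Base):
    __tablename__ = 'example_instances'
    __table_args__ = {'mysql_charset': 'utf8'}  # not utf8mb4 table-wide

    id = Column(Integer, primary_key=True)
    # UUIDs are not text; store the raw 16 bytes instead of latin1.
    uuid = Column(VARBINARY(16), unique=True)
    # Only the human-facing name column pays the 4-bytes-per-char cost.
    display_name = Column(String(255, collation='utf8mb4_general_ci'))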



Re: [openstack-dev] [Heat] Continue discussing multi-region orchestration

2013-11-19 Thread Clint Byrum
Excerpts from Zane Bitter's message of 2013-11-15 12:41:53 -0800:
> Good news, everyone! I have created the missing whiteboard diagram that 
> we all needed at the design summit:
> 
> https://wiki.openstack.org/wiki/Heat/Blueprints/Multi_Region_Support_for_Heat/The_Missing_Diagram
> 
> I've documented 5 possibilities. (1) is the current implementation, 
> which we agree we want to get away from. I strongly favour (2) for the 
> reasons listed. I don't think (3) has many friends. (4) seems to be 
> popular despite the obvious availability problem and doubts that it is 
> even feasible. Finally, I can save us all some time by simply stating 
> that I will -2 on sight any attempt to implement (5).
> 
> When we're discussing this, please mention explicitly the number of the 
> model you are talking about at any given time.
> 
> If you have a suggestion for a different model, make your own diagram! 
> jk, you can sketch it or something for me and I'll see if I can add it.

Thanks for putting this together Zane. I just now got around to looking
closely.

Option 2 is good. I'd love for option 1 to be made automatic by making
the client smarter, but parsing templates in the client will require
some deep thought before we decide it is a good idea.

I'd like to consider a 2a, in which the same Heat engines the user is
talking to are also the ones doing the orchestration in whatever region
they are in. I think that is actually the intention of the diagram,
but it looks like there is a "special" one that talks to the engines
that actually do the work.

2 may actually morph into 3: if users don't like the nested stack
requirement for 2, we can do the work to basically make the engine create
a nested stack per region. So that makes 2 a stronger choice for a first
implementation.

4 has an unstated pro, which is that attack surface is reduced. This
makes more sense when you consider the TripleO case where you may want
the undercloud (hardware cloud) to orchestrate things existing in the
overcloud (vm cloud) but you don't want the overcloud administrators to
be able to control your entire stack.

Given CAP theorem, option 5, the global orchestrator, would be doable
with not much change as long as partition tolerance were the bit we gave
up. We would just have to have a cross-region RPC bus and database. Of
course, since regions are most likely to be partitioned, that is not
really a good choice. Trading partition tolerance for consistency lands
us in the complexity black hole. Trading out availability makes it no
better than option 4.



Re: [openstack-dev] [Nova] Does Nova really need an SQL database?

2013-11-19 Thread Clint Byrum
Excerpts from Chris Friesen's message of 2013-11-19 09:29:00 -0800:
> On 11/18/2013 06:47 PM, Joshua Harlow wrote:
> > An idea related to this, what would need to be done to make the DB have
> > the exact state that a compute node is going through (and therefore the
> > scheduler would not make unreliable/racey decisions, even when there are
> > multiple schedulers). It's not like we are dealing with a system which
> > can not know the exact state (as long as the compute nodes are connected
> > to the network, and a network partition does not occur).
> 
> How would you synchronize the various schedulers with each other? 
> Suppose you have multiple scheduler nodes all trying to boot multiple 
> instances each.
> 
> Even if each at the start of the process each scheduler has a perfect 
> view of the system, each scheduler would need to have a view of what 
> every other scheduler is doing in order to not make racy decisions.
> 

Your question assumes they need to be "in sync" at a granular level.

Each scheduler process can own a different set of resources. If they
each grab instance requests in a round-robin fashion, then they will
fill their resources up in a relatively well balanced way until one
scheduler's resources are exhausted. At that time it should bow out of
taking new instances. If it can't fit a request in, it should kick the
request out for retry on another scheduler.

In this way, they only need to be in sync in that they need a way to
agree on who owns which resources. A distributed hash table that gets
refreshed whenever schedulers come and go would be fine for that.
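
A minimal sketch of that ownership agreement (a hash ring over the
scheduler names, rebuilt whenever membership changes):

import bisect
import hashlib


def _hash(key):
    return int(hashlib.md5(key.encode()).hexdigest(), 16)


class SchedulerRing(object):
    def __init__(self, schedulers, replicas=100):
        # Each scheduler gets many points on the ring so ownership
        # stays reasonably balanced as members come and go.
        self._ring = sorted(
            (_hash('%s-%d' % (name, i)), name)
            for name in schedulers for i in range(replicas))
        self._keys = [k for k, _ in self._ring]

    def owner(self, compute_host):
        # The scheduler whose ring point follows the host's hash owns
        # that host's resources.
        idx = bisect.bisect(self._keys, _hash(compute_host)) % len(self._ring)
        return self._ring[idx][1]


ring = SchedulerRing(['scheduler-1', 'scheduler-2', 'scheduler-3'])
print(ring.owner('compute-042'))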



Re: [openstack-dev] [Nova] Does Nova really need an SQL database?

2013-11-19 Thread Clint Byrum
Excerpts from Chris Friesen's message of 2013-11-19 11:37:02 -0800:
> On 11/19/2013 12:35 PM, Clint Byrum wrote:
> 
> > Each scheduler process can own a different set of resources. If they
> > each grab instance requests in a round-robin fashion, then they will
> > fill their resources up in a relatively well balanced way until one
> > scheduler's resources are exhausted. At that time it should bow out of
> > taking new instances. If it can't fit a request in, it should kick the
> > request out for retry on another scheduler.
> >
> > In this way, they only need to be in sync in that they need a way to
> > agree on who owns which resources. A distributed hash table that gets
> > refreshed whenever schedulers come and go would be fine for that.
> 
> That has some potential, but at high occupancy you could end up refusing 
> to schedule something because no one scheduler has sufficient resources 
> even if the cluster as a whole does.
> 

I'm not sure what you mean here. What resource spans multiple compute
hosts?

> This gets worse once you start factoring in things like heat and 
> instance groups that will want to schedule whole sets of resources 
> (instances, IP addresses, network links, cinder volumes, etc.) at once 
> with constraints on where they can be placed relative to each other.
> 

Actually that is rather simple. Such requests have to be serialized
into a work-flow. So if you say "give me 2 instances in 2 different
locations" then you allocate 1 instance, and then another one with
'not_in_location(1)' as a condition.
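
A trivial sketch of that serialization:

def pick_host(hosts, excluded_locations):
    # hosts is a list of (host, location) tuples from the scheduler's
    # current view of its resources.
    for host, location in hosts:
        if location not in excluded_locations:
            return host, location
    raise Exception('no host satisfies the placement constraint')


def place_apart(hosts, count=2):
    placements = []
    used_locations = set()
    for _ in range(count):
        host, location = pick_host(hosts, used_locations)
        placements.append(host)
        used_locations.add(location)  # i.e. not_in_location(previous)
    return placements


hosts = [('host-a', 'rack1'), ('host-b', 'rack1'), ('host-c', 'rack2')]
print(place_apart(hosts))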



Re: [openstack-dev] [Heat] HOT software configuration refined after design summit discussions

2013-11-19 Thread Clint Byrum
Excerpts from Steve Baker's message of 2013-11-18 12:52:04 -0800:
> 
> Regarding apply_config/remove_config, if a SoftwareApplier resource is
> deleted it should trigger any remove_config and wait for the server to
> acknowledge when that is complete. This allows for any
> evacuation/deregistering workloads to be executed.
> 

I'm a little worried about the road that leads us down. Most configuration
software defines forward progress only. Meaning, if you want something
not there, you don't remove it from your assertions, you assert that it
is not there.

The reason this is different than the way we operate with resources is
that resources are all under Heat's direct control via well defined
APIs. In-instance things, however, will be indirectly controlled. So I
feel like focusing on a "diff" mechanism for user-deployed tools may be
unnecessary and might confuse. I'd much rather have a "converge"
mechanism for the users to focus on.



Re: [openstack-dev] [Heat] HOT software configuration refined after design summit discussions

2013-11-19 Thread Clint Byrum
Excerpts from Steve Baker's message of 2013-11-19 13:06:21 -0800:
> On 11/20/2013 09:50 AM, Clint Byrum wrote:
> > Excerpts from Steve Baker's message of 2013-11-18 12:52:04 -0800:
> >> Regarding apply_config/remove_config, if a SoftwareApplier resource is
> >> deleted it should trigger any remove_config and wait for the server to
> >> acknowledge when that is complete. This allows for any
> >> evacuation/deregistering workloads to be executed.
> >>
> > I'm a little worried about the road that leads us down. Most configuration
> > software defines forward progress only. Meaning, if you want something
> > not there, you don't remove it from your assertions, you assert that it
> > is not there.
> >
> > The reason this is different than the way we operate with resources is
> > that resources are all under Heat's direct control via well defined
> > APIs. In-instance things, however, will be indirectly controlled. So I
> > feel like focusing on a "diff" mechanism for user-deployed tools may be
> > unnecessary and might confuse. I'd much rather have a "converge"
> > mechanism for the users to focus on.
> >
> >
> A specific use-case I'm trying to address here is tripleo doing an
> update-replace on a nova compute node. The remove_config contains the
> workload to evacuate VMs and signal heat when the node is ready to be
> shut down. This is more involved than just "uninstall the things".
> 
> Could you outline in some more detail how you think this could be done?
> 

So for that we would not remove the software configuration for
nova-compute; we would assert that the machine needs its VMs evacuated.
We want evacuation to be something we explicitly do, not a side effect
of deleting things. Perhaps having delete hooks for starting delete
workflows is right, but it set off a red flag for me, so I want to make
sure we think it through.

Also IIRC, evacuation is not necessarily an in-instance thing. It looks
more like the weird thing we've been talking about lately which is
"how do we orchestrate tenant API's":

https://etherpad.openstack.org/p/orchestrate-tenant-apis



Re: [openstack-dev] [Nova] Does Nova really need an SQL database?

2013-11-19 Thread Clint Byrum
Excerpts from Chris Friesen's message of 2013-11-19 12:18:16 -0800:
> On 11/19/2013 01:51 PM, Clint Byrum wrote:
> > Excerpts from Chris Friesen's message of 2013-11-19 11:37:02 -0800:
> >> On 11/19/2013 12:35 PM, Clint Byrum wrote:
> >>
> >>> Each scheduler process can own a different set of resources. If they
> >>> each grab instance requests in a round-robin fashion, then they will
> >>> fill their resources up in a relatively well balanced way until one
> >>> scheduler's resources are exhausted. At that time it should bow out of
> >>> taking new instances. If it can't fit a request in, it should kick the
> >>> request out for retry on another scheduler.
> >>>
> >>> In this way, they only need to be in sync in that they need a way to
> >>> agree on who owns which resources. A distributed hash table that gets
> >>> refreshed whenever schedulers come and go would be fine for that.
> >>
> >> That has some potential, but at high occupancy you could end up refusing
> >> to schedule something because no one scheduler has sufficient resources
> >> even if the cluster as a whole does.
> >>
> >
> > I'm not sure what you mean here. What resource spans multiple compute
> > hosts?
> 
> Imagine the cluster is running close to full occupancy, each scheduler 
> has room for 40 more instances.  Now I come along and issue a single 
> request to boot 50 instances.  The cluster has room for that, but none 
> of the schedulers do.
> 

You're assuming that all 50 come in at once. That is only one use case
and not at all the most common.

> >> This gets worse once you start factoring in things like heat and
> >> instance groups that will want to schedule whole sets of resources
> >> (instances, IP addresses, network links, cinder volumes, etc.) at once
> >> with constraints on where they can be placed relative to each other.
> 
> > Actually that is rather simple. Such requests have to be serialized
> > into a work-flow. So if you say "give me 2 instances in 2 different
> > locations" then you allocate 1 instance, and then another one with
> > 'not_in_location(1)' as a condition.
> 
> Actually, you don't want to serialize it, you want to hand the whole set 
> of resource requests and constraints to the scheduler all at once.
> 
> If you do them one at a time, then early decisions made with 
> less-than-complete knowledge can result in later scheduling requests 
> failing due to being unable to meet constraints, even if there are 
> actually sufficient resources in the cluster.
> 
> The "VM ensembles" document at
> https://docs.google.com/document/d/1bAMtkaIFn4ZSMqqsXjs_riXofuRvApa--qo4UTwsmhw/edit?pli=1
>  
> has a good example of how one-at-a-time scheduling can cause spurious 
> failures.
> 
> And if you're handing the whole set of requests to a scheduler all at 
> once, then you want the scheduler to have access to as many resources as 
> possible so that it has the highest likelihood of being able to satisfy 
> the request given the constraints.

This use case is real and valid, which is why I think there is room for
multiple approaches. For instance, the situation you describe can also be
dealt with by just having the cloud stay under-utilized and accepting
that spurious failures will happen once you get over a certain percentage
of utilization. We have a similar solution in the ext3 filesystem on
Linux: don't fill it up, or suffer a huge performance penalty.

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova] New API requirements, review of GCE

2013-11-19 Thread Clint Byrum
Excerpts from Robert Collins's message of 2013-11-19 16:22:41 -0800:
> On 20 November 2013 13:00, Sean Dague  wrote:
> >> As long as the metadataservice doesn't move out :) - that one I think
> >> is pretty core and we have no native replacement [configdrive is not a
> >> replacement :P].
> >
> >
> > Slightly off tangent thread.
> >
> > So we recently moved devstack gate to do config drive instead of metadata
> > service, and life was good (no one really noticed). In what ways is
> > configdrive insufficient compared to metadata service? And is that something
> > that we should be tackling?
> 
> * The metadata service can be trivially updated live - and Heat wants
> to use this to get rid of it's own metadata service... whereas config
> drive requires unplugging the device, updating the data and replugging
> - and thats a bit more invasive.
> 

This one is key. Both Trove and Savanna have run into the same
limitation as Heat: networks that cannot reach the API endpoints don't
get to have their API-specific metadata that updates over time. By
putting it in the EC2 metadata service, we can access it via the Neutron
proxy, and then Heat/Trove/Savanna can update it later to provide a
control bus for in-instance tools.
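To make the control-bus idea concrete, here is a minimal Python 2 era
sketch of the in-instance side; the URL path and key are placeholders,
not the actual paths any of these services publish under, and the point
is only the poll-and-react pattern:

import time
import urllib2

# Placeholder URL; in practice the service would publish its own path.
METADATA_URL = 'http://169.254.169.254/latest/meta-data/placeholder-key'

last_seen = None
while True:
    current = urllib2.urlopen(METADATA_URL).read()
    if current != last_seen:
        last_seen = current
        print('metadata changed, re-running configuration')
        # hand off to the local configuration tooling here
    time.sleep(30)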

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Heat] HOT software configuration refined after design summit discussions

2013-11-20 Thread Clint Byrum
Excerpts from Thomas Spatzier's message of 2013-11-19 23:35:40 -0800:
> Excerpts from Steve Baker's message on 19.11.2013 21:40:54:
> > From: Steve Baker 
> > To: openstack-dev@lists.openstack.org,
> > Date: 19.11.2013 21:43
> > Subject: Re: [openstack-dev] [Heat] HOT software configuration
> > refined after design summit discussions
> >
> 
> > I think there needs to a CM tool specific agent delivered to the server
> > which os-collect-config invokes. This agent will transform the config
> > data (input values, CM script, CM specific specialness) to a CM tool
> > invocation.
> >
> > How to define and deliver this agent is the challenge. Some options are:
> > 1) install it as part of the image customization/bootstrapping (golden
> > images or cloud-init)
> > 2) define a (mustache?) template in the SoftwareConfig which
> > os-collect-config transforms into the agent script, which
> > os-collect-config then executes
> > 3) a CM tool specific implementation of SoftwareApplier builds and
> > delivers a complete agent to os-collect-config which executes it
> >
> > I may be leaning towards 3) at the moment. Hopefully any agent can be
> > generated with a sufficiently sophisticated base SoftwareApplier type,
> > plus maybe some richer intrinsic functions.
> 
> This is a good summary of options; about the same we had in mind. And we were
> also leaning towards 3. Probably the approach we would take is to get a
> SoftwareApplier running for one CM tool (e.g. Chef), then look at another
> tool (base shell scripts), and then see what the generic parts are that can
> be factored into a base class.
> 
> > >> The POC I'm working on is actually backed by a REST API which does
> dumb
> > >> (but structured) storage of SoftwareConfig and SoftwareApplier
> entities.
> > >> This has some interesting implications for managing SoftwareConfig
> > >> resources outside the context of the stack which uses them, but lets
> not
> > >> worry too much about that *yet*.
> > > Sounds good. We are also defining some blueprints to break down the
> overall
> > > software config topic. We plan to share them later this week, and then
> we
> > > can consolidate with your plans and see how we can best join forces.
> > >
> > >
> > At this point it would be very helpful to spec out how specific CM tools
> > are invoked with given inputs, script, and CM tool specific options.
> 
> That's our plan; and we would probably start with scripts and chef.
> 
> >
> > Maybe if you start with shell scripts, cfn-init and chef then we can all
> > contribute other CM tools like os-config-applier, puppet, ansible,
> > saltstack.
> >
> > Hopefully by then my POC will at least be able to create resources, if
> > not deliver some data to servers.
> 
> We've been thinking about getting metadata to the in-instance parts on the
> server and whether the resources you are building can serve the purpose.
> I.e. pass and endpoint to the SoftwareConfig resources to the instance and
> let the instance query the metadata from the resource. Sounds like this is
> what you had in mind, so that would be a good point for integrating the
> work. In the meantime, we can think of some shortcuts.
> 

Note that os-collect-config is intended to be a lightweight, generic
in-instance agent to do exactly this: watch for Metadata changes and
feed them to an underlying tool through a predictable interface. I'd hope
that any of the appliers would mostly just configure os-collect-config
to run a wrapper that speaks os-collect-config's interface.

The interface is defined in the README:

https://pypi.python.org/pypi/os-collect-config

It is inevitable that we will extend os-collect-config to be able to
collect config data from whatever API these config applier resources
make available. I would suggest then that we not all go off and reinvent
os-collect-config for each applier, but rather enhance os-collect-config
as needed and write wrappers for the other config tools which implement
its interface.

os-apply-config already understands this interface for obvious reasons.

Bash scripts can use os-apply-config to extract individual values, as
you might see in some of the os-refresh-config scripts that are run as
part of tripleo. I don't think anything further is really needed there.

For chef, some kind of ohai plugin to read os-collect-config's collected
data would make sense.
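To make the wrapper idea concrete, here is a minimal sketch of the sort
of hook that could speak os-collect-config's interface, assuming the
OS_CONFIG_FILES convention described in the README (a colon-separated
list of collected JSON files); the 'run-the-cm-tool' handoff at the end
is a placeholder, not a real command:

#!/usr/bin/env python
# Sketch: read the JSON metadata os-collect-config has collected and
# hand the merged result to a CM tool.
import json
import os
import subprocess


def collected_config():
    merged = {}
    for path in os.environ.get('OS_CONFIG_FILES', '').split(':'):
        if not path:
            continue
        with open(path) as src:
            merged.update(json.load(src))  # later files win
    return merged


if __name__ == '__main__':
    config = collected_config()
    # Placeholder handoff: write the data where the CM tool expects it,
    # then invoke that tool.
    with open('/var/run/cm-input.json', 'w') as dest:
        json.dump(config, dest)
    subprocess.check_call(['run-the-cm-tool', '/var/run/cm-input.json'])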

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Heat] HOT software configuration refined after design summit discussions

2013-11-20 Thread Clint Byrum
Excerpts from Mike Spreitzer's message of 2013-11-20 13:46:25 -0800:
> Clint Byrum  wrote on 11/19/2013 04:28:31 PM:
> > From: Clint Byrum 
> > To: openstack-dev , 
> > Date: 11/19/2013 04:30 PM
> > Subject: Re: [openstack-dev] [Heat] HOT software configuration 
> > refined after design summit discussions
> > 
> > Excerpts from Steve Baker's message of 2013-11-19 13:06:21 -0800:
> > > On 11/20/2013 09:50 AM, Clint Byrum wrote:
> > > > Excerpts from Steve Baker's message of 2013-11-18 12:52:04 -0800:
> > > >> Regarding apply_config/remove_config, if a SoftwareApplier resource 
> is
> > > >> deleted it should trigger any remove_config and wait for the server 
> to
> > > >> acknowledge when that is complete. This allows for any
> > > >> evacuation/deregistering workloads to be executed.
> > > >>
> > > > I'm a little worried about the road that leads us down. Most 
> configuration
> > > > software defines forward progress only. Meaning, if you want 
> something
> > > > not there, you don't remove it from your assertions, you assert that 
> it
> > > > is not there.
> 
> I am worried too.  But I do not entirely follow your reasoning.  When I 
> UPDATE a stack with a new template, am I supposed to write in that 
> template not just what I want the stack to be but also how that differs 
> from what it currently is?  That is not REST.  Not that I am a total REST 
> zealot, but I am a fan of managing in terms of desired state.  But I agree 
> there is a conflict between defining a 'remove' operation and the "forward 
> progress only" mindset of most config tooling.
> 

I am worried about the explosion of possibilities that comes from trying
to deal with all of the diffs possible inside an instance. If there is an
actual REST interface for a thing, then yes, let's use that. For instance,
if we are using docker, there is in fact a very straightforward way to
say "remove entity X". If we are using packages we have the same thing.
However, if we are just trying to write chef configurations, we have to
write reverse chef configurations.

What I meant to convey is "let's give this piece of the interface a lot of
thought", not "this is wrong to even have". Having had a couple of days to
think it over, I now believe we do need "apply" and "remove". We should also
provide really solid example templates for this concept.

> > > > ...
> > > A specific use-case I'm trying to address here is tripleo doing an
> > > update-replace on a nova compute node. The remove_config contains the
> > > workload to evacuate VMs and signal heat when the node is ready to be
> > > shut down. This is more involved than just "uninstall the things".
> > > 
> > > Could you outline in some more detail how you think this could be 
> done?
> > > 
> > 
> > So for that we would not remove the software configuration for the
> > nova-compute, we would assert that the machine needs vms evacuated.
> > We want evacuation to be something we explicitly do, not a side effect
> > of deleting things.
> 
> Really?  You want to force the user to explicitly say "evacuate the VMs" 
> in all the various ways a host deletion can happen?  E.g., when an 
> autoscaling group of hosts shrinks?
> 

Autoscaling doesn't really fly with stateful services, IMO. Also, for
TripleO's use case, auto-scaling is not really a high priority. Hardware
isn't nearly as easily allocatable as VMs are.

Anyway, there is a really complicated work-flow for decommissioning
any stateful service, and it differs wildly between them. I do want to
have a place to define that work-flow and reliably trigger it when it
needs to be triggered. I do not want it to _only_ be available in the
"delete this resource" case, and I also do not want it to _always_
be run in that case, as I may legitimately be destroying the data too.
I need a way to express that intention, and in my mind, the way to do
that is to first complete an evacuation and then delete the thing.

Better ideas are _most_ welcome.

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Heat] Continue discussing multi-region orchestration

2013-11-20 Thread Clint Byrum
Excerpts from Mike Spreitzer's message of 2013-11-20 11:09:34 -0800:
> OTOH, the more we restrict what can be done, the less useful this really 
> is.

I would be more specific and say it as "...the less divergent behavior
is actually possible."

It will be quite a bit more useful if it is boring and restricted but
does what it claims to do extremely well.

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] How to stage client major releases in Gerrit?

2013-11-20 Thread Clint Byrum
Excerpts from Mark Washenberger's message of 2013-11-20 10:14:42 -0800:
> Hi folks,
> 
> The project python-glanceclient is getting close to needing a major release
> in order to finally remove some long-deprecated features, and to make some
> minor adjustments that are technically backwards-incompatible.
> 
> Normally, our release process works great. When we cut a release (say
> 1.0.0), if we realize it doesn't contain a feature we need, we can just add
> the feature and release a new minor version (say 1.1.0). However, when it
> comes to cutting out the fat for a major release, if we find a feature that
> we failed to remove before releasing 1.0.0, we're basically screwed. We
> have to keep that feature around until we feel like releasing 2.0.0.
> 
> In order to mitigate that risk, I think it would make a lot of sense to
> have a place to stage and carefully consider all the breaking changes we
> want to make. I also would like to have that place be somewhere in Gerrit
> so that it fits in with our current submission and review process. But if
> that place is the 'master' branch and we take a long time, then we can't
> really release any bug fixes to the v0 series in the meantime.
> 
> I can think of a few workarounds, but they all seem kinda bad. For example,
> we could put all the breaking changes together in one commit, or we could
> do all this prep in github.
> 
> My question is, is there a correct way to stage breaking changes in Gerrit?
> Has some other team already dealt with this problem?
> 
> DISCLAIMER:
> For the purposes of this discussion, it will be utterly unproductive to
> discuss the relative merits of backwards-breaking changes. Rather let's
> assume that all breaking changes that would eventually land in the next
> major release are necessary and have been properly communicated well in
> advance. If a given breaking change is *not* proper, well that's the kind
> of thing I want to catch in gerrit reviews in the staging area!

I understand what you're trying to do with this disclaimer. The message
above just _screams_ for this discussion, so why not cut it off at the
pass? However, glanceclient being a library, not discussing the fact
that you're breaking an established API is like not discussing ice at
the north pole.

If you want to be able to change interfaces without sending a missile
up the tailpipe of every project that depends on your code, call it
glanceclient2. That solves all of your stated problems from above. You can
still deprecate glanceclient and stop maintaining it after some overlap
time. And if users hate glanceclient2, then they can keep glanceclient
alive with all of its warts.

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [keystone][py3] Usage of httpretty

2013-11-20 Thread Clint Byrum
Excerpts from Chuck Short's message of 2013-11-20 19:21:14 -0800:
> Hi,
> 
> So maybe if it gets to the point where it gets to be too much of a problem we
> should just put it on stackforge.
> 

That should be the last resort, for when upstream is deemed dead. I'm
guessing upstream would not like to fade away into extinction as users
migrate to Python 3. We should be comfortable helping upstream out in any
way we can.

The library sounds like a viable option for test isolation. Perhaps we
can just offer to help fix upstream's py3k CI.

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Heat] HOT software configuration refined after design summit discussions

2013-11-20 Thread Clint Byrum
Excerpts from Mike Spreitzer's message of 2013-11-20 15:16:45 -0800:
> Clint Byrum  wrote on 11/20/2013 05:41:16 PM:
> 
> > Autoscaling doesn't really fly with stateful services IMO.
> 
> I presume you're concerned about the "auto" part, not the "scaling".  Even 
> a stateful group is something you may want to scale; it just takes a more 
> complex set of operations to accomplish that.  If we can make a Heat 
> autoscaling group invoke the right set of operations, why not?
> 

It is most definitely possible and necessary. We _must_ do this.

It does not fly with today's limited auto-scaling.

If we can get the link between orchestration and work-flow working well,
the end result should be harmonious automation and then automatic scaling
will indeed be possible for stateful services.

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Propose "project story wiki" idea

2013-11-20 Thread Clint Byrum
Excerpts from Boris Pavlovic's message of 2013-11-19 21:33:08 -0800:
> Hi stackers,
> 
> 
> Currently what I see is growing amount of interesting projects, that at
> least I would like to track. But reading all mailing lists, and reviewing
> all patches in all interesting projects to get high level understanding of
> what is happening in a project now, is quite hard or even impossible task (at
> least for me). Especially after 2 weeks vacation =)
> 
> 
> The idea of this proposal is that every OpenStack project should have
> "story" wiki page. It means to publish every week one short message that
> contains most interesting updates for the last week, and high level road
> map for future week. So reading this for 10-15 minutes you can see what
> changed in project, and get better understanding of high level road map of
> the project.

I like the idea, but I don't like having _more_ wiki pages.

I think the weekly IRC meeting would be a good place for this to be
maintained.

We can have an agenda item "Updates". Before the meeting, people can add
any items and the chair can paste those in. Then any that people come up
with during the meeting can be stated by attendees.

 #topic Updates
 * Core reviewer added: foo-person, congratulations!
 * Completed py3k fixes for python-barclient

This way the updates are sent along with any other relevant discussions
from the meeting, and subscribers can still just follow pages they're
already used to following.

Meanwhile, the Updates topic can be automatically extracted from the
meeting logs and highlighted in a special section of the main project
wiki page. Perhaps the same automation can maintain a page which includes
all of the other projects' updates as a one-stop shop.

If people like this idea and want to try it out I'd be happy to throw
together a script to do the log extraction.

Anyway, less manual == more fun == more engagement.
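For the curious, a rough sketch of the sort of log extraction I have in
mind, assuming meetbot-style logs where a "#topic Updates" line starts
the section and "*" lines carry the individual updates:

import re
import sys

# Sketch: pull the bullets under "#topic Updates" out of an IRC meeting
# log read from stdin.
def extract_updates(lines):
    updates = []
    in_updates = False
    for line in lines:
        text = line.strip()
        if re.search(r'#topic\s+updates', text, re.IGNORECASE):
            in_updates = True
            continue
        if in_updates and '#topic' in text:
            break  # the next topic closes the section
        if in_updates and '*' in text:
            updates.append(text[text.index('*'):])
    return updates

if __name__ == '__main__':
    for update in extract_updates(sys.stdin):
        print(update)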

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Propose "project story wiki" idea

2013-11-21 Thread Clint Byrum
Excerpts from Boris Pavlovic's message of 2013-11-21 00:16:04 -0800:
> Clint,
> 
> The main idea is to have processed by human history of project.
> 
> It is really impossible to aggregate automatically all data from different
> sources:
> IRC (main project chat/dev chat/meetings), Mailing Lists, Code, Reviews,
> Summit discussions, using project-specific knowledge and history of the
> project. To get short messages like here:
> https://wiki.openstack.org/wiki/Rally/Updates
> 
> So the idea is that in each project we should have the persons that will
> aggregate for others all these sources and present really short, high level
> view of situation. And these messages should be in one place (wiki/or other
> platform (not mailing lists)) for project. So we will be able quick to get
> what happens with project for last few months and what are current goals.
> This will be also very very useful for new contributors.
> 
> So Aggregation of data is good (and should be done), but it is not enough..
> 

I did not suggest aggregation of data. We have TONS of that, and we
don't need more.

I suggested a very simple way for project leaders and members to maintain
the current story during the meetings.

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Introducing the new OpenStack service for Containers

2013-11-22 Thread Clint Byrum
Excerpts from Thierry Carrez's message of 2013-11-22 01:50:39 -0800:
> Tim Bell wrote:
> > Can we make sure that the costs for the end users are also considered as
> > part of this ?
> > 
> > -  Configuration management will need further modules
> > -  Dashboard confusion as we get multiple tabs
> > -  Accounting, Block Storage, Networking, Orchestration
> > confusion as the concepts diverge
> > 
> > Is this really a good idea to create another project considering the
> > needs of the whole openstack community ?
> 
> Personally, it will have to prove a really different API and set of use
> cases to justify the cost of a separate project. I'm waiting to see what
> they come up with, but IMHO it's "compute" in both cases. We've seen
> with the libvirt-sandbox discussion that using technology (hypervisor
> vs. container) to draw the line between the use cases is a bit
> over-simplifying the problem.
> 

Agreed, I think it has been oversimplified, but that is what you do
when you're not driven by a well-understood, real use case, something I
have yet to see in this discussion.

> I don't really want us to create a container service and end up
> implementing the same hypervisor backends than in Nova, just because
> hypervisors can perfectly also be used to serve lightweight
> application-centric workloads. Or the other way around (keep Docker
> support in Nova since you can perfectly run an OS in a container). At
> first glance, extending the Nova API to also cover lightweight
> app-centric use cases would avoid a ton of duplication...
> 

Agreed. There are a few weird things that come to mind though. One of
those is that I imagine users would like to do something like this:

host_id=$(container-thing allocate-host --flavor small appserver)
db_host_id=$(container-thing allocate-host --flavor huge dbserver)
app_id=$(container-thing run --host $host_id --image app-image)
proxy_id=$(container-thing run --host $host_id --image proxy-image)
cache_id=$(container-thing run --host $host_id --image cache-image)
db_id=$(container-thing run --host $db_host_id --image db-image)

As in, they'd probably like to have VMs spun up and then chopped up
into containers. If this is implemented first inside nova, that may end
up being a rat's nest and hard to separate later. The temptation to use
private APIs is really strong. But if it is outside nova, the separation
stays clear and the two can be used without one-another very easily.

> If the main concern is to keep Nova small and manageable, I'd rather rip
> out pieces like nova-network or the scheduler, which are clearly not
> "compute".
> 

Indeed, and those things are under way. :)

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Solum] Working group on language packs

2013-11-22 Thread Clint Byrum
Excerpts from Clayton Coleman's message of 2013-11-22 21:43:40 -0800:
> 
> > On Nov 22, 2013, at 9:54 PM, Monty Taylor  wrote:
> > 
> > 
> > 
> >> On 11/22/2013 11:34 AM, Clayton Coleman wrote:
> >> I have updated the language pack (name subject to change) blueprint
> >> with the outcomes from the face2face meetings, and drafted a
> >> specification that captures the discussion so far.  The spec is
> >> centered around the core idea of transitioning base images into
> >> deployable images (that can be stored in Nova and sent to Glance).
> >> These are *DRAFT* and are intended for public debate.
> >> 
> >> https://blueprints.launchpad.net/solum/+spec/lang-pack 
> >> https://wiki.openstack.org/wiki/Solum/FeatureBlueprints/BuildingSourceIntoDeploymentArtifacts
> >> 
> >> Please take this opportunity to review these documents and offer
> >> criticism and critique via the ML - I will schedule a follow up deep
> >> dive for those who expressed interest in participation [1] after US
> >> Thanksgiving.
> > 
> > Hi!
> > 
> > I'd strongly suggest looking at the diskimage-builder project that's
> > part of the TripleO program. Someone has already done a POC of turning
> > it in to an aaS, and there are already people working on tying
> > diskimage-builder elements and heat templates. Given that OpenStack has
> > prior art and work in this direction, you should be able to accelerate
> > getting to your goals pretty quickly.
> 
> diskimage-builder is definitely a primary tool choice for an openstack 
> deployer creating vm images, and should certainly be promoted where possible. 
>  I'll add an example to the doc.
> 
> The spec does try to be agnostic to the actual image creation technology in 
> play - organizations using containers or Windows images may have alternative 
> preferences about the underlying mechanism by which they generate images.  
> Decoupling the image creation from how the image is used is a key goal, 
> especially since organizations often want to separate environment preparation 
> and development along role lines.

Windows images are special, yes. For those, perhaps chat with the Murano
folk?

Containers will work fine in diskimage-builder. One only needs to hack
in the ability to save in the container image format rather than qcow2.

I actually think diskimage-builder would be really useful for container
building, as it doesn't make any assumptions about things like having a
kernel. In fact we've discussed the possibility of using lxc to do the
image builds instead of chroot so that the builds would be more
isolated from the build host.

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Solum] git-Integration working group

2013-11-24 Thread Clint Byrum
Excerpts from Adrian Otto's message of 2013-11-22 18:51:16 -0800:
> Monty,
> 
> On Nov 22, 2013, at 6:24 PM, Monty Taylor 
>  wrote:
> 
> > On 11/22/2013 05:37 PM, Krishna Raman wrote:
> >> Hello all,
> >> 
> >> I would like to kickoff the Git integration discussion. Goal of this
> >> subgroup is to go through the git-integration blueprint [1] and break it
> >> up into smaller blueprints that we can execute on.
> >> 
> >> We have to consider 2 workflows:
> >>   1) For Milestone 1, pull based git workflow where user uses a public
> >> git repository (possibly on github) to trigger the build
> >>   2) For later milestones, a push based workflow where the git
> >> repository is maintained by Solum
> > 
> > Hi!
> 
> Hi, thanks for chiming in here.
> 
> > I'm a little disappointed that we've decided to base the initial
> > workflow on something that is not related to the world-class git-based
> > developer tooling that the OpenStack project has already produced. We
> > have a GIANT amount of tooling in this space, and it's all quite
> > scalable. There is also the intent by 3 or 4 different groups to make it
> > more re-usable/re-consumable, including thoughts in making sure that we
> > can drive it from and have it consume heat.
> 
> The initial work will be something pretty trivial. It's just a web hook on a 
> git push. The workflow in this case is not customizable, and has basically no 
> features. The intent is to iterate on this to make it much more compelling 
> over time, soon after the minimum integration, we will put a real workflow 
> system in place. We did discuss Zuul and Nodepool, and nobody had any 
> objection to learning more about those. This might be a bit early in our 
> roadmap to be pulling them in, but if there is an easy way to use them early 
> in our development, we'd like to explore that.
> 

Zuul and nodepool exist to optimize large-scale testing. git-review
and gerrit, on the other hand, are the frontend that it sounds like
this trivial "just a push" process would try to replace. I don't think
it is wise to ignore the success of those two pieces of the OpenStack
infrastructure. If what you're doing is only ever going to be a simple
push, so be it. However, I doubt that it will remain so simple.

Is there some reason _not_ to just consume these as-is?

> >> Devdatta has created 2 blueprints for consideration: [2] [3]
> >> 
> >> I have set up a doodle to poll for a /recurring/ meeting time for this
> >> subgroup: http://doodle.com/7wypkzqe9wep3d33#table   (Timezone support
> >> is enabled)
> >> 
> >> Currently the plan is to try G+ hangouts to run this meetings and scribe
> >> on #solum. This will limit us to a
> >> max of 10 participants. If we have more interest, we will need to see
> >> how to change the meetings.
> > 
> > We have IRC meeting channels for meetings. They are logged - and they
> > have the benefit that they do not require non-Open Source software to
> > access. If you have them in IRC, the team from OpenStack who is already
> > working on developer workflow around git can potentially participate.
> > 
> > I don't mean to be negative, but if you want to be a PaaS for OpenStack,
> > I would strongly consider not using G+ when we have IRC, and I would
> > strongly suggest engaging with the Infra projects that already know how
> > to do git-based workflow and action triggering.
> 
> We just finished holding the Solum Community Design Workshop in San 
> Francisco. We had both irc and G+ in addition to etherpad for shared 
> notetaking. What we found is that that collaboration was faster and more 
> effective when we used the G+ tool. The remote participants had a strong 
> preference for it, and requested that we use it for the breakout meetings as 
> well. The breakout design meetings will have a scribe who will transcribe the 
> interaction in IRC so it will also be logged.
> 

We struggled with this in Ubuntu as well. Ultimately, our fine friends at
Google have created what seems to be one of the most intuitive distributed
collaboration tools the world has ever seen. I think Monty is right,
and that we should strive to use the tools the rest of OpenStack uses
whenever possible, and we should strive to be 100% self-hosted.

However, I do think G+ at times has enough benefit to outweigh the fact
that it is not free and we can't really host our own.

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Unwedging the gate

2013-11-25 Thread Clint Byrum
Excerpts from Joe Gordon's message of 2013-11-24 21:00:58 -0800:
> Hi All,
> 
> TL;DR Last week the gate got wedged on nondeterministic failures. Unwedging
> the gate required drastic actions to fix bugs.
> 
> 


(great write-up, thank you for the details, and thank you for fixing
it!)

> 
> Now that we have the gate back into working order, we are working on the
> next steps to prevent this from happening again.  The two most immediate
> changes are:
> 
>- Doing a better job of triaging gate bugs  (
>
> http://lists.openstack.org/pipermail/openstack-dev/2013-November/020048.html
> ).
> 
> 
>- In the next few days we will remove  'reverify no bug' (although you
>will still be able to run 'reverify bug x'.
> 

I am curious, why not also disable 'recheck no bug'?

I see this as a failure of bug triage. A bug that has more than 1
recheck/reverify attached to it is worth a developer's time. The data
gathered through so many test runs is invaluable when chasing races like
the ones that cause these intermittent failures. If every core dev of
every project spent 10 working minutes every day looking at the rechecks
page to see if there is an untriaged recheck there, or just triaging bugs
in general, I suspect we'd fix these a lot quicker.

I do wonder if we would be able to commit enough resources to just run
two copies of the gate in parallel each time and require both to pass.
Doubling the odds* that we will catch an intermittent failure seems like
something that might be worth doubling the compute resources used by
the gate.

*I suck at math. Probably isn't doubling the odds. Sounds
good though. ;)
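For the record, the arithmetic isn't far off: if a given intermittent
failure trips a single run with probability p, two independent runs catch
it with probability 1 - (1 - p)^2 = 2p - p^2, which is close to double
while p is small. A throwaway check:

# Chance that at least one of two independent gate runs hits a failure
# that occurs with probability p on a single run.
for p in (0.01, 0.05, 0.25):
    caught = 1 - (1 - p) ** 2       # = 2p - p^2
    print(p, round(caught, 4), round(caught / p, 2))  # ratio vs. one run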

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Unwedging the gate

2013-11-25 Thread Clint Byrum
Excerpts from Robert Collins's message of 2013-11-25 01:30:11 -0800:
> On 25 November 2013 22:23, Clint Byrum  wrote:
> 
> > I do wonder if we would be able to commit enough resources to just run
> > two copies of the gate in parallel each time and require both to pass.
> > Doubling the odds* that we will catch an intermittent failure seems like
> > something that might be worth doubling the compute resources used by
> > the gate.
> >
> > *I suck at math. Probably isn't doubling the odds. Sounds
> > good though. ;)
> 
> We already run the code paths that were breaking 8 or more times.
> Hundreds of times in fact for some :(.
> 
> The odds of a broken path triggering after it gets through, assuming
> each time we exercise it is equally likely to show it, are roughly
> 3/times-exercised-in-landing. E.g. if we run a code path 300 times and
> it doesn't show up, then it's quite possible that it has a 1%
> incidence rate.

We don't run through the same circumstances 300 times. We may pass
through individual code paths that have a race condition 300 times, but
the circumstances are probably only right for failure in 1 or 2 of them.

A 1% overall rate, then, matters less than how often the path fails when
the conditions for failure are optimal. If we can increase the occurrences
of the most likely failure conditions, then we do have a better chance
of catching the failure.

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Unwedging the gate

2013-11-25 Thread Clint Byrum
Excerpts from Monty Taylor's message of 2013-11-25 06:52:02 -0800:
> 
> On 11/25/2013 04:23 AM, Clint Byrum wrote:
> > Excerpts from Joe Gordon's message of 2013-11-24 21:00:58 -0800:
> >> Hi All,
> >>
> >> TL;DR Last week the gate got wedged on nondeterministic failures. Unwedging
> >> the gate required drastic actions to fix bugs.
> >>
> >>
> > 
> > 
> > (great write-up, thank you for the details, and thank you for fixing
> > it!)
> > 
> >>
> >> Now that we have the gate back into working order, we are working on the
> >> next steps to prevent this from happening again.  The two most immediate
> >> changes are:
> >>
> >>- Doing a better job of triaging gate bugs  (
> >>
> >> http://lists.openstack.org/pipermail/openstack-dev/2013-November/020048.html
> >> ).
> >>
> >>
> >>- In the next few days we will remove  'reverify no bug' (although you
> >>will still be able to run 'reverify bug x'.
> >>
> > 
> > I am curious, why not also disable 'recheck no bug'?
> 
> recheck no bug still has a host of valid use cases. Often times I use it
> when I upload a patch, it fails because of a thing somewhere else, we
> fix that, and I need to recheck the patch because it should work now.
> 
> It's also not nearly as dangerous as reverify no bug.
> 

"...somewhere else, we fix that..." -- Would it be useful to track that
in a bug? Would that help elastic-recheck work better if all the problems
caused by a bug elsewhere were reported as bugs?

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [heat][horizon]Heat UI related requirements & roadmap

2013-11-25 Thread Clint Byrum
Excerpts from Tim Schnell's message of 2013-11-25 14:51:39 -0800:
> Hi Steve,
> 
> As one of the UI developers driving the requirements behind these new
> blueprints I wanted to take a moment to assure you and the rest of the
> Openstack community that the primary purpose of pushing these requirements
> out to the community is to help improve the User Experience for Heat for
> everyone. Every major UI feature that I have implemented for Heat has been
> included in Horizon, see the Heat Topology, and these requirements should
> improve the value of Heat, regardless of the UI.
> 
> 
> Stack/template metadata
> We have a fundamental need to have the ability to reference some
> additional metadata about a template that Heat does not care about. There
> are many possible use cases for this need but the primary point is that we
> need a place in the template where we can iterate on the schema of the
> metadata without going through a lengthy design review. As far as I know,
> we are the only team attempting to actually productize Heat at the moment
> and this means that we are encountering requirements and requests that do
> not affect Heat directly but simply require Heat to allow a little wiggle
> room to flesh out a great user experience.
> 

Wiggle room is indeed provided. But reviewers need to understand your
motivations, which is usually what blueprints are used for. If you're
getting pushback, it is likely because your blueprints do not make the
use cases and long-term vision obvious.

> There is precedence for an optional metadata section that can contain any
> end-user data in other Openstack projects and it is necessary in order to
> iterate quickly and provide value to Heat.
> 

Nobody has said you can't have meta-data on stacks, which is what other
projects use.

> There are many use cases that can be discussed here, but I wanted to
> reiterate an initial discussion point that, by definition,
> "stack/template_metadata" does not have any hard requirements in terms of
> schema or what does or does not belong in it.
> 
> One of the initial use cases is to allow template authors to categorize
> the template as a specific "type".
> 
> template_metadata:
> short_description: Wordpress
> 
> 

Interesting. Would you support adding a "category" keyword to Python so
we don't have to put it in setup.cfg, and so that the egg format doesn't
need that section? PyPI can just parse the Python to categorize the apps
when they're uploaded. We could also have a file on disk for qcow2 images
that we upload to glance that will define the meta-data.

To be more direct, I don't think the templates themselves are where this
meta-data belongs. A template is self-aware by definition; it doesn't
need a global metadata section to tell it that it is WordPress. For
anything else that needs to be globally referenced there are parameters.
Having less defined inside the template means that you get _more_ wiggle
room for your template repository.

I 100% support having a template catalog. IMO it should be glance,
which is our catalog service in OpenStack. Who cares whether nova or heat
is consuming images or templates? It is just sharable blobs of data and
meta-data in a highly scalable service. It already has the concept of
global and tenant scope. It just needs an image type of 'hot', and then
heat can start consuming templates from glance. The template authors
should maintain some packaging meta-data in glance to communicate to
users that this is "Wordpress" and "Single-Node". If Glance's meta-data
is too limiting, expand it! I'm sure image authors and consumers would
appreciate that.
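As a rough sketch of what publishing a template into glance could look
like with the existing v1 python-glanceclient API; the endpoint and
token are placeholders, and the 'hot' hint and property names are
hypothetical choices, not anything glance defines today:

from glanceclient import Client

# Placeholder endpoint/token; in practice these come from keystone.
glance = Client('1', 'http://glance.example.com:9292', token='TOKEN')

glance.images.create(
    name='wordpress-single-node',
    disk_format='raw',
    container_format='bare',
    is_public=True,
    # Made-up packaging meta-data, kept outside the template itself.
    properties={'template_format': 'hot',
                'application': 'Wordpress',
                'topology': 'single-node'},
    data=open('wordpress.yaml', 'rb'))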

> This would let the client of the Heat API group the templates by type
> which would create a better user experience when selecting or managing
> templates. The the end-user could select "Wordpress" and drill down
> further to select templates with different options, "single node", "2 web
> nodes", etc...
> 

That is all API stuff, not language stuff.

> Once a feature has consistently proven that it adds value to Heat or
> Horizon, then I would suggest that we can discuss the schema for that
> feature and codify it then.
> 
> In order to keep the discussion simple, I am only responding to the need
> for stack/template metadata at the moment but I'm sure discussions on the
> management api and template catalog will follow.
> 

Your example puts the template catalog in front of this feature, and I
think that exposes this feature as misguided.

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [all project] Treating recently seen recheck bugs as critical across the board

2013-11-26 Thread Clint Byrum
Excerpts from Thierry Carrez's message of 2013-11-26 03:23:51 -0800:
> Dolph Mathews wrote:
> > On Mon, Nov 25, 2013 at 8:12 PM, Robert Collins
> > mailto:robe...@robertcollins.net>> wrote:
> > 
> > So my proposal is that we make it part of the base hygiene for a
> > project that any recheck bugs being seen (either by elastic-recheck or
> > manual inspection) be considered critical and prioritised above
> > feature work.
> > 
> > I agree with the notion here (that fixing transient failures is
> > critically high priority work for the community) -- but marking the bug
> > as "critical" priority is just a subjective abuse of the priority field.
> > A non-critical bug is not necessarily non-critical work. The "critical"
> > status should be reserved for issues that are actually non-shippable,
> > catastrophically breaking issues.
> 
> It's a classic bugtracking dilemma where the "Importance" field is both
> used to describe bug impact and priority... while they don't always match.
> 

If I'm on the fence between one importance level and the other, I look
at the bug lists for both:

For instance, there are currently 122 High importance bugs in Nova, and 6
Critical bugs.

If we are comfortable with developers choosing to fix all 122 of the other
High bugs before this bug, then make it High. If not, make it Critical.
Likewise, if we are uncomfortable with this bug being chosen before any
of the 6 Critical bugs, then make it High.

I realize those two choices could make a person uncomfortable and wish for
something in-between like "Hitical" or "Criticigh", but micro-management
is no way to actually get things done, and it only takes a few seconds
to reprioritize as we add insight and data over time.

> That said, the "impact" of those bugs, considering potential development
> activity breakage, *is* quite critical (they all are timebombs which
> will create future gate fails if not handled at top priority).
> 
> So I think marking them Critical + tagging them is not that much of an
> abuse, if we start including the gate impact in our bug Impact
> assessments. That said, I'm also fine with High+Tag, as long as it
> triggers the appropriate fast response everywhere.
> 

IMO the tags are a distraction to triage. Critical or High is enough of
a conundrum to resolve. The tags will certainly help guide trackers, and
they should add them, but the person doing triage should mostly focus on
"will the patient die?" type questions.

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [heat][horizon]Heat UI related requirements & roadmap

2013-11-26 Thread Clint Byrum
Excerpts from Tim Schnell's message of 2013-11-26 13:24:22 -0800:
> So the originally question that I attempted to pose was, "Can we add a
> schema-less metadata section to the template that can be used for a
> variety of purposes?". It looks like the answer is no, we need to discuss
> the features that would go in the metadata section and add them to the HOT
> specification if they are viable. I don't necessarily agree with this
> answer but I accept it as viable and take responsibility for the
> long-winded process that it took to get to this point.
> 
> I think some valid points have been made and I have re-focused my efforts
> into the following proposed solution.
> 
> I am fine with getting rid of the concept of a schema-less metadata
> section. If we can arrive at a workable design for a few use cases then I
> think that we won't need to discuss any of the options that Zane mentioned
> for handling the metadata section, comments, separate file, or in the
> template body.
> 
> Use Case #1
> I see valid value in being able to group templates based on a type or
> keyword. This would allow any client, Horizon or a Template Catalog
> service, to better organize and handle display options for an end-user.
> 
> I believe that Ladislav initially proposed a solution that will work here.
> So I will second a proposal that we add a new top-level field to the HOT
> specification called "keywords" that contains this template type.
> 
> keywords: wordpress, mysql, etc...
> 

What is the "use case" here. For me, a use case needs to be specific to
be useful at helping to guide more generic feature design. What I see
above is a generic feature specification with no actual concrete need
for it. These keywords should be in the template description, which
should be full-text searchable anyway. "Group templates based on a type
or keyword." could mean a lot of things to a lot of different classes
of Heat users.

> 
> Use Case #2
> The template author should also be able to explicitly define a help string
> that is distinct and separate from the description of an individual
> parameter. An example where this use case originated was with Nova
> Keypairs. The description of a keypair parameter might be something like,
> "This is the name of a nova key pair that will be used to ssh to the
> compute instance." A help string for this same parameter would be, "To
> learn more about nova keypairs click on this help article."
> 
> I propose adding an additional field to the parameter definition:
> 
> Parameters:
> <param_name>:
> description: This is the name of a nova key pair that will be used
> to ssh to the compute instance.
> help: To learn more about nova keypairs click on this
> <a href="/some/url/">help article</a>.
>

+1. A help string per parameter is a fantastic idea. Sounds like a nice
analog to doc-strings.

> Use Case #3
> Grouping parameters would help the client make smarter decisions about how
> to display the parameters for input to the end-user. This is so that all
> parameters related to some database resource can be intelligently grouped
> together. In addition to grouping these parameters together, there should
> be a method to ensuring that the order within the group of parameters can
> be explicitly stated. This way, the client can return a group of database
> parameters and the template author can indicate that the database instance
> name should be first, then the username, then the password, instead of
> that group being returned in a random order.
> 
> Parameters:
> db_name:
> group: db
> order: 0
> db_username:
> group: db
> order: 1
> db_password:
> group: db
> order: 2
> web_node_name:
> group: web_node
> order: 0
> keypair:
> group: web_node
> order: 1
>

+1 for grouping. Your use case is perfectly specified above. Love it.

However, your format feels very forced. Integers for ordering? We have
lists for that.

How about this:

parameters:
  db:
type: group
parameters:
  - db_name
  - db_username
  - db_password
  web_node:
type: group
parameters:
  - web_node_name
  - keypair
  db_name:
type: string
  ...


This way we are just providing the groupings outside of the parameters
themselves. Using a list means the order is as it appears here.

Another option is to allow specifying whole parameters in-line there,
like

parameters:
  db:
type: group
parameters:
  - db_name:
type: string

But that opens up a can of worms that we can leave closed for now and
just do the above.

> These are the use cases that have been clearly defined. The original
> purpose of the metadata section was to "future-proof" (I say future-proof,
> you say pre-optimize ;) ) rapid iterations to potential client design. The
> intent of the strategy was so that we did not overburden the

Re: [openstack-dev] [heat][horizon]Heat UI related requirements & roadmap

2013-11-26 Thread Clint Byrum
Excerpts from Ladislav Smola's message of 2013-11-26 06:04:12 -0800:
> Hello,
> 
> seems too big to do the inline comments, so just a few notes here:
> 
> If we truly want to have Templates portable, it would mean to have the 
> 'metadata' somehow standardised, right?

Can you point to another portable application distribution system where
meta-data about the applications has been standardized in the language
itself? Can you give us data on how successful that is?

I can point to plenty where the meta-data is specified outside the
application itself, because it makes sense that these are two different
concerns.

> Otherwise if every UI will add their own metadata, then I hardly see the 
> templates as portable. I think first step would be then to delete
> the metadata and add your own, unless you are fine to have 80% of the 
> template some metadata you don't use. That also won't
> help the readability. What will help a readability are the verbose comments.
>

Actually that is just the thing. Keeping them portable means making them
useful in different deployments without changing them. Deployments are
not all going to want to organize things by the same set of tags.

Consider a private cloud used for deploying IT apps internally. They may
want to tag SugarCRM as 'sales' and OpenERP as 'manufacturing'. These
are separate concerns, so they should not be in the same template.

What _would_ be useful would be curation of those packaged Heat apps
into a generally useful "default" repository for Heat deployers to take
advantage of. A pypi for Heat, if you will. That would help to
homogenize things and prevent wasteful forking of the most generic
templates. But the decisions of those curators should not be set in
stone for all deployers.

> I am not really sure how long it can take to add new specialized tags, 
> that are used only in Horizon and are well documented. I think showing
> this, should get the patch merged very quickly. That seems to me like a 
> portable solution.
> 
> IMO for the template catalogue we would probably need a new service, 
> something like Glance, so that's probably a more distant future.
>

Why not just use glance? Where does this belief come from that it would
be hard to add what is basically a single new image type to Glance? I
understand SOA and the need to separate things, but "registry of blobs
of data" already has a service.

If people want to expose a user-uploadable HOT glance but not
a user-uploadable image glance, then have two endpoints, and two
glances. But don't write a whole new piece of software just because
modifying an existing one seems hard.

> For the use-cases:
> ---
> 
> ad 1)
> Something more general next to Description may be more useful, like 
> keywords, packages or components.
> Example:
> 
> Description...
> Keywords: wordpress, mysql...
> 
> Or you could parse it from e.g. packages (though that is not always 
> used, so being able to write it explicitly might be handy)
>

If you take out the catalog though, this has no place in the template.

> ad 2)
> Maybe adding something like 'author' tag may be a good idea, though you 
> can find all the history in git repo,
> given you use https://github.com/openstack/heat-templates . If you have 
> different repo, adding something like
> Origin: https://github.com/openstack/heat-templates maybe?
> 

Both of those things work great if you make git repo+meta-data file the
packaging format.

Things like Origin: get out of date really fast in my experience because
they are informational and not actually important to the use of the
package/code/etc.

> ad 3)
> So having a fix and documented schema seems to be a good way to be 
> portable, at least to me. I am not
> against UI only tags inside the template, that are really useful for 
> everybody. We will find out by collectively
> reviewing that, which usually brings some easier solution.
> 

UI only tags are awesome, and I like them a lot.

> Or you don't think, it will get too wild to have some 'metadata' section 
> completely ignored by Heat? Seems
> to me like there will be a lot of cases, when people won't push their 
> template to upstream, because of the
> metadata they have added to their templates, that nobody else will ever 
> use. Is somebody else concerned
> about this?

I actually think such a section will be abused heavily to have different
templates per deployment. Keeping this stuff out of the template means
deployers who need to arrange things differently can consume the
template without modifying it.

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [heat][horizon]Heat UI related requirements & roadmap

2013-11-27 Thread Clint Byrum
Excerpts from Zane Bitter's message of 2013-11-27 08:09:33 -0800:
> In the longer term, there seems to be a lot of demand for some sort of 
> template catalog service, like Glance for templates. (I disagree with 
> Clint that it should actually _be_ Glance the project as we know it, for 
> the reasons Steve B mentioned earlier, but the concept is right.) And 
> this brings us back to a very similar situation to the operator-provided 
> template catalog (indeed, that use case would likely be subsumed by this 
> one).
> 

Could you provide a stronger link to Steve B's comments? I think I
missed them. Thanks!

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [heat][horizon]Heat UI related requirements & roadmap

2013-11-27 Thread Clint Byrum
Excerpts from Tim Schnell's message of 2013-11-27 09:16:24 -0800:
> 
> On 11/27/13 10:09 AM, "Zane Bitter"  wrote:
> 
> >On 26/11/13 22:24, Tim Schnell wrote:
> >> I propose adding an additional field to the parameter definition:
> >> 
> >> Parameters:
> >> <param_name>:
> >> description: This is the name of a nova key pair that will be used
> >> to ssh to the compute instance.
> >> help: To learn more about nova key pairs click on this
> >> <a href="/some/url/">help article</a>.
> >
> >(Side note: you're seriously going to let users stick HTML in the
> >template and then have the dashboard display it?  Yikes.)
> 
> FWIW, I said the exact same thing to Keith Bray and his answer was, "why
> not?"
> 

Because it is a cross-site scripting problem. You are now allowing users
to publish HTML as your site. If you can guarantee that users will only
ever be shown their own template help, then it is OK. But that seems
like an unlikely guarantee.

Just use Markdown; it has become the standard for these things.
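For what it's worth, a minimal sketch of what safe rendering could look
like on the dashboard side, assuming the third-party markdown and bleach
libraries; the whitelist here is an arbitrary illustration, not a
recommendation for Horizon's actual implementation:

import bleach
import markdown

# Treat the template-supplied help string as Markdown, then sanitize the
# result down to a small whitelist so a template author cannot inject
# scripts into the dashboard.
ALLOWED_TAGS = ['a', 'code', 'em', 'strong', 'p', 'ul', 'ol', 'li']
ALLOWED_ATTRS = {'a': ['href', 'title']}

def render_help(help_text):
    html = markdown.markdown(help_text)
    return bleach.clean(html, tags=ALLOWED_TAGS, attributes=ALLOWED_ATTRS)

print(render_help(
    'To learn more, see the [nova key pairs](/some/url/) help article.'))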

> The UI is already making determinations about what HTML to generate based
> on the template. For example, the parameter label to display just
> unslugifies the parameter key. This is a somewhat tangential discussion
> though, and I do have reservations about it. Maybe Keith can jump in and
> defend this better.
> 

Generating HTML is not the same as displaying user input as HTML. There
is a rather large difference.

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [heat][horizon]Heat UI related requirements & roadmap

2013-11-27 Thread Clint Byrum
Excerpts from Fox, Kevin M's message of 2013-11-27 08:58:16 -0800:
> This use case is sort of a provenance case. Where did the stack come from so 
> I can find out more about it.
> 

This exhibits problems similar to our copyright header problems. Relying
on authors to maintain their authorship information in two places is
cumbersome, and thus the copy that is not automated will likely fall out
of sync fairly quickly.

> You could put a git commit field in the template itself but then it would be 
> hard to keep updated.
> 

Or you could have Heat able to pull from any remote source rather than
just allowing submission of the template directly. It would just be
another column in the stack record. This would allow said support person
to see where it came from by viewing the stack, which solves the use case.
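As a sketch of how that might look from a client's point of view if Heat
records a template URL with the stack; the endpoint, token, URL, and
parameters are placeholders, and this is not a claim about exactly what
today's API supports:

from heatclient.client import Client

# Placeholder endpoint/token; in practice these come from keystone.
heat = Client('1', endpoint='http://heat.example.com:8004/v1/TENANT',
              token='TOKEN')

# The stack records where the template came from, rather than the
# template carrying its own origin meta-data.
heat.stacks.create(
    stack_name='wordpress',
    template_url='https://example.com/templates/wordpress.yaml',
    parameters={'key_name': 'default'})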

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [heat][horizon]Heat UI related requirements & roadmap

2013-11-27 Thread Clint Byrum
Excerpts from Jay Pipes's message of 2013-11-27 10:57:56 -0800:
> On 11/27/2013 01:28 AM, Clint Byrum wrote:
> >> I propose adding an additional field to the parameter definition:
> >>
> >>  Parameters:
> >>  <param_name>:
> >>  description: This is the name of a nova key pair that will be used
> >>  to ssh to the compute instance.
> >>  help: To learn more about nova key pairs click on this
> >>  <a href="/some/url/">help article</a>.
> >
> > +1. A help string per parameter is a fantastic idea. Sounds like a nice
> > analog to doc-strings.
> 
> Agreed. The above is a nice, straightforward addition.
> 
> 
> 
> > +1 for grouping. Your use case is perfectly specified above. Love it.
> >
> > However, your format feels very forced. Integers for ordering? We have
> > lists for that.
> >
> > How about this:
> >
> > parameters:
> >   db:
> >     type: group
> >     parameters:
> >       - db_name
> >       - db_username
> >       - db_password
> >   web_node:
> >     type: group
> >     parameters:
> >       - web_node_name
> >       - keypair
> >   db_name:
> >     type: string
> >   ...
> >
> > This way we are just providing the groupings outside of the parameters
> > themselves. Using a list means the order is as it appears here.
> 
> Actually, even simpler than that...
> 
> parameters:
>   db:
>     - db_name:
>         description: blah
>         help: blah
>     - db_username:
>         description: blah
>         help: blah
> 
> After all, can't we assume that if the parameter value is a list, then 
> it is a group of parameters?

+1
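
To make the consumption side concrete, a UI could walk a structure like that
with something along these lines (a rough sketch only, not Horizon code; the
dict just mirrors the grouping format proposed above):

    template_parameters = {
        'db': {'type': 'group',
               'parameters': ['db_name', 'db_username']},
        'web_node': {'type': 'group',
                     'parameters': ['keypair']},
        'db_name': {'type': 'string', 'description': 'Database name'},
        'db_username': {'type': 'string', 'description': 'Database user'},
        'keypair': {'type': 'string', 'description': 'Nova key pair'},
    }

    def form_sections(parameters):
        # Each group becomes a form section; the order of fields inside a
        # section is the order of the list in the template.
        for name, defn in parameters.items():
            if defn.get('type') == 'group':
                yield name, [(p, parameters[p]) for p in defn['parameters']]

    for section, fields in form_sections(template_parameters):
        print(section)
        for field_name, field in fields:
            print('  %s: %s' % (field_name, field.get('description', '')))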



Re: [openstack-dev] [Keystone][Oslo] Future of Key Distribution Server, Trusted Messaging

2013-11-30 Thread Clint Byrum
Excerpts from Adam Young's message of 2013-11-25 20:25:50 -0800:
> Back in the Day, Barbican was just one Service of Cloud Keep.  While I 
> would say that KDS belongs in the Cloud Keep, it is not the same as, and 
> should not be deployed with Barbican.  Is it possible to keep them as 
> separate services?  I think that is the right way to go.  Barbican is 
> for the  end users of Cloud, but KDS is not.  Does this make sense?
> 

They're doing the same fundamental thing for two different sets of users
with two overlapping use cases. Why would we implement two KDS services
for this?

I also don't like that the discussions suggested that because it would
be hard to get Barbican incubated/integrated it should not be used. That
is just crazy talk. TripleO merged with Tuskar because Tuskar is part of
deployment. 

Seems to me that pulling Barbican into the identity _program_, but still
as its own project/repo/etc. would solve that problem.



Re: [openstack-dev] [heat][hadoop][template] Does anyone has a hadoop template

2013-11-30 Thread Clint Byrum
Excerpts from Jay Lau's message of 2013-11-28 16:48:41 -0800:
> Hi,
> 
> I'm now trying to deploy a hadoop cluster with heat, just wondering if
> someone has a heat template which can help me do the work.


Hi Jay, this is off topic for the openstack-dev mailing list. Please
re-post your question on the general OpenStack user discussion list:

http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack



Re: [openstack-dev] [Neutron] DHCP Agent Reliability

2013-12-03 Thread Clint Byrum
Excerpts from Maru Newby's message of 2013-12-03 08:08:09 -0800:
> I've been investigating a bug that is preventing VM's from receiving IP 
> addresses when a Neutron service is under high load:
> 
> https://bugs.launchpad.net/neutron/+bug/1192381
> 
> High load causes the DHCP agent's status updates to be delayed, causing the 
> Neutron service to assume that the agent is down.  This results in the 
> Neutron service not sending notifications of port addition to the DHCP agent. 
>  At present, the notifications are simply dropped.  A simple fix is to send 
> notifications regardless of agent status.  Does anybody have any objections 
> to this stop-gap approach?  I'm not clear on the implications of sending 
> notifications to agents that are down, but I'm hoping for a simple fix that 
> can be backported to both havana and grizzly (yes, this bug has been with us 
> that long).
> 
> Fixing this problem for real, though, will likely be more involved.  The 
> proposal to replace the current wsgi framework with Pecan may increase the 
> Neutron service's scalability, but should we continue to use a 'fire and 
> forget' approach to notification?  Being able to track the success or failure 
> of a given action outside of the logs would seem pretty important, and allow 
> for more effective coordination with Nova than is currently possible.
> 

Dropping requests without triggering a user-visible error is a pretty
serious problem. You didn't mention if you have filed a bug about that.
If not, please do or let us know here so we can investigate and file
a bug.

It seems to me that they should be put into a queue to be retried.
Sending the notifications blindly is almost as bad as dropping them,
as you have no idea if the agent is alive or not.
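
By "a queue to be retried" I mean nothing fancier than this (a rough sketch
with invented names, not a patch proposal):

    import collections
    import time

    class NotificationRetryQueue(object):
        """Park notifications for agents that look down, instead of
        dropping them, and replay them when the agent checks in again."""

        def __init__(self, ttl=300):
            self.ttl = ttl
            self.pending = collections.defaultdict(collections.deque)

        def defer(self, agent_id, payload):
            self.pending[agent_id].append((time.time(), payload))

        def replay(self, agent_id, send):
            # Called when a heartbeat from the agent is seen again.
            for queued_at, payload in self.pending.pop(agent_id, []):
                if time.time() - queued_at < self.ttl:
                    send(payload)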



Re: [openstack-dev] [Neutron] DHCP Agent Reliability

2013-12-03 Thread Clint Byrum
Excerpts from Maru Newby's message of 2013-12-03 19:37:19 -0800:
> 
> On Dec 4, 2013, at 11:57 AM, Clint Byrum  wrote:
> 
> > Excerpts from Maru Newby's message of 2013-12-03 08:08:09 -0800:
> >> I've been investigating a bug that is preventing VM's from receiving IP 
> >> addresses when a Neutron service is under high load:
> >> 
> >> https://bugs.launchpad.net/neutron/+bug/1192381
> >> 
> >> High load causes the DHCP agent's status updates to be delayed, causing 
> >> the Neutron service to assume that the agent is down.  This results in the 
> >> Neutron service not sending notifications of port addition to the DHCP 
> >> agent.  At present, the notifications are simply dropped.  A simple fix is 
> >> to send notifications regardless of agent status.  Does anybody have any 
> >> objections to this stop-gap approach?  I'm not clear on the implications 
> >> of sending notifications to agents that are down, but I'm hoping for a 
> >> simple fix that can be backported to both havana and grizzly (yes, this 
> >> bug has been with us that long).
> >> 
> >> Fixing this problem for real, though, will likely be more involved.  The 
> >> proposal to replace the current wsgi framework with Pecan may increase the 
> >> Neutron service's scalability, but should we continue to use a 'fire and 
> >> forget' approach to notification?  Being able to track the success or 
> >> failure of a given action outside of the logs would seem pretty important, 
> >> and allow for more effective coordination with Nova than is currently 
> >> possible.
> >> 
> > 
> > Dropping requests without triggering a user-visible error is a pretty
> > serious problem. You didn't mention if you have filed a bug about that.
> > If not, please do or let us know here so we can investigate and file
> > a bug.
> 
> There is a bug linked to in the original message that I am already working 
> on.  The fact that that bug title is 'dhcp agent doesn't configure ports' 
> rather than 'dhcp notifications are silently dropped' is incidental.
> 

Good point, I suppose that one bug is enough.

> > 
> > It seems to me that they should be put into a queue to be retried.
> > Sending the notifications blindly is almost as bad as dropping them,
> > as you have no idea if the agent is alive or not.
> 
> This is more the kind of discussion I was looking for.  
> 
> In the current architecture, the Neutron service handles RPC and WSGI with a 
> single process and is prone to being overloaded such that agent heartbeats 
> can be delayed beyond the limit for the agent being declared 'down'.  Even if 
> we increased the agent timeout as Yongsheg suggests, there is no guarantee 
> that we can accurately detect whether an agent is 'live' with the current 
> architecture.  Given that amqp can ensure eventual delivery - it is a queue - 
> is sending a notification blind such a bad idea?  In the best case the agent 
> isn't really down and can process the notification.  In the worst case, the 
> agent really is down but will be brought up eventually by a deployment's 
> monitoring solution and process the notification when it returns.  What am I 
> missing? 
>

I have not looked closely into what expectations are built in to the
notification system, so I may have been off base. My understanding was
they were not necessarily guaranteed to be delivered, but if they are,
then this is fine.

> Please consider that while a good solution will track notification delivery 
> and success, we may need 2 solutions:
> 
> 1. A 'good-enough', minimally-invasive stop-gap that can be back-ported to 
> grizzly and havana.
>

I don't know why we'd backport to grizzly. But yes, if we can get a
notable jump in reliability with a clear patch, I'm all for it.

> 2. A 'best-effort' refactor that maximizes the reliability of the DHCP agent.
> 
> I'm hoping that coming up with a solution to #1 will allow us the breathing 
> room to work on #2 in this cycle.
>

Understood, I like the short term plan and think long term having more
CPU available to process more messages is a good thing, most likely in
the form of more worker processes.
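
"More worker processes" does not have to be exotic either; a pre-forked pool
along these lines is all I have in mind (a generic sketch, not the actual
Neutron service code):

    import multiprocessing

    def rpc_worker(worker_id):
        # Each worker would run its own RPC consumer loop; this is just a
        # placeholder so the sketch runs.
        print('worker %d consuming RPC messages' % worker_id)

    def spawn_workers(count=4):
        workers = [multiprocessing.Process(target=rpc_worker, args=(i,))
                   for i in range(count)]
        for w in workers:
            w.start()
        return workers

    if __name__ == '__main__':
        for w in spawn_workers():
            w.join()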



Re: [openstack-dev] creating a default for oslo config variables within a project?

2013-12-03 Thread Clint Byrum
Excerpts from Sean Dague's message of 2013-12-03 16:05:47 -0800:
> On 12/03/2013 06:13 PM, Ben Nemec wrote:
> > On 2013-12-03 17:09, Sean Dague wrote:
> >> On 12/03/2013 05:50 PM, Mark McLoughlin wrote:
> >>> On Tue, 2013-12-03 at 16:23 -0600, Ben Nemec wrote:
>  On 2013-12-03 15:56, Sean Dague wrote:
> > This cinder patch - https://review.openstack.org/#/c/48935/
> >
> > Is blocked on failing upgrade because the updated oslo lockutils won't
> > function until there is a specific configuration variable added to the
> > cinder.conf.
> >
> > That work around is proposed here -
> > https://review.openstack.org/#/c/52070/3
> >
> > However I think this is exactly the kind of forward breaks that we
> > want
> > to prevent with grenade, as cinder failing to function after a rolling
> > upgrade because a config item wasn't added is exactly the kind of pain
> > we are trying to prevent happening to ops.
> >
> > So the question is, how is this done correctly so that a default
> > can be
> > set in the cinder code for this value, and it not require a config
> > change to work?
> >>>
> >>> You're absolutely correct, in principle - if the default value for
> >>> lock_path worked for users before, we should absolutely continue to
> >>> support it.
> >>>
>  I don't know that I have a good answer on how to handle this, but for
>  context this change is the result of a nasty bug in lockutils that
>  meant
>  external locks were doing nothing if lock_path wasn't set.  Basically
>  it's something we should never have allowed in the first place.
> 
>  As far as setting this in code, it's important that all of the
>  processes
>  for a service are using the same value to avoid the same bad situation
>  we were in before.  For tests, we have a lockutils wrapper
>  (https://github.com/openstack/oslo-incubator/blob/master/openstack/common/lockutils.py#L282)
>  that sets an environment variable to address this, but that only
>  works if all of the processes are going to be spawned from within
>  the same wrapper, and I'm not sure how secure that is for production
>  deployments since it puts all of the lock files in a temporary
>  directory.
> >>>
> >>> Right, I don't think the previous default really "worked" - if you used
> >>> the default, then external locking was broken.
> >>>
> >>> I suspect most distros do set a default - I see RDO has this in its
> >>> default nova.conf:
> >>>
> >>>   lock_path = /var/lib/nova/tmp
> >>>
> >>> So, yes - this is all terrible.
> >>>
> >>> IMHO, rather than raise an exception we should log a big fat warning
> >>> about relying on the default and perhaps just treat the lock as an
> >>> in-process lock in that case ... since that's essentially what it was
> >>> before, right?
> >>
> >> So a default of lock_path = /tmp will work (FHS says that path has to be
> >> there), even if not optimal. Could we make it a default value like that
> >> instead of the current default which is null (and hence the problem).
> > 
> > IIRC, my initial fix was something similar to that, but it got shot down
> > because putting the lock files in a known world writeable location was a
> > security issue.
> > 
> > Although maybe if we put them in a subdirectory of /tmp and ensured that
> > the permissions were such that only the user running the service could
> > use that directory, it might be acceptable?  We could still log a
> > warning if we wanted.
> > 
> > This seems like it would have implications for people running services
> > on Windows too, but we can probably find a way to make that work if we
> > decide on a solution.
> 
> How is that a security issue? Are the lock files being written with some
> sensitive data in them and have g or o permissions on? The sticky bit
> (+t) on /tmp will prevent other users from deleting the file.
> 

Right, but it won't prevent users from creating a symlink with the same
name.

ln -s /var/lib/nova/instances/x/image.raw /tmp/well.known.location

Now when you do

with open('/tmp/well.known.location', 'w') as lockfile:
  lockfile.write('Stuff')

Nova has just truncated the image file and written 'Stuff' to it.

The typical solution is to use a lock directory, /var/run/yourprogram,
that has restrictive enough permissions setup for your program to have
exclusive use of it, and is created by root at boot time. That is what
the packages do now.

It would be good if everybody agreed on a default, %(nova_home)/locks
or something, but root still must set it up with the right permissions
before the program can use it. IMO that is fine, your upgrade should
include a change to your init script/systemd unit/upstart job which
ensures that it exists before starting.
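
The boot-time step itself is tiny; roughly this, whether it lives in an init
script or a helper command (the path and service user here are illustrative,
not a proposed default):

    import os
    import pwd

    def ensure_lock_dir(path='/var/run/nova/locks', user='nova'):
        # Create the directory as root with no group/other access, then
        # hand it to the service user. Nobody else can plant a symlink
        # inside it, unlike a shared /tmp.
        if not os.path.isdir(path):
            os.makedirs(path, 0o700)
        pw = pwd.getpwnam(user)
        os.chown(path, pw.pw_uid, pw.pw_gid)
        os.chmod(path, 0o700)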

We could abstract this away with a nova-manage subcommand that is intended
to be run as root but can inspect the config file. That would allow for
simpler documentation and the pack

Re: [openstack-dev] creating a default for oslo config variables within a project?

2013-12-04 Thread Clint Byrum
Excerpts from Sean Dague's message of 2013-12-04 10:51:16 -0800:
> On 12/04/2013 11:56 AM, Ben Nemec wrote:
> > On 2013-12-04 06:07, Sean Dague wrote:
> >> On 12/03/2013 11:21 PM, Clint Byrum wrote:
> >>> Excerpts from Sean Dague's message of 2013-12-03 16:05:47 -0800:
> >>>> On 12/03/2013 06:13 PM, Ben Nemec wrote:
> >>>>> On 2013-12-03 17:09, Sean Dague wrote:
> >>>>>> On 12/03/2013 05:50 PM, Mark McLoughlin wrote:
> >>>>>>> On Tue, 2013-12-03 at 16:23 -0600, Ben Nemec wrote:
> >>>>>>>> On 2013-12-03 15:56, Sean Dague wrote:
> >>>>>>>>> This cinder patch - https://review.openstack.org/#/c/48935/
> >>>>>>>>>
> >>>>>>>>> Is blocked on failing upgrade because the updated oslo
> >>>>>>>>> lockutils won't
> >>>>>>>>> function until there is a specific configuration variable added
> >>>>>>>>> to the
> >>>>>>>>> cinder.conf.
> >>>>>>>>>
> >>>>>>>>> That work around is proposed here -
> >>>>>>>>> https://review.openstack.org/#/c/52070/3
> >>>>>>>>>
> >>>>>>>>> However I think this is exactly the kind of forward breaks that we
> >>>>>>>>> want
> >>>>>>>>> to prevent with grenade, as cinder failing to function after a
> >>>>>>>>> rolling
> >>>>>>>>> upgrade because a config item wasn't added is exactly the kind
> >>>>>>>>> of pain
> >>>>>>>>> we are trying to prevent happening to ops.
> >>>>>>>>>
> >>>>>>>>> So the question is, how is this done correctly so that a default
> >>>>>>>>> can be
> >>>>>>>>> set in the cinder code for this value, and it not require a config
> >>>>>>>>> change to work?
> >>>>>>>
> >>>>>>> You're absolutely correct, in principle - if the default value for
> >>>>>>> lock_path worked for users before, we should absolutely continue to
> >>>>>>> support it.
> >>>>>>>
> >>>>>>>> I don't know that I have a good answer on how to handle this,
> >>>>>>>> but for
> >>>>>>>> context this change is the result of a nasty bug in lockutils that
> >>>>>>>> meant
> >>>>>>>> external locks were doing nothing if lock_path wasn't set. 
> >>>>>>>> Basically
> >>>>>>>> it's something we should never have allowed in the first place.
> >>>>>>>>
> >>>>>>>> As far as setting this in code, it's important that all of the
> >>>>>>>> processes
> >>>>>>>> for a service are using the same value to avoid the same bad
> >>>>>>>> situation
> >>>>>>>> we were in before.  For tests, we have a lockutils wrapper
> >>>>>>>> (https://github.com/openstack/oslo-incubator/blob/master/openstack/common/lockutils.py#L282)
> >>>>>>>>
> >>>>>>>> that sets an environment variable to address this, but that only
> >>>>>>>> works if all of the processes are going to be spawned from within
> >>>>>>>> the same wrapper, and I'm not sure how secure that is for
> >>>>>>>> production
> >>>>>>>> deployments since it puts all of the lock files in a temporary
> >>>>>>>> directory.
> >>>>>>>
> >>>>>>> Right, I don't think the previous default really "worked" - if
> >>>>>>> you used
> >>>>>>> the default, then external locking was broken.
> >>>>>>>
> >>>>>>> I suspect most distros do set a default - I see RDO has this in its
> >>>>>>> default nova.conf:
> >>>>>>>
> >>>>>>>   lock_path = /var/lib/nova/tmp
> >>>>>>>
> >>>>>>> So, yes - this is all terrible.

Re: [openstack-dev] [Solum] MySQL Storage Engine

2013-12-04 Thread Clint Byrum
Excerpts from Paul Montgomery's message of 2013-12-04 12:04:06 -0800:
> TLDR: Should Solum log a warning if operators do not use the InnoDB
> storage engine with MySQL in Solum's control plane?
> 
> 
> Details:
> 
> I was looking at: https://review.openstack.org/#/c/57024/
> Models.py to be specific.
> 
> The default storage engine is InnoDB for MySQL which is good.  I took a
> quick look at the storage engines and only InnoDB seems reasonable for the
> Solum control plane (it is ACID compliant).  I assume that we'll all be
> coding towards an ACID compliant database for performance (not having to
> revalidate database writes and consistency and such) and ease of
> development.
> 
> If all of that is true, should we log a warning to the operator that they
> are using an untested and potentially problematic storage engine (which in
> a worst case scenario can corrupt their data)?  Should we even enable an
> operator to change the storage engine through configuration?  I think
> enabling that configuration is fine as long as we make sure that the
> operator knows that they are on their own with this unsupported
> configuration but I welcome thoughts from the group on this topic.
> 

Just assume MyISAM _does not exist_. It is 2013 for crying out loud.

If somebody accidentally uses MyISAM, point at them and laugh, but then
do help them pick up the pieces when it breaks.

In all seriousness, if you can force the engine to InnoDB, do that.
Otherwise, just ignore this. We are all consenting adults here and if
people can't RTFM on MySQL, they shouldn't be storing data in it.
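
If Solum does force the engine, SQLAlchemy makes it a one-liner per table;
a generic sketch (not the actual Solum models):

    import sqlalchemy as sa
    from sqlalchemy.ext import declarative

    Base = declarative.declarative_base()

    class Example(Base):
        __tablename__ = 'example'
        # mysql_engine/mysql_charset are ignored by other backends, so
        # this stays harmless for sqlite in tests.
        __table_args__ = {'mysql_engine': 'InnoDB',
                          'mysql_charset': 'utf8'}

        id = sa.Column(sa.Integer, primary_key=True)
        name = sa.Column(sa.String(255))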


