Re: [openstack-dev] [Ironic] Node groups and multi-node operations

Clint Byrum Sun, 26 Jan 2014 12:17:49 -0800

Excerpts from Devananda van der Veen's message of 2014-01-26 10:27:36 -0800:
> On Sat, Jan 25, 2014 at 7:11 AM, Clint Byrum <cl...@fewbar.com> wrote:
> >
> > Excerpts from Robert Collins's message of 2014-01-25 02:47:42 -0800:
> > > On 25 January 2014 19:42, Clint Byrum <cl...@fewbar.com> wrote:
> > > > Excerpts from Robert Collins's message of 2014-01-24 18:48:41 -0800:
> > >
> > > >> > However, in looking at how Ironic works and interacts with Nova, it
> > > >> > doesn't seem like there is any distinction of data per-compute-node
> > > >> > inside Ironic.  So for this to work, I'd have to run a whole bunch of
> > > >> > ironic instances, one per compute node. That seems like something we
> > > >> > don't want to do.
> > > >>
> > > >> Huh?
> > > >>
> > > >
> > > > I can't find anything in Ironic that lets you group nodes by anything
> > > > except chassis. It was not a serious discussion of how the problem would
> > > > be solved, just a point that without some way to tie ironic nodes to
> > > > compute-nodes I'd have to run multiple ironics.
> > >
> > > I don't understand the point. There is no tie between ironic nodes and
> > > compute nodes. Why do you want one?
> > >
> >
> > Because sans Ironic, compute-nodes still have physical characteristics
> > that make grouping on them attractive for things like anti-affinity. I
> > don't really want my HA instances "not on the same compute node", I want
> > them "not in the same failure domain". It becomes a way for all
> > OpenStack workloads to have more granularity than "availability zone".
> 
> Yes, and with Ironic, these same characteristics are desirable but are
> no longer properties of a nova-compute node; they're properties of the
> hardware which Ironic manages.
>


I agree, but I don't see any of that reflected in Ironic's API. I see
node CRUD, but not filtering or scheduling of any kind.

> In principle, the same (hypothetical) failure-domain-aware scheduling
> could be done if Ironic is exposing the same sort of group awareness,
> as long as the nova 'ironic" driver is passing that information up to
> the scheduler in a sane way. In which case, Ironic would need to be
> representing such information, even if it's not acting on it, which I
> think is trivial for us to do.
> 
> >
> > So if we have all of that modeled in compute-nodes, then when adding
> > physical hardware to Ironic one just needs to have something to model
> > the same relationship for each physical hardware node. We don't have to
> > do it by linking hardware nodes to compute-nodes, but that would be
> > doable for a first cut without much change to Ironic.
> >
> 
> You're trading failure-domain awareness for fault-tolerance in your
> control plane. by binding hardware to nova-compute. Ironic is designed
> explicitly to decouple the instances of Ironic (and Nova) within the
> control plane from the hardware it's managing. This is one of the main
> shortcomings of nova baremetal, and it doesn't seem like a worthy
> trade, even for a first approximation.
> 
> > > >> The changes to Nova would be massive and invasive as they would be
> > > >> redefining the driver api....and all the logic around it.
> > > >>
> > > >
> > > > I'm not sure I follow you at all. I'm suggesting that the scheduler have
> > > > a new thing to filter on, and that compute nodes push their unique ID
> > > > down into the Ironic driver so that while setting up nodes in Ironic one
> > > > can assign them to a compute node. That doesn't sound massive and
> > > > invasive.
> 
> This is already being done *within* Ironic as nodes are mapped
> dynamically to ironic-conductor instances; the coordination for
> failover/takeover needs to be improved, but that's incremental at this
> point. Moving this mapping outside of Ironic is going to be messy and
> complicated, and breaks the abstraction layer. The API change may seem
> small, but it will massively overcomplicate Nova by duplicating all
> the functionality of Ironic-conductor in another layer of the stack.
> 

Can you point us to the design for this? I didn't really get that
from browsing the code and docs, and I gave up trying to find a single
architecture document after very little effort.

_______________________________________________
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

Re: [openstack-dev] [Ironic] Node groups and multi-node operations

Reply via email to