On 10 September 2013 02:03, James Slagle <james.sla...@gmail.com> wrote:
>> working on the scale-out story in much detail
>> (I'm very interested in where you got the idea that
>> all-nodes-identical was the scaling plan for TripleO - it isn't :))
>
> It's just a misconception on my part. I was trying to get an understanding of
> what a "2 machine/node undercloud in Full HA Mode" was. I've seen that
> mentioned in some of the TripleO presentations I've watched on YouTube and
> such.
Ok, so we'll need to be clearer in future discussions - thanks!

> What's the 2nd node in the undercloud? Is it more similar to the Leaf Node
> proposal in Idea 1 I laid out...basically just enough services for Compute,
> Networking, etc?

The 2nd node would be identical, so that we get full HA.

> What do you mean by Full HA Mode? The 2nd node serves as HA for the first, or
> 2 additional HA nodes, making 4 nodes total? Or something else maybe :) ?

By full HA I mean 'all services in HA mode' vs 'most services in HA'. We need
HA for the bus, data store, APIs, endpoints - the works. This requires a
minimum of 2 nodes, and probably a recommended 3 so that we don't have split
brain situations (quick quorum arithmetic further down this mail); but beyond
that we should be able to add capacity to specific services rather than
duplicating everything: the big question for me is at what rate we need to add
capacity, and to what services.

>> The exact design of a scaled cluster isn't pinned down yet: I think
>> we need much more data before we can sensibly do it: both on
>> requirements - what's valuable for deployers - and on the scaling
>> characteristics of nova baremetal/Ironic/keystone etc.
>
> That maybe answers my previous question then. The other node is not yet
> defined. I think that makes sense given some of the higher level things you'd
> like to see discussed first: goals, requirements, etc.

Yup.

>> I don't really follow some of the discussion in Idea 1: but scaling
>> out things that need scaling out seems pretty sensible. We have no
>> data suggesting how many thousands of machines we'll get per nova
>> baremetal machine at the moment, so it's very hard to say what
>> services will need scaling at what points in time yet: but clearly we
>> need to support it at some scale. OTOH once we scale to 'an entire
>> datacentre' the undercloud doesn't need to scale further: I think
>> having each datacentre be a separate deployment cloud makes a lot of
>> sense.
>
> The point of Idea 1 was somewhat twofold:
>
> First, there is another image type, which we called the Leaf Node. It's a
> smaller set of services, not the whole Undercloud - whatever is necessary to
> scale to larger workloads. E.g., if the baremetal Compute driver does
> eventually prove to be a bottleneck, it would obviously include that.
>
> Second, as hardware is grouped into Logical Racks (could be multiple physical
> racks or a subset of hardware across physical racks), you deploy a Leaf Node
> in the Logical Rack as well to act as the Undercloud's management interface
> (so to speak) to that logical rack. This way, if you *wanted* to have some
> additional network isolation in the logical rack, only the Leaf Nodes need
> connectivity back to the main Undercloud node (with all services).
>
> Not saying that deploying a Leaf Node would be a hard requirement for each
> logical rack, but more of a best practice or reference implementation type
> approach.

I see. Ok - so the issue would be that if there is limited connectivity for
deployment orchestration, how does the first thing in that rack get set up? I
think we can simply define it as:

* If you can PXE deploy into that rack from elsewhere, then that rack is part
  of the $elsewhere undercloud.
* If you cannot, then it is a new undercloud.

This will be a lot easier to reason about, for all that it may be harder to
deliver a single overcloud across both racks: it puts the scheduling,
configuration, and how-to-preserve-HA-within-that-rack concerns all clearly
where they should be, and raises interesting questions about Heat for
cross-cloud :).
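To put that rule in rough code terms (illustration only - the helper names
here are invented, nothing like this exists in TripleO today):

def new_undercloud_for(rack):
    """Hypothetical: stand up a fresh undercloud (seed) inside the rack."""
    return "undercloud-for-%s" % rack

def assign_rack(rack, existing_underclouds, can_pxe_deploy):
    """Pick the undercloud that manages `rack`.

    If any existing undercloud can PXE deploy into the rack, the rack is
    part of that ($elsewhere) undercloud; otherwise it becomes its own new
    undercloud.
    """
    for undercloud in existing_underclouds:
        if can_pxe_deploy(undercloud, rack):
            return undercloud
    return new_undercloud_for(rack)

# A rack reachable from 'dc1-undercloud' joins it; an isolated rack gets its own:
print(assign_rack("rack-7", ["dc1-undercloud"], lambda uc, r: True))
print(assign_rack("rack-9", ["dc1-undercloud"], lambda uc, r: False))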
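And coming back to the 2-vs-3 node HA point above, the quorum arithmetic
(illustration only - not taken from any TripleO or cluster-manager code, just
standard majority quorum):

# Majority quorum is floor(n/2) + 1: 2 nodes tolerate no failures (and a
# partition stalls both halves), while 3 nodes tolerate one.
for n in (2, 3, 5):
    quorum = n // 2 + 1
    print("cluster of %d: quorum %d, tolerates %d failure(s)"
          % (n, quorum, n - quorum))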
>> This leads me to suggest a very simple design:
>> - one undercloud per fully-reachable-fabric-of-IPMI control.

Done :)

>> - we gather data on performance scaling as node count scales
>
> What type of hardware access does the team have to do any sort of performance
> scaling testing?

We've got 40ish production-cloud-scale machines w/ 10Gbps ethernet on a
gosh-I-don't-know-how-fast backplane today, though they are currently running
a long-lived proof of concept. We'll have full access again at the end of the
month, and I'm going to see if I can reclaim some for the sprint. There are
also some rather large testing labs w/in HP that we can use with prior
arrangement; once we've got disk injection (our current known scale-defeater)
disableable in nova, I intend to arrange a scale test.

> I can ask around and see what I can find.
>
> Alternatively, we could probably work on some sort of performance test suite
> that tested without a bunch of physical hardware. E.g., you don't necessarily
> need a bunch of distinct nodes to test something like how many iSCSI targets
> Nova Compute can reasonably populate at once, etc.

I think a perf test suite is a great idea. Probably wants to be part of
Tempest. (There's a very rough sketch of the sort of single-host test James
describes at the foot of this mail.) That said, there are /significant/
performance differences between virt and physical (and depending on exact
config they can be either overly fast or overly slow), so at most I'd want to
use them as a flag for actual physical testing. For instance, deploying to
baremetal from a KVM-hosted seed node was an order of magnitude slower than
deploying baremetal -> baremetal. I didn't track down the cause at the time,
but it looked like bad jumbo frame support in the virt network datapath
causing low utilisation and overly high physical packet counts.

-Rob

--
Robert Collins <rbtcoll...@hp.com>
Distinguished Technologist
HP Converged Cloud
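PS: the promised sketch - completely untested, timing how fast one box can
populate iSCSI targets. The tgtadm invocation follows the tgt documentation
(needs tgtd running and root), the IQN is made up, and this is not how
nova-compute actually manages targets, so treat any numbers from it as a rough
signal at best:

import subprocess
import time

def create_target(tid):
    """Create one iSCSI target with tgtadm."""
    iqn = "iqn.2013-09.org.example:scaletest.target%d" % tid
    subprocess.check_call([
        "tgtadm", "--lld", "iscsi", "--op", "new", "--mode", "target",
        "--tid", str(tid), "-T", iqn,
    ])

def main(count=100):
    start = time.time()
    for tid in range(1, count + 1):
        create_target(tid)
    elapsed = time.time() - start
    print("created %d targets in %.1fs (%.2f targets/s)"
          % (count, elapsed, count / elapsed))

if __name__ == "__main__":
    main()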