Hi Spyros,

AFAIK we already have a special session slot related to your topic,
so thank you for providing all the items here.
Rabi, can we add a link to this mail on the etherpad? (It will save us
time during the session :) )

On 10 October 2016 at 18:11, Spyros Trigazis <[email protected]> wrote:

> Hi heat and magnum teams,
>
> Apart from the scalability issues that have been observed, I'd like to
> add a few more subjects to discuss during the summit.
>
> 1. One nested stack per node and linear scaling of cluster creation
> time.
>
> 1.1
> For large stacks, the creation of all the nested stacks scales linearly
> with the number of nodes. We haven't run any tests using the
> convergence engine.
>
> 1.2
> For large stacks, e.g. 1000 nodes, the final call to heat to fetch the
> IPs of all nodes takes 3 to 4 minutes. In heat, the stack already has
> status CREATE_COMPLETE, but magnum's state is only updated once this
> long final call is done. Can we do better? Maybe fetch only the master
> IPs, or get the IPs in chunks.
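> 
> A rough sketch of the partial/chunked idea with python-heatclient,
> assuming a heat/heatclient version that exposes the per-output API
> (output_list / output_show); the output key prefix 'kube_masters' is
> made up for illustration:
> 
>     from keystoneauth1 import loading, session
>     from heatclient import client as heat_client
> 
>     # Authenticate once; the credentials below are placeholders.
>     loader = loading.get_plugin_loader('password')
>     auth = loader.load_from_options(
>         auth_url='http://controller:5000/v3',
>         username='admin', password='secret', project_name='admin',
>         user_domain_name='Default', project_domain_name='Default')
>     heat = heat_client.Client('1', session=session.Session(auth=auth))
> 
>     stack_id = 'my-cluster'  # hypothetical stack name
> 
>     # Instead of one stacks.get() that resolves every output, list the
>     # output keys and resolve only the ones we need right now.
>     keys = [o['output_key']
>             for o in heat.stacks.output_list(stack_id)['outputs']]
>     for key in (k for k in keys if k.startswith('kube_masters')):
>         out = heat.stacks.output_show(stack_id, key)
>         print(key, out['output']['output_value'])
> 
> The remaining node IPs could then be fetched in chunks the same way.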
>
> 1.3
> After the stack-create API call to heat, magnum's conductor busy-waits
> on heat with one thread per cluster. (In case of a magnum conductor
> restart, we lose that thread and can no longer update the status in
> magnum.) We should investigate better ways to sync the status between
> magnum and heat.
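> 
> One direction worth discussing (only a sketch of the idea, not of the
> actual conductor code; list_in_progress_clusters and save_status are
> hypothetical helpers) is a single periodic task that reconciles all
> in-progress clusters against heat instead of keeping one waiting
> thread per cluster, reusing the heat client from the sketch above:
> 
>     import time
> 
>     POLL_INTERVAL = 60  # seconds, hypothetical value
> 
>     def sync_cluster_status(heat, clusters):
>         """Reconcile magnum's view of each cluster with its heat stack."""
>         for cluster in clusters:
>             stack = heat.stacks.get(cluster.stack_id)
>             if stack.stack_status != cluster.status:
>                 save_status(cluster, stack.stack_status)  # hypothetical
> 
>     while True:
>         # No state lives in a thread, so a conductor restart only delays
>         # the next reconciliation instead of losing the status forever.
>         sync_cluster_status(heat, list_in_progress_clusters())  # hypothetical
>         time.sleep(POLL_INTERVAL)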
>
> 2. Next generation magnum clusters
>
> A need that comes up frequently in magnum is heterogeneous clusters.
> * We want to be able to create a cluster on different hardware (e.g.
>   spawn VMs on nodes with SSDs and nodes without SSDs, or with other
>   special hardware such as FPGAs or GPUs available only on some nodes
>   of the cluster).
> * Spawn a cluster across different AZs.
>
> I'll briefly describe our plan here; for further information we have a
> detailed spec under review [1].
>
> To address this issue we introduce the node-group concept in magnum.
> Each node-group will correspond to a different heat stack. The master
> nodes can be organized in one or more stacks, and so can the worker
> nodes.
>
> We are investigating how to implement this feature and are considering
> the following:
> At the moment, we have three template files (cluster, master and
> node), and together they create a single stack. The new generation of
> clusters will have a cluster stack containing only the resources of the
> cluster template, specifically networks, lbaas, floating IPs etc. Then,
> the outputs of this stack would be passed as input to create the master
> node stack(s) and the worker node stack(s).
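> 
> A minimal sketch of that flow with python-heatclient, reusing the heat
> client from the earlier sketch; the template file names, stack names
> and output/parameter keys below are made up for illustration:
> 
>     def create_stack(heat, name, template_path, parameters):
>         with open(template_path) as f:
>             heat.stacks.create(stack_name=name, template=f.read(),
>                                parameters=parameters)
> 
>     # 1. The cluster stack holds only the shared resources
>     #    (networks, lbaas, floating IPs, ...).
>     create_stack(heat, 'c1-base', 'cluster.yaml', {})
>     # ... wait for c1-base to reach CREATE_COMPLETE, omitted ...
>     cluster = heat.stacks.get('c1-base')
> 
>     # 2. Its outputs become the inputs of every node-group stack.
>     shared = {o['output_key']: o['output_value'] for o in cluster.outputs}
> 
>     create_stack(heat, 'c1-masters', 'master.yaml',
>                  {'fixed_network': shared['fixed_network'],
>                   'api_lb_pool': shared['api_lb_pool']})
>     create_stack(heat, 'c1-minions-ssd', 'node.yaml',
>                  {'fixed_network': shared['fixed_network'],
>                   'flavor': 'm1.large.ssd'})  # per-node-group hardware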
>
> 3. Use of heat-agent
>
> A missing feature in magnum is lifecycle operations. For restarting
> services and for COE upgrades (upgrading docker, kubernetes and mesos)
> we are considering using the heat-agent. Another option is to create a
> magnum agent or a daemon, like trove does.
>
> 3.1
> For restarts, a few systemctl restart or service restart commands will
> be issued [2].
>
> 3.2
> For upgrades there are four scenarios:
> 1. Upgrade a service which runs in a container. In this case, a small
>    script that runs on each node is sufficient. No VM reboot required.
> 2. For an ubuntu-based image or similar that requires a package
>    upgrade, a similar small script is sufficient too. No VM reboot
>    required.
> 3. For our fedora atomic images, we need to perform a rebase on the
>    rpm-ostree file system, which requires a reboot.
> 4. Finally, a thought under investigation is replacing the nodes one
>    by one using a different image, e.g. upgrading from Fedora 24 to 25
>    with new versions of all packages in a new qcow2 image. How could
>    we update the stack for this? (See the sketch after this list.)
>
> Scenarios 1. and 2. can be done by upgrading all worker nodes at once
> or one by one. Scenarios 3. and 4. should be done one by one.
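> 
> For scenario 4, one thing we could try (again only a sketch, and it
> assumes the node image is exposed as a parameter of the worker
> node-group stack, which today's templates don't do) is a PATCH-style
> stack update that changes only that parameter and lets heat replace
> the servers:
> 
>     # Reuses the heat client from the earlier sketches; the stack and
>     # parameter names are hypothetical.
>     heat.stacks.update('c1-minions-ssd',
>                        existing=True,  # PATCH: keep the existing template
>                        parameters={'server_image': 'fedora-atomic-25'})
> 
> Whether heat then replaces all the group members at once or a few at a
> time would be controlled by the rolling update policy (batch size,
> pause time) set on the resource group in the node-group template.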
>
> I'm drafting a spec about upgrades; it should be ready by Wednesday.
>
> Cheers,
> Spyros
>
> [1] https://review.openstack.org/#/c/352734/
> [2] https://review.openstack.org/#/c/368981/
>


-- 
Regards,
Sergey.
__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: [email protected]?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
