Hi Yuriy,

I like idea #4 (building task management functionality into a separate console driver). I think this was suggested at the PTG, and it's good because it fits into the existing model ironic has for handling consoles.
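For illustration only, here is a minimal sketch of the shape idea #4 could take: a console driver that never spawns local processes and instead hands console tasks to an external task manager shared by all conductors. Everything in it is an assumption made for the example. `TaskBackend`, `InMemoryBackend`, `ManagedConsole`, the `console-<uuid>` task naming, and the `socat-wrapper` command are hypothetical, and nothing here imports or subclasses ironic code. Only the `start_console`/`stop_console` method names echo ironic's console interface.

```python
import abc


class TaskBackend(abc.ABC):
    """Hypothetical external task manager shared by all conductors."""

    @abc.abstractmethod
    def ensure_running(self, task_id, command):
        """Start the task unless it already runs; return its endpoint."""

    @abc.abstractmethod
    def ensure_stopped(self, task_id):
        """Stop the task if it exists; do nothing otherwise."""


class InMemoryBackend(TaskBackend):
    """Stand-in backend for demonstration; it only records state."""

    def __init__(self):
        self.tasks = {}

    def ensure_running(self, task_id, command):
        # Idempotent: a second request for the same console reuses the
        # existing task instead of opening a second console session.
        self.tasks.setdefault(task_id, command)
        return "backend://" + task_id

    def ensure_stopped(self, task_id):
        self.tasks.pop(task_id, None)


class ManagedConsole:
    """Hypothetical console driver that delegates process management."""

    def __init__(self, backend):
        self.backend = backend

    def start_console(self, node_uuid, bmc_address):
        # Placeholder command; a real driver would build a proper
        # shellinabox or socat invocation here.
        command = ["socat-wrapper", bmc_address]
        return self.backend.ensure_running("console-" + node_uuid, command)

    def stop_console(self, node_uuid):
        self.backend.ensure_stopped("console-" + node_uuid)


# Two conductors share one backend, so a takeover cannot leave a stale
# local console process behind on the old conductor's host.
backend = InMemoryBackend()
old_conductor = ManagedConsole(backend)
new_conductor = ManagedConsole(backend)
old_conductor.start_console("node-1", "10.0.0.5")
new_conductor.start_console("node-1", "10.0.0.5")  # reuses the same task
new_conductor.stop_console("node-1")
```

The point of the design is that console state lives with the task manager rather than on a conductor host, so whichever conductor currently owns the node can start or stop the console.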
Thanks,
Mario

On Fri, Mar 10, 2017 at 10:42 AM, Yuriy Zveryanskyy <yzveryans...@mirantis.com> wrote:
> Hi all.
>
> Hardware node consoles have a specific constraint: the number of
> concurrent console sessions that can be established is limited
> (often to 1). This causes some issues (described below) because
> distributed ironic conductor services conflict with local console
> processes. Only setups that use local console processes are
> affected, currently shellinabox and socat for example.
>
> There are some possible "global" solutions:
> 1) Pluggable internal task API [1], currently rejected by the community;
> 2) Non-pluggable internal task API that uses an external service (the
>    necessary service does not currently exist in OpenStack);
> 3) Custom distributed process management based on ssh access
>    between ironic conductor hosts (looks like a hack);
> 4) New console interface drivers that implement task management
>    internally (like "k8s_shellinabox", "k8s_socat").
> Partial solutions (some of them proposed below) are also possible.
>
> In a multi-conductor environment an ironic conductor process can
> die or be stopped/blocked (removed), or be started/restarted (added).
> Possible cases:
> 1) Conductor removed
>    a) Gracefully stopped. Some daemon processes, like shellinabox
>       for consoles, can continue to run. This issue can be fixed now
>       as a separate bug.
>    b) Died/killed. Daemon processes can continue to run. This issue can
>       be fixed only by distributed task management (the "global"
>       solutions above).
>    c) The whole host running the conductor died. No fix needed.
> 2) Conductor added/restarted
>    The new conductor tries to start processes for enabled consoles, but
>    currently the processes on the conductor hosts that previously served
>    these nodes are not stopped [2]. I see two possible solutions for
>    this issue:
>    1) An "untakeover" periodic task that stops console processes.
>       With this solution we should not stop non-local consoles.
>    2) Do not stop the process on the old conductor. Instead, on the API
>       side, use redefined RPC routing (based on the conductor that
>       started the console, as saved in the DB) for the set-console
>       request, and wait for the console to be stopped via the API.
>       This routing should also ignore dead conductors.
>
> If you have any ideas, please leave comments.
>
> [1] https://review.openstack.org/#/c/431605/
> [2] https://bugs.launchpad.net/ironic/+bug/1632192
>
> Yuriy Zveryanskyy
>
> __________________________________________________________________________
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev