Hi everyone,

I would like to bring up the domain restart question for discussion. It
originates from DomD restart, but the solution I am about to offer can be
quite generic.

The problem is that a domain specification currently holds only frontend
info, which is used to generate both the frontend and the backend entries
for a device. That means the backend xenstore data is handled not by the
domain that owns the backends, but by the domain whose frontends are linked
to these backends. As a result, rebooting any domain that hosts backends
without also rebooting the corresponding frontend domains is impossible.
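
To make the coupling concrete, here is a rough sketch (not libxl code) of
how a single frontend-side device record fans out into both xenstore
branches. The path layout and the backend/frontend link keys follow the
usual Xen split-driver convention; the helper names and the dict-based
device record are made up for illustration.

```python
# Hypothetical helpers: only the xenstore path layout is taken from Xen.

def frontend_path(fe_domid, kind, devid):
    return f"/local/domain/{fe_domid}/device/{kind}/{devid}"

def backend_path(be_domid, kind, fe_domid, devid):
    return f"/local/domain/{be_domid}/backend/{kind}/{fe_domid}/{devid}"

def entries_from_frontend_spec(fe_domid, dev):
    """Derive BOTH branches from the frontend domain's device record."""
    fe = frontend_path(fe_domid, dev["kind"], dev["devid"])
    be = backend_path(dev["backend_domid"], dev["kind"], fe_domid, dev["devid"])
    return {
        fe + "/backend": be,
        fe + "/backend-id": str(dev["backend_domid"]),
        be + "/frontend": fe,
        be + "/frontend-id": str(fe_domid),
    }

# A vif in domain 5 whose backend lives in domain 1 (a driver domain).
entries = entries_from_frontend_spec(
    5, {"kind": "vif", "devid": 0, "backend_domid": 1})
for path, value in sorted(entries.items()):
    print(path, "=", value)
```

Nothing in domain 1's own specification mentions this backend, so if
domain 1 is recreated, the toolstack has no record from which to rebuild
its backend branch.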

This is wrong on many levels, but the main thing is: some domains don't
know something they should (whether they have any backends) and some know
something they should not (that the backends for their frontends live in
different domains).

I propose the following change: make the domain "require" and "provide"
interfaces visible (think CORBA), and hold the connections between the two
in the privileged domain (where the toolstack is; think of the controller
in the MVC idiom). With this change, domains (except for Dom0, which is a
special case) can be rebooted in any order whatsoever, and a
frontend/backend link can be set up as static config or adjusted at
runtime (e.g. if the hardware rendering backend hangs, switch to software
rendering to avoid glitches). However, it requires a change in the libxl
internal device representation (a device should no longer be a
frontend/backend pair) and a config format change, which breaks backwards
compatibility.

That is, I want the domain configuration to hold records on both frontends
(what this domain requires) and backends (what this domain provides), and
libxl to create the corresponding xenstore branches separately. Moreover,
I'd like the frontend/backend connection information to be held in a
separate config belonging to Dom0, so that on any domain reboot (or any
exceptional situation such as a watchdog failure) the supervisor (Dom0) can
use this information to initiate a reconnect.
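
A minimal sketch of the data model I have in mind, assuming nothing about
the eventual config syntax. All type and function names here (Endpoint,
DomainConfig, Connection, links_to_reconnect) are hypothetical and do not
exist in libxl.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Endpoint:
    domain: str   # domain name, e.g. "DomD"
    kind: str     # device class, e.g. "vif"
    devid: int

@dataclass
class DomainConfig:
    name: str
    provides: list = field(default_factory=list)  # backends in this domain
    requires: list = field(default_factory=list)  # frontends in this domain

@dataclass(frozen=True)
class Connection:
    # Held only in the Dom0-side config, never in either domain's config.
    frontend: Endpoint
    backend: Endpoint

def links_to_reconnect(connections, restarted_domain):
    """Links the supervisor must re-establish after a domain restarts."""
    return [c for c in connections
            if restarted_domain in (c.frontend.domain, c.backend.domain)]

domd = DomainConfig("DomD", provides=[Endpoint("DomD", "vif", 0)])
domu = DomainConfig("DomU", requires=[Endpoint("DomU", "vif", 0)])
other = DomainConfig("DomA", requires=[Endpoint("DomA", "vkbd", 0)])

connections = [
    Connection(frontend=domu.requires[0], backend=domd.provides[0]),
    Connection(frontend=other.requires[0], backend=Endpoint("Dom0", "vkbd", 0)),
]

affected = links_to_reconnect(connections, "DomD")
print([c.frontend.domain for c in affected])
```

Since each domain config lists only its own endpoints, restarting DomD
touches DomD's records plus the links Dom0 holds; DomU's configuration is
not involved.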

And, while we are talking about libxl refactoring, I'd like to raise one
more point: code duplication. Libxl support for the split-driver model
consists of a declarative IDL device specification, xenstore read, xenstore
write, config read, config write, xl args read, JSON read/write and a
device chain of responsibility with async device creation. The only thing
the IDL is used for is generating the types and the JSON read/write code;
everything else is error-prone, hand-written, duplicated code.

Why don't we generate as much as we can? That means generating xenstore
read, xenstore write, config read, config write and xl args read, since
these all depend directly on the device IDL specification. If we already
have an external code generation tool, why not use it to its full extent
instead of writing all this serialization/deserialization code manually
(and in different styles: e.g. the block device is the only one that uses
lex, and the parse_config_data implementations in xl_cmdimpl.c differ quite
a lot from device to device)?
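
As a toy illustration of the idea, here is a table-driven stand-in: one
declarative field list drives both the xenstore write and the xenstore
read. A real generator would emit C from the libxl IDL instead of
interpreting a table at runtime; the DEMO_DEVICE spec, its field names and
make_codec are all invented for this sketch.

```python
# Hypothetical field spec: (struct field name, xenstore key, Python type).
DEMO_DEVICE = [
    ("backend_domid", "backend-id", int),
    ("mode", "mode", str),
    ("readonly", "read-only", bool),
]

def _encode(value, typ):
    # xenstore values are strings; booleans are written as "1"/"0" here.
    if typ is bool:
        return "1" if value else "0"
    return str(value)

def _decode(text, typ):
    if typ is bool:
        return text == "1"
    return typ(text)

def make_codec(spec):
    """Build a matching write/read pair from a single field spec."""
    def write(dev):
        return {key: _encode(dev[name], typ) for name, key, typ in spec}

    def read(entries):
        return {name: _decode(entries[key], typ) for name, key, typ in spec}

    return write, read

to_xenstore, from_xenstore = make_codec(DEMO_DEVICE)

dev = {"backend_domid": 1, "mode": "w", "readonly": False}
stored = to_xenstore(dev)
print(stored)
print(from_xenstore(stored) == dev)
```

Adding a field then means adding one line to the spec, and the read and
write sides cannot drift apart.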

As a matter of fact, I will be doing some work in this general direction
anyway, because we need DomD restart and the libxl boilerplate is rather
messy (we have ~12 devices and the xl/libxl interface patches for them are
almost copy-and-paste), but I would like to hear as much criticism and as
many ideas as possible. It would be nice if we could come out of this
discussion with something potentially upstreamable.

Suikov Pavlo
GlobalLogic
P +x.xxx.xxx.xxxx  M +38.066.667.1296  S psujkov
www.globallogic.com
http://www.globallogic.com/email_disclaimer.txt
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel
