Hello,

Mixed-criticality and safety-critical systems under development need
support for timely boot of multiple domains at system launch -- “Initial
Domains” -- with static assignment of resources between them, each isolated
from the others and without runtime dependency on a “dom0”-type domain.

The Xen hypervisor currently contains a section of fixed policy logic for
creation of a single, highly-privileged domain “dom0” at system boot.

We would like to establish a shared understanding of best practice for
configuration and deployment of disaggregated system launch with multiple
domains, for production systems, on both x86 and ARM platforms.

Correctness and integrity of system launch is fundamental to system
security.

Maximizing the commonality between deployed Xen systems enables pooling of
risk and widens the applicability of testing that is performed.

## Requirements:

* Enable fast launch of multiple domains at system boot.
* Obtain the boot materials for Initial Domains -- config, kernel image,
optional ramdisk -- from host boot binaries.
* A system initialization process that is appropriate for both ARM and x86.
* Support minimization of audit work required for safety-critical
certification.
    * Favour reduction of code size and complexity of highest-privilege
components.
    * Decouple and isolate logic in support of simplifying validation by
enabling reasoning about the interfaces between components.
* Remove the policy logic for initial distribution of privileges from the
hypervisor.
    * Aim to simplify and consolidate the hypervisor code for system launch.
    * Allow flexibility of different initial domain configurations by
disaggregation to an external component.
    * Separate the mechanisms of privilege assignment, necessarily within
the hypervisor, from the implementation of policy logic assigning them to
specific domains.
* Support measured launch (e.g. TXT/SKINIT) and verified launch (e.g.
Secure Boot), including the ability to measure/verify any or all Initial
Domains.
* Support manageable processes for the system boot binaries.
    * i.e. consider: creation of files, updates to files, build-time
tooling, run-time tooling, dependencies of components.
    * Note that Device Tree tooling is foreign to x86 boot.
* Do not make Initial Domain creation depend upon parsing complex data
structures, such as a Device Tree, within the hypervisor.
    * The parsing is unwanted attack surface.
    * A static, unchanging Device Tree binary used just for
hardware-enablement increases commonality across deployments, and is easier
to place trust in than one that changes according to the specific installed
software configuration.
* Support configuration of XSM security labels for the Initial Domains.

## Proposed approach:

Provide support for the hypervisor to launch an alternative configuration
of the first domain, replacing the all-privileged Dom0 with a more
capability-constrained “Boot Domain”, DomB.

DomB is responsible for starting a set of domains from the material it
discovers within its ramdisk. Once they are running, DomB terminates with
a launch-success status indicator, in a step we refer to as “Exit Xen Boot
Services”. The domain termination makes it easy to verify that all DomB
privileges have been dropped.
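
As an illustration only, the launch sequence described above can be
sketched in Python. The function name, the status values, the config file
names, and the stop-on-first-failure behaviour are all assumptions made for
this sketch; none of them is an existing Xen or DomB interface.

```python
from typing import Callable, Iterable, List

# Hypothetical exit status values reported by DomB when it terminates.
LAUNCH_SUCCESS = 0
LAUNCH_FAILURE = 1


def run_boot_domain(domain_configs: Iterable[str],
                    create_domain: Callable[[str], bool]) -> int:
    """Start every Initial Domain, then return the status DomB exits with.

    `create_domain` stands in for whatever mechanism DomB uses to build and
    unpause a domain; it returns True when the domain is running.
    """
    for config in domain_configs:
        if not create_domain(config):
            # Assumption: a failed launch aborts the sequence.
            return LAUNCH_FAILURE
    # All Initial Domains are running: "Exit Xen Boot Services".
    return LAUNCH_SUCCESS


# Usage with a stand-in creation function that records what was started.
started: List[str] = []


def fake_create(config: str) -> bool:
    started.append(config)
    return True


status = run_boot_domain(["hwdom.cfg", "guest1.cfg"], fake_create)
```

Because DomB exits once the loop completes, nothing that held its
privileges remains running after a successful launch.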

The Boot Domain need not be granted hardware access capabilities itself,
but it does require the capability to create domains that hold each
privilege. Since DomB is granted no access to hardware, it can operate as
a PVH domain, and we would advocate for this.

In a future iteration of the work, when DomB creates domains and delegates
privileges to them, DomB will lose the ability to further delegate the same
privileges to other domains. This atomic transfer will be enforced by the
hypervisor: a ratchet of decreasing capability.
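
A toy model of that ratchet, offered as a sketch rather than a design: the
class, the function, and the right names ("hardware", "control") are
hypothetical, and the set held by DomB models its rights to delegate a
privilege, not hardware access of its own. In the proposal the enforcement
would sit in the hypervisor; here it is just a Python function.

```python
from dataclasses import dataclass, field
from typing import Set


class CapabilityError(Exception):
    """Raised when a domain tries to delegate a right it does not hold."""


@dataclass
class Domain:
    name: str
    rights: Set[str] = field(default_factory=set)


def delegate(src: Domain, dst: Domain, right: str) -> None:
    """Move `right` from src to dst; src can never delegate it again."""
    if right not in src.rights:
        raise CapabilityError(f"{src.name} does not hold {right!r}")
    # Remove before granting, so the right is held by one domain only.
    src.rights.remove(right)
    dst.rights.add(right)


domb = Domain("domB", {"hardware", "control"})
hwdom = Domain("hardware-domain")
delegate(domb, hwdom, "hardware")

# A second attempt to hand out the same right is refused.
try:
    delegate(domb, Domain("other"), "hardware")
    second_attempt_refused = False
except CapabilityError:
    second_attempt_refused = True
```

Each successful delegation shrinks the set DomB holds, so its capability
can only decrease over the course of the launch.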

The Boot Domain approach enables flexibility of configuration in system
bring-up, with the same approach applicable on both ARM and x86
architectures, while minimizing growth of the hypervisor code base. DomB
can be implemented in a small, single-purpose kernel, e.g. mini-os or LK,
that can easily be audited for certification while maximizing the isolation
of the platform launch process. It is flexible enough to support multiple
models, including but not limited to: a single, all-privileged domain in
the traditional dom0 style; split privileged domains (control/hardware
domains); disaggregated domains; statically partitioned domains; or
high-density cloned domains (e.g. container farm or fast-fork honeypot).

## Q: How does this change the system boot configuration files?

#### Existing practice:

The current dom0 initialization materials are:

* optional entries on the Xen command line
* an OS kernel image
* an OS kernel command line, supplied by the bootloader
* an optional OS kernel initial ramdisk image

#### With the proposal:

The domB initialization materials are:

* optional entries on the Xen command line
* an OS kernel image
* an OS kernel command line, supplied by the bootloader
* the domB initial ramdisk image

The domB initial ramdisk image will contain:

* For each Initial Domain:
    * VM configuration file, including (amongst other things):
        * any OS kernel command line
        * standard VM config items, such as RAM allocation, vCPU
configuration
        * XSM security label for the domain
    * For PV/PVH domains:
        * an OS kernel image
        * an optional ramdisk for the VM
    * An init program to enable the domB kernel to start the Initial
Domains.
        * The domains may be started in parallel, or in sequence, as
required, e.g. some domains may depend upon storage backends being
available, and so be started later in the sequence.
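
To make the ramdisk contents and the start ordering concrete, here is a
minimal Python sketch. The field names, file names, XSM label strings, and
the `depends_on` field are assumptions for illustration; the proposal does
not define a config format. Ordering uses the standard library's
`graphlib.TopologicalSorter` (Python 3.9 or later).

```python
from dataclasses import dataclass
from graphlib import TopologicalSorter
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class InitialDomain:
    name: str
    kernel: str                       # OS kernel image inside the ramdisk
    xsm_label: str                    # XSM security label for the domain
    memory_mb: int
    vcpus: int
    cmdline: str = ""                 # any OS kernel command line
    ramdisk: Optional[str] = None     # optional ramdisk for the VM
    depends_on: Tuple[str, ...] = ()  # hypothetical: domains to start first


def start_order(domains: List[InitialDomain]) -> List[InitialDomain]:
    """Return the domains so that each one follows its dependencies."""
    by_name = {d.name: d for d in domains}
    graph = {d.name: set(d.depends_on) for d in domains}
    unknown = {dep for deps in graph.values() for dep in deps} - set(by_name)
    if unknown:
        raise ValueError(f"unknown dependencies: {sorted(unknown)}")
    # static_order() yields predecessors first; a cycle raises CycleError.
    return [by_name[n] for n in TopologicalSorter(graph).static_order()]


storage = InitialDomain("storage", "storage-kernel", "example_storage_t",
                        memory_mb=256, vcpus=1)
guest = InitialDomain("guest", "guest-kernel", "example_guest_t",
                      memory_mb=1024, vcpus=2, cmdline="console=hvc0",
                      ramdisk="guest-initrd", depends_on=("storage",))

ordered_names = [d.name for d in start_order([guest, storage])]
```

Domains with no ordering constraint between them could equally be started
in parallel; the sketch only computes one valid sequence.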

The XSM policy will require updating for a domB system.

For booting into the traditional highly-privileged dom0 model with this
launch process, the most visible effect is migration of the dom0
configuration state and kernel boot binaries into the domB ramdisk.
Documentation will be necessary to communicate how to work with this
configuration.

## Q: How does this structure affect the TCB?

The Xen system components that must be within the TCB of each Initial
Domain are:

* The hypervisor.
* The Boot Domain, domB, which has exited and cannot be restarted without
host reboot.

This structure enables VMs to start with a shorter chain of trust and a
substantially smaller volume of code in their TCB than is available with
the traditional dom0 model.
Note that running VMs with privileges over other domains will impact the
size of the TCB of the VMs that they have privilege over. XSM is able to
perform fine-grained confinement which can address this.

## References

RFC for “dom0less step 1”, with the aim of preparation for Safety Certification
https://lists.xenproject.org/archives/html/xen-devel/2018-06/msg00982.html

Domain Builder
https://lists.xenproject.org/archives/html/xen-devel/2014-03/msg00320.html

Hardware Domain support
https://lists.xenproject.org/archives/html/xen-devel/2014-03/msg03556.html


thanks,

Christopher
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
