Re: [PATCH v9 1/3] xen/domain: unify domain ID allocation

2025-06-07 Thread Julien Grall

Hi Denis,

On 06/06/2025 07:55, dm...@proton.me wrote:

On Thu, Jun 05, 2025 at 10:58:48PM +0100, Julien Grall wrote:

+if ( domid == DOMID_INVALID )
+panic("Error allocating ID for domain %s\n", dt_node_name(node));
+
+d = domain_create(domid, &d_cfg, flags);
   if ( IS_ERR(d) )
   panic("Error creating domain %s (rc = %ld)\n",
 dt_node_name(node), PTR_ERR(d));
diff --git a/xen/common/domain.c b/xen/common/domain.c
index abf1969e60..ae0c44fcbb 100644
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -66,6 +66,10 @@ DEFINE_RCU_READ_LOCK(domlist_read_lock);
   static struct domain *domain_hash[DOMAIN_HASH_SIZE];
   struct domain *domain_list;

+/* Non-system domain ID allocator. */
+static DEFINE_SPINLOCK(domid_lock);
+static DECLARE_BITMAP(domid_bitmap, DOMID_FIRST_RESERVED);
+
   /*
* Insert a domain into the domlist/hash.  This allows the domain to be 
looked
* up by domid, and therefore to be the subject of hypercalls/etc.
@@ -1449,6 +1453,8 @@ void domain_destroy(struct domain *d)

   TRACE_TIME(TRC_DOM0_DOM_REM, d->domain_id);

+domid_free(d->domain_id);
+
   /* Remove from the domlist/hash. */
   domlist_remove(d);

@@ -2405,6 +2411,54 @@ domid_t get_initial_domain_id(void)
   return hardware_domid;
   }

+domid_t domid_alloc(domid_t domid)
+{
+spin_lock(&domid_lock);
+
+if ( domid < DOMID_FIRST_RESERVED )
+{
+if ( __test_and_set_bit(domid, domid_bitmap) )
+domid = DOMID_INVALID;
+}
+else
+{
+static domid_t domid_last;
+/* NB: account for late hwdom case, skip ID#0 */


I am somewhat confused with this comment. For the late hwdom case, I
thought we were using a non-zero ID. Dom0 should also always be the
first dom0 to be reserved. Can you clarify?


My current understanding is:
- the ID of "domain 0" (privileged domain) can be overridden at the boot-time
   via hardware_domid parameter.
> > - there's only one reserved (and configurable) domain ID == 
hardware_domid in

   the allocation range (which is 0 by default).
> > - get_initial_domain_id() returns the reserved domain ID value 
(which is

   used in the in the follow on change to keep the change isolated).

Is my understanding correct?


I have replied yesterday night on a separate thread about this behavior 
[1]. Rather than duplicating it, I would suggest to move the 
conversation there.


In short, I believe the late domain support was recently broken.

Cheers,

[1] 
https://lore.kernel.org/xen-devel/20250528225030.2652166-1-dmuk...@ford.com/T/#mdcdf3802a913859243ff6ce841

445cfab265145f

--
Julien Grall




Re: xen_acpi_processor driver is bound to dom0 vcpu count

2025-06-07 Thread Elliott Mitchell
On Fri, May 23, 2025 at 12:22:44AM +0100, m...@alex0.net wrote:
> I think I’ve just uncovered a rather nasty bug in the xen_acpi_processor 
> driver in dom0. If the vcpu count in dom0 is set to anything other than the 
> exact number of physical cores, the xen_acpi_processor kernel driver will 
> fail to upload the C-state information for those cores to Xen, resulting in 
> Xen never knowing about the C-states, which significantly impacts battery 
> life on mobile platforms.
> 

While I dislike this bug, I don't think it qualifies as "nasty" since it
doesn't severely break anything.  Your system ends up using rather more
power than it should, but everything else continues to works as expected.

The simplest workaround is to not use dom0_max_vcpus, and instead do
`xl vcpu-set 0 #` to offline vCPUs after boot.  This leaves domain 0
allocating per-processor memory for unused processors, but does reduce
domain 0's vCPU use.

I believe this is partially bugs with Xen's ACPI support, but also an
issue with the Linux kernel module.  I've observed patches going by which
appeared aimed towards this problem, but for now this remains.


-- 
(\___(\___(\__  --=> 8-) EHM <=--  __/)___/)___/)
 \BS (| ehem+sig...@m5p.com  PGP 87145445 |)   /
  \_CS\   |  _  -O #include  O-   _  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445