On 13.10.23 00:36, Stefano Stabellini wrote:
On Thu, 12 Oct 2023, George Dunlap wrote:
Stop tinkering in the hope that it hides the problem.  You're only
making it harder to fix properly.

Making it harder to fix properly would be a valid reason not to commit
the (maybe partial) fix. But looking at the fix again:

diff --git a/tools/xenstored/domain.c b/tools/xenstored/domain.c
index a6cd199fdc..9cd6678015 100644
--- a/tools/xenstored/domain.c
+++ b/tools/xenstored/domain.c
@@ -989,6 +989,7 @@ static struct domain *introduce_domain(const void *ctx,
                 talloc_steal(domain->conn, domain);

                 if (!restore) {
+                       domain_conn_reset(domain);
                         /* Notify the domain that xenstore is available */
                         interface->connection = XENSTORE_CONNECTED;
                         xenevtchn_notify(xce_handle, domain->port);
@@ -1031,8 +1032,6 @@ int do_introduce(const void *ctx, struct connection *conn,
         if (!domain)
                 return errno;

-       domain_conn_reset(domain);
-
         send_ack(conn, XS_INTRODUCE);

It is a 1-line movement. Textually small. Easy to understand and to
revert. It doesn't seem to be making things harder to fix? We could
revert it any time if a better fix is offered.

Maybe we could have an XXX note in the commit message or an in-code
comment?

It moves a line from one function (do_introduce()) into a
completely different function (introduce_domain()), nested inside two
if() statements, with no analysis of how the change will impact
things.

I am not the original author of the patch, and I am not the maintainer
of the code, so I don't feel I have the qualifications to give you the
answers you are seeking. Julien as author of the patch and xenstore
reviewer might be in a better position to answer. Or Juergen as xenstore
maintainer.

I did already provide some feedback when the patch was sent the first time
in May.


From what I can see the patch is correct.

You removed the dom0 special casing again, which I had asked to be added
back then. And I still think there are missing barriers (at least for Arm).


We are removing a call to domain_conn_reset in do_introduce.
We are adding a call to domain_conn_reset in introduce_domain, which is
called right before that point in do_introduce. Yes, there are 2 if
statements, but the domain_conn_reset is added in the right location: the
non-already-introduced non-restore code path.


Are there any paths through do_introduce() that now *won't* get
a domain_conn_reset() call?  Is that OK?

Yes, the already-introduced and the restore code paths. The operations in
the already-introduced or the restore code paths seem simple enough not
to require a domain_conn_reset. Julien and Juergen should confirm.


Is introduce_domain() called in other places?  Will those places now
get extra domain_conn_reset() calls they weren't expecting?  Is that
OK?

introduce_domain is called by dom0_init, but I am guessing that dom0 is
already-introduced so it wouldn't get an extra domain_conn_reset. Julien
and Juergen should confirm.

I don't think this is correct. dom0 will only be introduced via dom0_init().
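
To make the call-path question concrete, here is a toy C model of the
paths discussed in this thread. It is only a sketch: the toy_* types and
functions are hypothetical stand-ins, not xenstored code, and it assumes
that do_introduce() and dom0_init() both call introduce_domain() with
restore == false while the restore path calls it with restore == true.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct domain; not the real definition. */
struct toy_domain {
    bool introduced;   /* was the domain introduced before this call? */
    int resets;        /* number of connection resets performed */
};

/* Stand-in for domain_conn_reset(): only counts invocations. */
static void toy_conn_reset(struct toy_domain *d)
{
    assert(d != NULL);
    d->resets++;
}

/* Models introduce_domain(). With the patch applied, the reset happens
 * only on the not-yet-introduced, non-restore path. */
static void toy_introduce_domain(struct toy_domain *d, bool restore,
                                 bool patched)
{
    if (!d->introduced) {
        d->introduced = true;
        if (!restore) {
            if (patched)
                toy_conn_reset(d);
            /* The real code marks the interface connected and
             * notifies the domain at this point. */
        }
    }
}

/* Models do_introduce(). Without the patch, the reset is unconditional
 * once introduce_domain() has succeeded. */
static void toy_do_introduce(struct toy_domain *d, bool patched)
{
    toy_introduce_domain(d, false, patched);
    if (!patched)
        toy_conn_reset(d);
}

/* The four call paths raised in the thread. */
enum toy_path {
    TOY_NEW_DOMAIN,    /* do_introduce() for a new domain */
    TOY_REINTRODUCE,   /* do_introduce() for an already-introduced domain */
    TOY_RESTORE,       /* introduce_domain() with restore == true */
    TOY_DOM0_INIT      /* dom0_init() calling introduce_domain() directly */
};

/* Returns how many resets a given path performs, with or without
 * the patch. */
static int toy_count_resets(enum toy_path path, bool patched)
{
    struct toy_domain d = { .introduced = false, .resets = 0 };

    switch (path) {
    case TOY_NEW_DOMAIN:
        toy_do_introduce(&d, patched);
        break;
    case TOY_REINTRODUCE:
        d.introduced = true;
        toy_do_introduce(&d, patched);
        break;
    case TOY_RESTORE:
        toy_introduce_domain(&d, true, patched);
        break;
    case TOY_DOM0_INIT:
        toy_introduce_domain(&d, false, patched);
        break;
    }
    return d.resets;
}
```

Under these assumptions, the model keeps one reset for a newly
introduced domain, drops the reset when do_introduce() is invoked for an
already-introduced domain, leaves the restore path without a reset, and
adds one reset on the dom0_init() path, since dom0 is not yet introduced
when dom0_init() runs.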


Juergen
