On 24.06.25 16:51, Andriy Sultanov wrote:
Currently, as far as I am aware, the ability of xenstore clients to properly
handle and detect batch updates is somewhat lacking. Transactions are not
directly visible to the clients watching a particular directory - they will
receive a lot of individual watch events once the transaction is committed,
without any indication of when such updates are going to end.

Clients such as xenopsd from the xapi toolstack rely on xenstore to
track their managed domains, and a flood of individual updates most often
results in a flood of events raised from xenopsd to xapi. (There are
consolidation mechanisms implemented there, with updates getting merged
together, but if xapi picks up update events from the queue quickly enough, it
will only get more update events later.)

The need for batching is fairly evident from the fact that XenServer's Windows
PV drivers, for example, adopted an ad-hoc "batch" optimization (not documented
anywhere, of course), where some sequence of writes is followed by a write of
the value "1" to "data/updated". This used to be honoured by xapi, which would
not consider the guest agent update done until it received notice of such a
"batch ended" update, but it caused xapi to miss updates that were not followed
by such a write, so xapi now ignores this ad-hoc batching. One could imagine
many workarounds here (for example, some sort of a mechanism where xenopsd
stalls an update for a second to see if any more related updates show up and
only then notifies xapi of it, with obvious trade-offs), but IMO it could be
worth considering making this easier on the xenstore side for different
use-cases.

Suggestion:
WATCH_EVENT's req_id and tx_id are currently 0. Could it be possible, for
example, to modify this such that watch events coming as a result of a
transaction commit (a "batch") have tx_id of the corresponding transaction
and req_id of, say, 2 if it's the last such watch event of a batch and 1
otherwise? Old clients would still ignore these values, but it would allow
some others to detect if an update is part of a logical batch that doesn't end
until its last event.
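
To make the suggestion concrete, here is a minimal client-side sketch of how
such marked events could be grouped. It only models the proposal: the values
1 ("more to come") and 2 ("last of batch") for req_id are the ones floated
above, and WatchEvent, BatchCollector and feed() are hypothetical names, not
part of any existing xenstore client library.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Hypothetical req_id markers taken from the suggestion above.
BATCH_MORE = 1  # event belongs to a batch, more events follow
BATCH_LAST = 2  # final event of the batch


@dataclass
class WatchEvent:
    path: str
    token: str
    req_id: int = 0  # 0 today; would carry the batch marker under the proposal
    tx_id: int = 0   # 0 today; would carry the committing transaction's id


class BatchCollector:
    """Group watch events by tx_id until the 'last' marker arrives."""

    def __init__(self) -> None:
        self._pending: Dict[int, List[WatchEvent]] = {}

    def feed(self, ev: WatchEvent) -> Optional[List[WatchEvent]]:
        # Unmarked events (what every xenstored sends today) pass straight through.
        if ev.tx_id == 0 or ev.req_id not in (BATCH_MORE, BATCH_LAST):
            return [ev]
        self._pending.setdefault(ev.tx_id, []).append(ev)
        if ev.req_id == BATCH_LAST:
            # Batch complete: hand over all events of this transaction at once.
            return self._pending.pop(ev.tx_id)
        return None  # batch still open, nothing to report yet
```

A client would call feed() for every incoming event and only notify its upper
layer when a list is returned, so an old xenstored (all zeros) keeps the
current one-event-at-a-time behaviour.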

Is this beyond the scope of what xenstored wants to do? From a first glance,
this does not seem to introduce obvious unwanted information leaks either, but
I could be wrong. I would love to hear if this is something that could be
interesting to others and if this could be considered at all.

The main reason for the large number of watch events after a transaction is
that the watch for, e.g., detecting the addition of a new block device is set
on a node that is common to all potential block devices handled by the
watcher. This results in a watch event for every node modified below that
node, and there are usually quite a lot of them even when only a single
device is added.

The solution for this problem is NOT to batch all the events and to ignore the
majority of those events, but to avoid creating most of those events.

For this reason the Xenstore protocol has been extended to allow limiting how
many node levels below a watched node are relevant for the watch to fire.
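
As a rough illustration of the effect, the following sketch counts which
modifications would fire a watch with and without such a depth limit. It is a
simplified model, not the xenstored implementation: events_for(), the
max_depth parameter and the example paths are assumptions made for the
example, and it ignores other cases in which a watch can fire.

```python
from typing import List, Optional


def relative_depth(watched: str, modified: str) -> Optional[int]:
    """Levels that 'modified' lies below 'watched', or None if not under it."""
    w = [p for p in watched.split("/") if p]
    m = [p for p in modified.split("/") if p]
    if m[:len(w)] != w:
        return None
    return len(m) - len(w)


def events_for(watched: str, modified_paths: List[str],
               max_depth: Optional[int] = None) -> List[str]:
    """Paths that would raise a watch event; max_depth=None means no limit."""
    fired = []
    for path in modified_paths:
        depth = relative_depth(watched, path)
        if depth is None:
            continue  # modification is outside the watched subtree
        if max_depth is not None and depth > max_depth:
            continue  # too deep below the watched node to be relevant
        fired.append(path)
    return fired


# Illustrative nodes written when a single block device is added.
base = "/local/domain/0/backend/vbd"
dev = base + "/5/51712"
modified = [base + "/5", dev, dev + "/frontend", dev + "/frontend-id",
            dev + "/params", dev + "/state"]

unlimited = events_for(base, modified)
limited = events_for(base, modified, max_depth=2)
```

In this model the unlimited watch sees one event per written node, while the
watch limited to two levels only sees the creation of the per-domain and
per-device directories, which is all a watcher needs to notice a new device.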

What is missing so far are Xenstore implementations to support this feature,
and Xenstore users to make use of it. I'm working on supporting this in
C xenstored, but due to other urgent work this will probably land upstream only
in the Xen 4.22 time frame, probably together with Xen tools (libxl) making use
of this feature.


Juergen

