On 19/12/2018 22:33, Tamas K Lengyel wrote:
> On Wed, Dec 19, 2018 at 11:52 AM Petre Pircalabu
> <ppircal...@bitdefender.com> wrote:
>> This patchset is a rework of the "multi-page ring buffer" for vm_events
>> patch based on Andrew Cooper's comments.
>> For synchronous vm_events the ring waitqueue logic was unnecessary, as the
>> vcpu sending the request was blocked until a response was received.
>> To simplify the request/response mechanism, an array of slotted channels
>> was created, one per vcpu. Each vcpu puts the request in the
>> corresponding slot and blocks until the response is received.
>>
>> I'm sending this patch as an RFC because, while I'm still working on a way
>> to measure the overall performance improvement, your feedback would be of
>> great assistance.
> Generally speaking this approach is OK, but I'm concerned that we will
> eventually run into the same problem that brought up the idea of using
> multi-page rings: vm_event structures that are larger than a page.
> Right now this series adds a ring for each vCPU, which does mitigate
> some of the bottleneck, but it does not really address the root cause.
> It also adds significant complexity, as the userspace side now has to
> map in multiple rings, each with its own event channel and polling
> requirements.
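
(Purely for illustration, and not code from the patches: the slotted-channel scheme described in the cover letter boils down to roughly the layout below. All names and sizes here are invented stand-ins for the real vm_event structures.)

/* Illustrative sketch only -- one slot per vcpu.  The vcpu writes its
 * request into its own slot, kicks the event channel, and blocks until
 * the monitor application writes back a response and flips the state. */
#include <stdint.h>

/* Stand-ins for the real vm_event_request_t / vm_event_response_t. */
typedef struct { uint8_t payload[384]; } vm_event_request_t;
typedef struct { uint8_t payload[384]; } vm_event_response_t;

enum vm_event_slot_state {
    SLOT_IDLE,       /* free for the vcpu to use          */
    SLOT_SUBMITTED,  /* request written, vcpu is blocked  */
    SLOT_FINISHED,   /* response written, vcpu may resume */
};

struct vm_event_slot {
    uint32_t state;              /* enum vm_event_slot_state */
    vm_event_request_t  req;     /* written by the vcpu      */
    vm_event_response_t rsp;     /* written by the monitor   */
};

/* The shared area is simply max_vcpus contiguous slots; slot i belongs
 * to vcpu i, so vcpus never contend for space with one another. */
static inline struct vm_event_slot *
slot_for_vcpu(struct vm_event_slot *base, unsigned int vcpu_id)
{
    return &base[vcpu_id];
}

Unlike a shared ring, a synchronous request can therefore never be stalled because other vcpus have filled the buffer.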
I haven't looked at the series in detail yet, but there should explicitly be no issue if/when sizeof(vm_event) exceeds 4k. (In practice, there are many reasons why letting it get that large would be a problem.)

The size of the sync "ring" (slotted mapping?) is exactly sizeof(vm_event) * d->max_vcpus, where sizeof(vm_event) is a function of the interface version, and it should be mapped as a single contiguous block of pages. The resource foreign map interface was designed with this case in mind.

The async ring is a traditional ring, and (eventually?) wants to become a caller-specified size, large enough for any reasonable quantity of queued requests, at which point it can switch to lossy semantics.

There are three angles for doing this work:

1) Avoid cases where the guest's balloon driver can interfere with attaching the monitor ring.

2) Deal more scalably with large numbers of vcpus/events.

3) Remove the final case where Xen needs to wait for the queue to drain, which in turn lets us delete the waitqueue infrastructure (which is horrible in its own right, and breaks one of the safety mechanisms that livepatching relies on).

Frankly, point 3 is the one I care most about (because I can't safely livepatch a system using introspection, and XenServer supports both of these things), whereas the first two are concrete improvements for userspace using the vm_event APIs.

~Andrew
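
P.S. To illustrate the "single contiguous block of pages" point above: from the toolstack side, mapping the whole sync slot area would be one xenforeignmemory_map_resource() call, roughly as sketched below. The resource type id, its value, and the sizing are placeholders for whatever the final hypercall interface ends up defining; only the libxenforeignmemory calls themselves are existing API.

/* Rough sketch: map the per-vcpu vm_event slots as a single contiguous
 * resource, instead of N separately granted/mapped rings. */
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <xenforeignmemory.h>

#define VM_EVENT_RESOURCE_TYPE 3   /* hypothetical XENMEM_resource_* id */

int main(int argc, char **argv)
{
    uint32_t domid = (argc > 1) ? atoi(argv[1]) : 1;
    /* sizeof(vm_event) * max_vcpus, rounded up to whole pages; the real
     * value would come from the domain / interface version. */
    unsigned long nr_frames = 1;
    void *slots = NULL;

    xenforeignmemory_handle *fmem = xenforeignmemory_open(NULL, 0);
    if ( !fmem )
        return 1;

    /* One hypercall maps the whole slot array.  No guest-owned pages are
     * involved, so the guest's balloon driver can't interfere with it. */
    xenforeignmemory_resource_handle *fres =
        xenforeignmemory_map_resource(fmem, domid, VM_EVENT_RESOURCE_TYPE,
                                      0 /* id */, 0 /* frame */, nr_frames,
                                      &slots, PROT_READ | PROT_WRITE, 0);
    if ( !fres )
    {
        xenforeignmemory_close(fmem);
        return 1;
    }

    printf("vm_event slots for dom%u mapped at %p\n", domid, slots);

    xenforeignmemory_unmap_resource(fmem, fres);
    xenforeignmemory_close(fmem);
    return 0;
}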