On 2020-06-15 17:03, Roger Pau Monné wrote:
This way of event channel injection is slightly hackish, and I would
recommend using HVMOP_set_evtchn_upcall_vector; that way vectors will
be properly routed using the lapic.

With HVM_PARAM_CALLBACK_TYPE_VECTOR, vectors are injected without
setting the IRR/ISR bits in the lapic registers.
I picked HVM_PARAM_CALLBACK_TYPE_VECTOR since that seemed like the
easiest option for a uniprocessor, PVH-only guest.
What does <vector> mean in the context of
HVMOP_set_evtchn_upcall_vector? If it's an APIC vector number (sorry,
not too familiar with the post-legacy i8259 world), does that imply that
if I use HVMOP_set_evtchn_upcall_vector I need to do some APIC
initialisation / programming?
All I need for Solo5/Mirage purposes is to have the upcall land on IDT
vector #32 or above.
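For context, the registration I have now is essentially the following
minimal sketch (hypercall_hvm_op() is a stand-in name for our hypercall
wrapper; the struct and constants come from Xen's public hvm_op.h and
params.h):

#define EVTCHN_UPCALL_VECTOR 32    /* first vector above the CPU exceptions */

static void evtchn_register_callback_vector(void)
{
    struct xen_hvm_param p = {
        .domid = DOMID_SELF,
        .index = HVM_PARAM_CALLBACK_IRQ,
        /* val[63:56] == 2 selects HVM_PARAM_CALLBACK_TYPE_VECTOR,
         * val[7:0] is the IDT vector to inject. */
        .value = (2ULL << 56) | EVTCHN_UPCALL_VECTOR
    };
    (void)hypercall_hvm_op(HVMOP_set_param, &p);
}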
Questions:

1. Being based on the Solo5 virtio code, the low-level setup code is
doing the "usual" i8259 PIC setup, to remap the PIC IRQs to vectors 32
and above. Should I be doing this initialisation for Xen PVH at all?
Hm, there are no IO-APICs (or legacy PICs) on a PVH domU, so there's
not much to route. If Solo5 thinks it's somehow configuring them, it's
likely writing to some hole in memory, or to some RAM.
Solo5 only has a very primitive understanding of hardware interrupts, so
it's only configuring the legacy PICs via port IO.
IO-APIC presence is signaled in the ACPI MADT table on a PVH domU.
Hmm, it'd be very unfortunate if I had to deal with ACPI, so here's
hoping that I don't actually need to touch the APIC. I'm not interested
in using the PIC for anything, and all interrupts will be delivered via
Xen event channels.
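(Concretely, the most I'd want to do with the i8259s is mask everything
and then never touch them again; a minimal sketch, assuming the usual
outb() port-IO helper:)

static inline void outb(uint16_t port, uint8_t v)
{
    __asm__ __volatile__ ("outb %0, %1" : : "a"(v), "Nd"(port));
}

/*
 * Mask all lines on both legacy i8259 PICs (data ports 0x21/0xA1).
 * On PVH there are no PICs, so these writes should be harmless no-ops.
 */
static void pic_mask_all(void)
{
    outb(0x21, 0xff);   /* master PIC: mask IRQs 0-7 */
    outb(0xa1, 0xff);   /* slave PIC: mask IRQs 8-15 */
}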
2. Related to the above, the IRQ handler code is ACKing the interrupt
after the handler runs. Should I be doing that? Does ACKing "IRQ" 0 on
the PIC have any interactions with Xen's view of event channels/pending
upcalls?
Which kind of ACKing is it doing? Is it writing to the lapic EOI
register? If so, that would be OK when using
HVMOP_set_evtchn_upcall_vector. If using HVM_PARAM_CALLBACK_TYPE_VECTOR,
there's nothing to ACK, I think.
Legacy PIC EOI via port IO.
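(If I do switch to HVMOP_set_evtchn_upcall_vector, I take it the ACK
would become a lapic EOI instead; something like this sketch, assuming
the lapic is in xAPIC mode at its default MMIO base -- in x2APIC mode
this would be a wrmsr to MSR 0x80b instead:)

#define LAPIC_BASE 0xfee00000UL
#define LAPIC_EOI  0xb0UL

static inline void lapic_eoi(void)
{
    volatile uint32_t *eoi = (volatile uint32_t *)(LAPIC_BASE + LAPIC_EOI);
    *eoi = 0;   /* any value written signals end-of-interrupt */
}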
Next, for a PVHv2, uniprocessor-only guest, is the following flow
sufficient to unmask an event channel?
struct shared_info *s = SHARED_INFO();
int pending = 0;
atomic_sync_btc(port, &s->evtchn_mask[0]);
pending = sync_bt(port, &s->evtchn_mask[0]);
You should check for pending interrupts on evtchn_pending, not
evtchn_mask.
Ah, thanks for spotting that! Fixed, but just that change by itself does
not seem to change the observed behaviour in any way.
if (pending) {
    /*
     * Slow path:
     *
     * If pending is set here, then there was a race, and we lost the
     * upcall. Mask the port again and force an upcall via a call to
     * hyperspace.
     *
     * This should be sufficient for HVM/PVHv2 based on my understanding
     * of Linux drivers/xen/events/events_2l.c.
     */
    atomic_sync_bts(port, &s->evtchn_mask[0]);
    hypercall_evtchn_unmask(port);
}
FWIW, I use the hypercall unconditionally on FreeBSD because I didn't
see a performance difference when compared to this method.
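So, for the record, the corrected flow, combining the evtchn_pending
fix with the existing slow path, would be something like the sketch
below (same helper names as above); per Roger, calling
hypercall_evtchn_unmask(port) unconditionally works just as well:

static void evtchn_unmask(evtchn_port_t port)
{
    struct shared_info *s = SHARED_INFO();

    atomic_sync_btc(port, &s->evtchn_mask[0]);
    if (sync_bt(port, &s->evtchn_pending[0])) {
        /* Event fired while masked: re-mask and force re-delivery. */
        atomic_sync_bts(port, &s->evtchn_mask[0]);
        hypercall_evtchn_unmask(port);
    }
}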
Lastly, the old PV-only Mini-OS based stack would do delays ("block the
domain") by doing a HYPERVISOR_set_timer_op(deadline) followed by a
HYPERVISOR_sched_op(SCHEDOP_block, 0). In the new code, I'm doing the
following (based on what Mini-OS seems to be doing for HVM):
solo5_time_t deadline = Int64_val(v_deadline);

if (solo5_clock_monotonic() < deadline) {
    hypercall_set_timer_op(deadline);
    __asm__ __volatile__ ("hlt" : : : "memory");
    /* XXX: cancel timer_op here if woken up early? */
}
Again, is this the right thing to do for PVH?
hlt will trap into the hypervisor, so it's fine to use.
Thanks for confirming.
As the comment says, do I need to cancel the timer_op?
I'm not sure. Keep in mind that a new call to hypercall_set_timer_op
will overwrite the previous timer, and hence should be fine I think as
long as you are using the one-shot timer.
Is there something other than a one-shot timer? hypercall_set_timer_op
is just a single-argument hypercall with a uint64_t deadline, right?
(And not documented in xen.h either ...)
I understood the semantics to be "fire once at/after the time deadline
is reached"; if that is indeed the case, then with my current VIRQ_TIMER
handler, which does nothing in interrupt context and has no side
effects, I should be fine.
I have no idea how Solo5 works, maybe you should re-set the timer to
the next deadline in the handler?
Or that's fine because the timer is always set before blocking.
Yes, it's always set before blocking, and we always unblock after the
first interrupt (i.e. some event) is received, so should be fine.
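(For the record, the delay path with explicit cancellation would then
look like the sketch below. The cancellation is an assumption on my
part: my reading of Xen's do_set_timer_op() is that a timeout of 0
stops the single-shot timer, and it should be harmless if the timer has
already fired.)

static void delay_until(solo5_time_t deadline)
{
    if (solo5_clock_monotonic() < deadline) {
        hypercall_set_timer_op(deadline);
        __asm__ __volatile__ ("hlt" : : : "memory");
        hypercall_set_timer_op(0);  /* cancel if woken up early */
    }
}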
I can also post the code that does the actual demuxing of events from
Xen (i.e. the upcall handler), but I've read through that several times
now and I don't think the problem is there; adding diagnostic prints to
both the low-level C event channel code and the higher-level OCaml
Activations code confirms that received events are being mapped to their
ports correctly.
I can take a look at the event channel handler if you want. The only
weird thing I've noticed is the error in the unmask, where you seem to
use evtchn_mask instead of evtchn_pending.
Thanks for the offer, this stuff is fairly subtle.
In the Mirage/Xen scenario, there are two parts to the upcall handler.
The top half is executed in interrupt context and basically does nothing
except acknowledge the upcall:
int solo5__xen_evtchn_vector_handler(void *arg __attribute__((unused)))
{
    struct vcpu_info *vi = VCPU0_INFO();

    /*
     * All work is done outside of interrupt context by
     * evtchn_demux_pending(), so all we need to do here is acknowledge
     * the upcall from Xen.
     */
    vi->evtchn_upcall_pending = 0;
    return 1;
}
The bottom half is then called periodically (and always before blocking)
by the OCaml code:
static bool evtchn_demux_pending(void)
{
    struct shared_info *s = SHARED_INFO();
    struct vcpu_info *vi = VCPU0_INFO();
    bool some_pending = false;

    vi->evtchn_upcall_pending = 0;

    /*
     * Demux events received from Xen.
     *
     * pending_l1 is the "outer" per-VCPU selector (evtchn_pending_sel).
     * pending_l2 is the "inner" system-wide word (evtchn_pending[l1i]).
     */
    xen_ulong_t pending_l1, pending_l2;
    atomic_sync_xchg(&vi->evtchn_pending_sel, 0, &pending_l1);
    while (pending_l1 != 0) {
        xen_ulong_t l1i = ffs(pending_l1);

        /*
         * Masking pending_l2 with ~evtchn_mask[l1i] is necessary to
         * pick up only the *currently unmasked* events; otherwise races
         * like the following can occur:
         *
         * 1. X is generated, upcall scheduled by Xen.
         * 2. X is masked.
         * 3. Upcall is delivered.
         * 4. X fires despite now being masked.
         */
        while ((pending_l2 =
                (s->evtchn_pending[l1i] & ~s->evtchn_mask[l1i])) != 0) {
            xen_ulong_t l2i = ffs(pending_l2);
            evtchn_port_t port = (l1i * (sizeof(xen_ulong_t) * 8)) + l2i;

            /*
             * Mark as pending on the OCaml side.
             */
            evtchn_callback_ml[port] = 1;
            some_pending = true;
            atomic_sync_btc(l2i, &s->evtchn_pending[l1i]);
        }
        pending_l1 &= ~(1UL << l1i);
    }
    return some_pending;
}
I would also recommend using HVMOP_set_evtchn_upcall_vector instead of
HVM_PARAM_CALLBACK_TYPE_VECTOR. In order to make some tools happy just
set HVM_PARAM_CALLBACK_TYPE_VECTOR to 1. You can take a look at how
Xen does it when running as a guest:
http://xenbits.xen.org/gitweb/?p=xen.git;a=blob;f=xen/arch/x86/guest/xen/xen.c;h=2ff63d370a8a12fef166677e2ded7ed82a628ce8;hb=HEAD#l205
Thanks for the pointer. As I write above, if I can use
HVMOP_set_evtchn_upcall_vector w/o doing too much "extra work" on the
guest side then I will go with that.
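Roughly, I'd expect the registration Roger describes to look like the
sketch below (the struct and op number are from Xen's public hvm_op.h;
hypercall_hvm_op() is again a stand-in name for our wrapper):

#define HVMOP_set_evtchn_upcall_vector 23

struct xen_hvm_evtchn_upcall_vector {
    uint32_t vcpu;
    uint8_t vector;
};

static void evtchn_register_upcall_vector(uint8_t vector)
{
    struct xen_hvm_evtchn_upcall_vector op = {
        .vcpu = 0,          /* uniprocessor guest: VCPU 0 only */
        .vector = vector    /* a free IDT vector, 32 or above */
    };
    (void)hypercall_hvm_op(HVMOP_set_evtchn_upcall_vector, &op);

    /*
     * Per Roger's suggestion above, HVM_PARAM_CALLBACK_IRQ should
     * additionally be set to 1 to keep the toolstack happy.
     */
}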
Roger.
-mato