On 2/16/21 8:15 AM, Thomas Huth wrote: > On 16/02/2021 15.40, Halil Pasic wrote: >> On Tue, 16 Feb 2021 12:00:56 +0100 >> Thomas Huth <th...@redhat.com> wrote: >> >>> According to the virtio specification, a memory barrier should be >>> used before incrementing the idx field in the "available" ring. >>> So far, we did not do this in the s390-ccw bios yet, but recently >>> Peter Maydell saw problems with the s390-ccw bios when running >>> the qtests on an aarch64 host (the bios panic'ed with the message: >>> "SCSI cannot report LUNs: response VS RESP=09"), which could >>> maybe be related to the missing memory barriers. Thus let's add >>> those barriers now. Since we've only seen the problem on TCG so far, >>> a "bcr 14,0" should be sufficient here to trigger the tcg_gen_mb() >>> in the TCG translate code. >>> >>> (Note: The virtio spec also talks about using a memory barrier >>> *after* incrementing the idx field, but if I understood correctly >>> this is only required when using notification suppression - which >>> we don't use in the s390-ccw bios here) >> >> I suggest to the barrier after incrementing the idx field for two >> reasons. First: If the device were to see the notification, but >> not see the incremented idx field, it would effectively loose >> initiative. That is pretty straight forward, because the >> notification just says 'check out that queue', and if we don't >> see the incremented index, miss the buffer that was made available >> by incrementing idx. > > I was just about to reply that this is certainly not necessary, since > the DIAGNOSE instruction that we use for the notification hypercall > should be serializing anyway ... but after looking at the PoP, it > actually is not marked as a serializing instruction! (while e.g. > SVC - supervisor call - is explicitly marked as serializing) > > So maybe that's worth a try: Peter, could you please apply this patch > on top an see whether it makes a difference? > > diff --git a/pc-bios/s390-ccw/virtio.c b/pc-bios/s390-ccw/virtio.c > --- a/pc-bios/s390-ccw/virtio.c > +++ b/pc-bios/s390-ccw/virtio.c > @@ -54,6 +54,7 @@ static long kvm_hypercall(unsigned long nr, unsigned long > param1, > register ulong r_param3 asm("4") = param3; > register long retval asm("2"); > > + virtio_mb(); > asm volatile ("diag 2,4,0x500" > : "=d" (retval) > : "d" (r_nr), "0" (r_param1), "r"(r_param2), "d"(r_param3)
This is patching the wrong thing, IMO. You're adding barriers to the guest that I think ought not be architecturally required -- memory accesses on s390x are strongly ordered, with or without diag/svc/whatever. With diff --git a/tcg/aarch64/tcg-target.c.inc b/tcg/aarch64/tcg-target.c.inc index 1376cdc404..3c5f38be62 100644 --- a/tcg/aarch64/tcg-target.c.inc +++ b/tcg/aarch64/tcg-target.c.inc @@ -1622,6 +1622,8 @@ static void tcg_out_tlb_read TCGType mask_type; uint64_t compare_mask; + tcg_out_mb(s, TCG_MO_ALL); + mask_type = (TARGET_PAGE_BITS + CPU_TLB_DYN_MAX_BITS > 32 ? TCG_TYPE_I64 : TCG_TYPE_I32); which is a gigantic hammer, adding a host barrier before every qemu guest access, I can no longer provoke a failure (previously visible 1 in 4, now no failures in 100). With that as a data point for success, I'm going to try to use host load-acquire / store-release instructions, and then apply TCG_GUEST_DEFAULT_MO and see if I can find something that works reasonably. r~