On Wed, 17 Feb 2021 at 04:31, Richard Henderson
<richard.hender...@linaro.org> wrote:
> On 2/16/21 8:15 AM, Thomas Huth wrote:
>
> With
>
> diff --git a/tcg/aarch64/tcg-target.c.inc b/tcg/aarch64/tcg-target.c.inc
> index 1376cdc404..3c5f38be62 100644
> --- a/tcg/aarch64/tcg-target.c.inc
> +++ b/tcg/aarch64/tcg-target.c.inc
> @@ -1622,6 +1622,8 @@ static void tcg_out_tlb_read
>      TCGType mask_type;
>      uint64_t compare_mask;
>
> +    tcg_out_mb(s, TCG_MO_ALL);
> +
>      mask_type = (TARGET_PAGE_BITS + CPU_TLB_DYN_MAX_BITS > 32
>                   ? TCG_TYPE_I64 : TCG_TYPE_I32);
>
> which is a gigantic hammer, adding a host barrier before every qemu guest
> access, I can no longer provoke a failure (previously visible 1 in 4, now no
> failures in 100).
>
> With that as a data point for success, I'm going to try to use host
> load-acquire / store-release instructions, and then apply TCG_GUEST_DEFAULT_MO
> and see if I can find something that works reasonably.
This isn't aarch64-host-specific, though, is it? It's going to be the
situation for any host with a relaxed memory model. Do we really want to
make all loads and stores lower-performance by adding in the ldacq/strel
(or worse, barriers everywhere on host archs without ldacq/strel)?

I feel like there ought to be an alternate approach involving using some
kind of exclusion to ensure that we don't run the iothreads in parallel
with the vCPU thread if we're using the non-MTTCG setup where all the
vCPUs are on a single thread, and that that's probably less of a perf hit.

thanks
-- PMM
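[For readers following along: the trade-off being debated — a full barrier
before every guest access versus cheaper load-acquire/store-release pairs —
can be sketched with C11 atomics. This is an illustrative sketch only, not
QEMU code; the function names `publish`/`consume` are hypothetical. On
aarch64 the release store and acquire load below typically compile to
STLR/LDAR, while a full barrier corresponds to what `tcg_out_mb(s,
TCG_MO_ALL)` emits, roughly `atomic_thread_fence(memory_order_seq_cst)`.]

#include <assert.h>
#include <stdatomic.h>

static _Atomic int data;
static _Atomic int flag;

/* Producer: publish data, then set flag with release ordering, so the
 * data store cannot be reordered after the flag store. */
static void publish(int value)
{
    atomic_store_explicit(&data, value, memory_order_relaxed);
    atomic_store_explicit(&flag, 1, memory_order_release);
}

/* Consumer: an acquire load of the flag guarantees that once it
 * observes flag == 1, the earlier data store is visible too.
 * A full fence would give the same guarantee at higher cost. */
static int consume(void)
{
    while (!atomic_load_explicit(&flag, memory_order_acquire)) {
        /* spin until the producer publishes */
    }
    return atomic_load_explicit(&data, memory_order_relaxed);
}

int main(void)
{
    publish(42);
    assert(consume() == 42);
    return 0;
}

(Run single-threaded here purely to keep the sketch self-contained; the
ordering guarantees matter when publish and consume run on different
threads, as with iothreads racing a vCPU thread.)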