On 2/17/21 3:15 AM, Peter Maydell wrote:
> This isn't aarch64-host-specific, though, is it? It's going to be
> the situation for any host with a relaxed memory model.

Yes. I intend to make the code-generation changes generic.

> Do we really
> want to make all loads and stores lower-performance by adding in
> the ldacq/strel (or worse, barriers everywhere on host archs without
> ldacq/strel)?

Well, yes. But then we get to enable mttcg too.

> I feel like there ought to be an alternate approach
> involving using some kind of exclusion to ensure that we don't run
> the iothreads in parallel with the vCPU thread if we're using the
> non-MTTCG setup where all the vCPUs are on a single thread, and that
> that's probably less of a perf hit.

I don't know where to put such a block, do you?

The memory barriers are a perf hit with -smp 1, but I would think that
all that and more are recoverable by not having to run -smp 2 serially.

r~