Re: [PATCH v4 02/15] hw/display/apple-gfx: Introduce ParavirtualizedGraphics.Framework support

2024-10-28 Thread Akihiko Odaki

On 2024/10/26 19:24, Phil Dennis-Jordan wrote:



On Sat, 26 Oct 2024 at 06:40, Akihiko Odaki wrote:


On 2024/10/26 4:43, Phil Dennis-Jordan wrote:
 >
 >
 > On Fri, 25 Oct 2024 at 08:03, Akihiko Odaki <akihiko.od...@daynix.com> wrote:
 >
 >     On 2024/10/24 19:28, Phil Dennis-Jordan wrote:
 >      > +    /* For running PVG memory-mapping requests in the AIO
context */
 >      > +    QemuCond job_cond;
 >      > +    QemuMutex job_mutex;
 >
 >     Use: QemuEvent
 >
 >
 > Hmm. I think if we were to use that, we would need to create a new
 > QemuEvent for every job and destroy it afterward, which seems
expensive.
 > We can't rule out multiple concurrent jobs being submitted, and the
 > QemuEvent system only supports a single producer as far as I can
tell.
 >
 > You can probably sort of hack around it with just one QemuEvent by
 > putting the qemu_event_wait into a loop and turning the job.done
flag
 > into an atomic (because it would now need to be checked outside the
 > lock) but this all seems unnecessarily complicated considering the
 > QemuEvent uses the same mechanism QemuCond/QemuMutex internally
on macOS
 > (the only platform relevant here), except we can use it as
intended with
 > QemuCond/QemuMutex rather than having to work against the
abstraction.

I don't think it's going to be used concurrently. It would be difficult
to reason even for the framework if it performs memory
unmapping/mapping/reading operations concurrently.


I've just performed a very quick test by wrapping the job submission/ 
wait in the 2 mapMemory callbacks and the 1 readMemory callback with 
atomic counters and logging whenever a counter went above 1.


  * Overall, concurrent callbacks across all types were common (many per 
second when the VM is busy). It's not exactly a "thundering herd" (I 
never saw >2) but it's probably not a bad idea to use a separate 
condition variable for each job type. (task map, surface map, memory read)
  * While I did not observe any concurrent memory mapping operations 
*within* a type of memory map (2 task mappings or 2 surface mappings) I 
did see very occasional concurrent memory *read* callbacks. These would, 
as far as I can tell, not be safe with QemuEvents, unless we placed the 
event inside the job struct and init/destroyed it on every callback 
(which seems like excessive overhead).
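
For reference, the kind of counter wrapping described above might look roughly
like this (a hypothetical instrumentation sketch, not part of the patch; the
callback name is made up, while qatomic_fetch_inc()/qatomic_dec() are the
existing helpers from qemu/atomic.h):

/* Hypothetical instrumentation sketch; not part of the patch. */
#include "qemu/osdep.h"
#include "qemu/atomic.h"

static int read_memory_in_flight;

static void read_memory_callback_traced(void)   /* real arguments omitted */
{
    int in_flight = qatomic_fetch_inc(&read_memory_in_flight) + 1;

    if (in_flight > 1) {
        fprintf(stderr, "concurrent readMemory callbacks: %d\n", in_flight);
    }

    /* ... original readMemory handling: submit the job and await it ... */

    qatomic_dec(&read_memory_in_flight);
}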


I think we can tolerate that overhead. init/destroy essentially sets the 
fields in the data structure and I estimate its total size is about 100 
bytes. It is probably better than waking an irrelevant thread up. I also 
hope that keeps the code simple; it's not worthwhile adding code to 
optimize this.




My recommendation would be to split it up into 3 pairs of mutex/cond; 
this will almost entirely remove any contention, but continue to be safe 
for when it does occur. I don't think QemuEvent is a realistic option 
(too tricky to get right) for the observed-concurrent readMemory 
callback. I'm nervous about assuming the mapMemory callbacks will NEVER 
be called concurrently, but at a push I'll acquiesce to switching those 
to QemuEvent in the absence of evidence of concurrency.

PGDevice.h also notes that raiseInterrupt needs to be thread-safe, while it
doesn't make such notes for memory operations. This actually makes sense.

If it's ever going to be used concurrently, it's better to have
QemuEvent for each job to avoid the thundering herd problem.
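
For illustration, the per-job QemuEvent idea might look roughly like this. This
is a sketch only: the job struct, the helper names and the use of
aio_bh_schedule_oneshot() are assumptions rather than code from the patch,
while qemu_event_init/set/wait/destroy() and
apple_gfx_host_address_for_gpa_range() come from the existing QEMU API and the
header quoted below. (Whether destroying the event right after the wait is
safe is discussed later in this thread.)

/*
 * Sketch only: one QemuEvent per job. Struct and helper names are
 * hypothetical; the QemuEvent calls are the real API from qemu/thread.h.
 */
typedef struct AppleGFXMapJob {
    uint64_t  guest_physical;
    uint64_t  length;
    bool      read_only;
    uintptr_t result;
    QemuEvent done;
} AppleGFXMapJob;

static void apple_gfx_map_job_bh(void *opaque)      /* runs as a BH */
{
    AppleGFXMapJob *job = opaque;

    job->result = apple_gfx_host_address_for_gpa_range(job->guest_physical,
                                                       job->length,
                                                       job->read_only);
    qemu_event_set(&job->done);                     /* wake the submitter */
}

/* Called from the PVG callback thread. */
static uintptr_t apple_gfx_map_sync(uint64_t gpa, uint64_t len, bool ro)
{
    AppleGFXMapJob job = {
        .guest_physical = gpa, .length = len, .read_only = ro,
    };

    qemu_event_init(&job.done, false);
    aio_bh_schedule_oneshot(qemu_get_aio_context(),
                            apple_gfx_map_job_bh, &job);
    qemu_event_wait(&job.done);
    qemu_event_destroy(&job.done);   /* see the destroy race discussed later */
    return job.result;
}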


 >      > +
 >      > +    dispatch_queue_t render_queue;
 >      > +    /* The following fields should only be accessed from
the BQL: */
 >
 >     Perhaps it may be better to document fields that can be accessed
 >     *without* the BQL; most things in QEMU implicitly require the
BQL.
 >
 >      > +    bool gfx_update_requested;
 >      > +    bool new_frame_ready;
 >      > +    bool using_managed_texture_storage;
 >      > +} AppleGFXState;
 >      > +
 >      > +void apple_gfx_common_init(Object *obj, AppleGFXState *s,
const
 >     char* obj_name);
 >      > +void apple_gfx_common_realize(AppleGFXState *s,
 >     PGDeviceDescriptor *desc,
 >      > +                              Error **errp);
 >      > +uintptr_t apple_gfx_host_address_for_gpa_range(uint64_t
 >     guest_physical,
 >      > +                                               uint64_t
length,
 >     bool read_only);
 >      > +void apple_gfx_await_bh_job(AppleGFXState *s, bool
*job_done_flag);
 >      > +
 >      > +#endif
 >      > +
 >      > diff --git a/hw/display/apple-gfx.m b/hw/display/apple-gfx.m
 >      > new file mode 100644
 >      > index 000..46be9

Re: [PATCH 0/2] arm: Add collie and sx functional tests

2024-10-28 Thread Jan Lübbe
On Sun, 2024-10-27 at 20:32 -0700, Guenter Roeck wrote:
> On 10/27/24 15:26, Cédric Le Goater wrote:
> > On 10/27/24 23:11, Guenter Roeck wrote:
> > > On 10/27/24 14:13, Cédric Le Goater wrote:
> > > > On 10/26/24 17:32, Guenter Roeck wrote:
> > > > > On 10/26/24 03:02, Cédric Le Goater wrote:
> > > > > [ ... ]
> > > > > 
> > > > > > 
> > > > > Works for me, though, and it is much better than mandating the 
> > > > > existence
> > > > > of boot partitions.
> > > > 
> > > > Yes. However, if the emmc device was user creatable, we could use :
> > > > 
> > > >    -blockdev 
> > > > node-name=emmc0,driver=file,filename=mmc-ast2600-evb-noboot.raw \
> > > >    -device emmc,bus=sdhci-bus.2,drive=emmc0
> > > > 
> > > > and with boot partitions:
> > > > 
> > > >    -M boot-emmc=true \
> > > >    -blockdev node-name=emmc0,driver=file,filename=mmc-ast2600-evb.raw \
> > > >    -device 
> > > > emmc,bus=sdhci-bus.2,drive=emmc0,boot-partition-size=1048576,boot-config=8
> > > > 
> > > > The above would be my preferred approach if acceptable. The "sd-bus"
> > > > bus identifier should be changed in other machines though.
> > > 
> > > No real preference here, though my understanding is that emmc devices
> > > are by definition built-in, and that is what emmc_class_init() says as 
> > > well.
> > > Also, there does not seem to be an sdhci-bus, only sd-bus, and that does
> > > not support any index values. That may be just my lack of knowledge, 
> > > though.
> > 
> > No, you are right. On a real ast2600-evb, the eMMC device is indeed
> > soldered on the board. But, for testing purposes, it is sometimes
> > interesting to add some flexibility in the machine definition and
> > in the modeling too. This avoids "hard-coding" default devices in
> > the machines and lets the user define their own variant models using
> > the QEMU command line.
> 
> I would agree, but I had a number of my patches rejected because, while
> they would be useful for testing, they would not accurately reflect the
> hardware. So nowadays I've given up even trying to upstream such changes.

My patch to make eMMCs user creatable [1] was applied to target-
arm.next by Peter Maydell [2] last week.

Jan

[1] 
https://lore.kernel.org/qemu-devel/20241015135649.4189256-1-...@pengutronix.de/
[2] 
https://lore.kernel.org/qemu-devel/CAFEAcA9sjszCj=Fu-A-=qQV_jawnomJ-Nqnd=vx2vlkmyz1...@mail.gmail.com/

-- 
Pengutronix e.K.| |
Steuerwalder Str. 21| https://www.pengutronix.de/ |
31137 Hildesheim, Germany   | Phone: +49-5121-206917-0|
Amtsgericht Hildesheim, HRA 2686| Fax:   +49-5121-206917- |





Re: [PATCH] hw/nvme: fix handling of over-committed queues

2024-10-28 Thread Klaus Jensen
On Oct 25 10:45, Keith Busch wrote:
> On Fri, Oct 25, 2024 at 12:50:45PM +0200, Klaus Jensen wrote:
> > @@ -1520,9 +1520,16 @@ static void nvme_post_cqes(void *opaque)
> >          nvme_inc_cq_tail(cq);
> >          nvme_sg_unmap(&req->sg);
> > +
> > +        if (QTAILQ_EMPTY(&sq->req_list) && !nvme_sq_empty(sq)) {
> > +            qemu_bh_schedule(sq->bh);
> > +        }
> > +
> >          QTAILQ_INSERT_TAIL(&sq->req_list, req, entry);
> >      }
> 
> Shouldn't we schedule the bottom half after the req has been added to
> the list? I think everything the callback needs should be written prior to
> calling qemu_bh_schedule().
> 

Not as far as I know. It is only queued up; it won't be executed
immediately. It might run next (ASAP) if we are already in a bottom
half, but not before whatever context we are in returns.
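
A minimal sketch of that behaviour, assuming both the caller and the BH run in
the same AioContext (illustrative only, not code from the patch):

/* Illustrative only: qemu_bh_schedule() does not run the callback
 * synchronously, so state written after the call is still in place when
 * the BH eventually runs from the event loop. */
#include "qemu/osdep.h"
#include "block/aio.h"

static int value;

static void my_bh_cb(void *opaque)
{
    printf("bh sees value = %d\n", value);   /* prints 42 */
}

static void schedule_then_update(AioContext *ctx)
{
    QEMUBH *bh = aio_bh_new(ctx, my_bh_cb, NULL);

    qemu_bh_schedule(bh);   /* only marks the BH as pending */
    value = 42;             /* completes before my_bh_cb() ever runs */
}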




[PATCH] hw/sd/sdcard: Fix calculation of size when using eMMC boot partitions

2024-10-28 Thread Jan Luebbe
The sd_bootpart_offset() function calculates the *runtime* offset which
changes as the guest switches between accessing the main user data area
and the boot partitions by writing to the EXT_CSD_PART_CONFIG_ACC_MASK
bits, so it shouldn't be used to calculate the main user data area size.

Instead, subtract the boot_part_size directly (twice, as there are two
identical boot partitions defined by the eMMC spec).

Suggested-by: Cédric Le Goater 
Signed-off-by: Jan Luebbe 
---
 hw/sd/sd.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/hw/sd/sd.c b/hw/sd/sd.c
index 2d3467c3d956..8430d5ae361c 100644
--- a/hw/sd/sd.c
+++ b/hw/sd/sd.c
@@ -826,7 +826,9 @@ static void sd_reset(DeviceState *dev)
         sect = 0;
     }
     size = sect << HWBLOCK_SHIFT;
-    size -= sd_bootpart_offset(sd);
+    if (sd_is_emmc(sd)) {
+        size -= sd->boot_part_size * 2;
+    }
 
     sect = sd_addr_to_wpnum(size) + 1;
 
-- 
2.39.5




Re: [PATCH] hw/nvme: fix handling of over-committed queues

2024-10-28 Thread Keith Busch
On Mon, Oct 28, 2024 at 10:01:50AM +0100, Klaus Jensen wrote:
> On Oct 25 10:45, Keith Busch wrote:
> > On Fri, Oct 25, 2024 at 12:50:45PM +0200, Klaus Jensen wrote:
> > > @@ -1520,9 +1520,16 @@ static void nvme_post_cqes(void *opaque)
> > >          nvme_inc_cq_tail(cq);
> > >          nvme_sg_unmap(&req->sg);
> > > +
> > > +        if (QTAILQ_EMPTY(&sq->req_list) && !nvme_sq_empty(sq)) {
> > > +            qemu_bh_schedule(sq->bh);
> > > +        }
> > > +
> > >          QTAILQ_INSERT_TAIL(&sq->req_list, req, entry);
> > >      }
> > 
> > Shouldn't we schedule the bottom half after the req has been added to
> > the list? I think everything the callback needs should be written prior to
> > calling qemu_bh_schedule().
> > 
> 
> Not as far as I know. It is only queued up; it won't be executed
> immediately. It might run next (ASAP) if we are already in a bottom
> half, but not before whatever context we are in returns.

Okay. I was trying to come up with an explanation for why Waldek was
still able to reproduce the problem, and that's all I have so far.



Re: [PATCH v4 02/15] hw/display/apple-gfx: Introduce ParavirtualizedGraphics.Framework support

2024-10-28 Thread Akihiko Odaki

On 2024/10/28 22:31, Phil Dennis-Jordan wrote:



On Mon, 28 Oct 2024 at 10:00, Phil Dennis-Jordan wrote:



 >      >
 >      > Hmm. I think if we were to use that, we would need to
create a new
 >      > QemuEvent for every job and destroy it afterward,
which seems
 >     expensive.
 >      > We can't rule out multiple concurrent jobs being
submitted, and the
 >      > QemuEvent system only supports a single producer as
far as I can
 >     tell.
 >      >
 >      > You can probably sort of hack around it with just one
QemuEvent by
 >      > putting the qemu_event_wait into a loop and turning
the job.done
 >     flag
 >      > into an atomic (because it would now need to be
checked outside the
 >      > lock) but this all seems unnecessarily complicated
considering the
 >      > QemuEvent uses the same mechanism QemuCond/QemuMutex
internally
 >     on macOS
 >      > (the only platform relevant here), except we can use it as
 >     intended with
 >      > QemuCond/QemuMutex rather than having to work against the
 >     abstraction.
 >
 >     I don't think it's going to be used concurrently. It
would be difficult
 >     to reason even for the framework if it performs memory
 >     unmapping/mapping/reading operations concurrently.
 >
 >
 > I've just performed a very quick test by wrapping the job
submission/
 > wait in the 2 mapMemory callbacks and the 1 readMemory
callback with
 > atomic counters and logging whenever a counter went above 1.
 >
 >   * Overall, concurrent callbacks across all types were
common (many per
 > second when the VM is busy). It's not exactly a "thundering
herd" (I
 > never saw >2) but it's probably not a bad idea to use a separate
 > condition variable for each job type. (task map, surface map,
memory read)
 >   * While I did not observe any concurrent memory mapping
operations
 > *within* a type of memory map (2 task mappings or 2 surface
mappings) I
 > did see very occasional concurrent memory *read* callbacks.
These would,
 > as far as I can tell, not be safe with QemuEvents, unless we
placed the
 > event inside the job struct and init/destroyed it on every
callback
 > (which seems like excessive overhead).

I think we can tolerate that overhead. init/destroy essentially
sets the
fields in the data structure and I estimate its total size is
about 100
bytes. It is probably better than waking an irrelevant thread
up. I also
hope that keeps the code simple; it's not worthwhile adding code to
optimize this.


At least pthread_cond_{init,destroy} and
pthread_mutex_{init,destroy} don't make any syscalls, so yeah it's
probably an acceptable overhead.


I've just experimented with QemuEvents created on-demand and ran into 
some weird deadlocks, which then made me sit down and think about it 
some more. I've come to the conclusion that creating (and crucially, 
destroying) QemuEvents on demand in this way is not safe.


Specifically, you must not call qemu_event_destroy() - which 
transitively destroys the mutex and condition variable - unless you can 
guarantee that the qemu_event_set() call on that event object has completed.


In qemu_event_set, the event object's value is atomically set to EV_SET. 
If the previous value was EV_BUSY, qemu_futex_wake() is called. All of 
this is outside any mutex, however, so apart from memory coherence 
(there are barriers) this can race with the waiting thread. 
qemu_event_wait() reads the event's value. If EV_FREE, it's atomically 
set to EV_BUSY. Then the mutex is locked, the value is checked again, 
and if it's still EV_BUSY, it waits for the condition variable, 
otherwise the mutex is immediately unlocked again. If the trigger 
thread's qemu_event_set() flip to EV_SET occurs between the waiting 
thread's two atomic reads of the value, the waiting thread will never 
wait for the condition variable, but the trigger thread WILL try to 
acquire the mutex and signal the condition variable in 
qemu_futex_wake(), by which time the waiting thread may have advanced 
outside of qemu_event_wait().


Sorry if I'm making a mistake again, but the waiting thread won't set it to 
EV_BUSY unless the value is EV_FREE on the second read, so the trigger 
thread will not call qemu_futex_wake() if it manages to set it to EV_SET 
before the second read, will it?




This is all fine usually, BUT if you destroy the QemuEvent immediately 
after the qemu_event_wait() call, qemu_futex_wake() may try to lock a 

Re: [PATCH v4 02/15] hw/display/apple-gfx: Introduce ParavirtualizedGraphics.Framework support

2024-10-28 Thread Phil Dennis-Jordan
On Mon, 28 Oct 2024 at 15:02, Akihiko Odaki 
wrote:

> On 2024/10/28 22:31, Phil Dennis-Jordan wrote:
> >
> >
> > On Mon, 28 Oct 2024 at 10:00, Phil Dennis-Jordan wrote:
> >
> >
> >  >  >
> >  >  > Hmm. I think if we were to use that, we would need to
> > create a new
> >  >  > QemuEvent for every job and destroy it afterward,
> > which seems
> >  > expensive.
> >  >  > We can't rule out multiple concurrent jobs being
> > submitted, and the
> >  >  > QemuEvent system only supports a single producer as
> > far as I can
> >  > tell.
> >  >  >
> >  >  > You can probably sort of hack around it with just one
> > QemuEvent by
> >  >  > putting the qemu_event_wait into a loop and turning
> > the job.done
> >  > flag
> >  >  > into an atomic (because it would now need to be
> > checked outside the
> >  >  > lock) but this all seems unnecessarily complicated
> > considering the
> >  >  > QemuEvent uses the same mechanism QemuCond/QemuMutex
> > internally
> >  > on macOS
> >  >  > (the only platform relevant here), except we can use
> it as
> >  > intended with
> >  >  > QemuCond/QemuMutex rather than having to work against
> the
> >  > abstraction.
> >  >
> >  > I don't think it's going to be used concurrently. It
> > would be difficult
> >  > to reason even for the framework if it performs memory
> >  > unmapping/mapping/reading operations concurrently.
> >  >
> >  >
> >  > I've just performed a very quick test by wrapping the job
> > submission/
> >  > wait in the 2 mapMemory callbacks and the 1 readMemory
> > callback with
> >  > atomic counters and logging whenever a counter went above 1.
> >  >
> >  >   * Overall, concurrent callbacks across all types were
> > common (many per
> >  > second when the VM is busy). It's not exactly a "thundering
> > herd" (I
> >  > never saw >2) but it's probably not a bad idea to use a
> separate
> >  > condition variable for each job type. (task map, surface map,
> > memory read)
> >  >   * While I did not observe any concurrent memory mapping
> > operations
> >  > *within* a type of memory map (2 task mappings or 2 surface
> > mappings) I
> >  > did see very occasional concurrent memory *read* callbacks.
> > These would,
> >  > as far as I can tell, not be safe with QemuEvents, unless we
> > placed the
> >  > event inside the job struct and init/destroyed it on every
> > callback
> >  > (which seems like excessive overhead).
> >
> > I think we can tolerate that overhead. init/destroy essentially
> > sets the
> > fields in the data structure and I estimate its total size is
> > about 100
> > bytes. It is probably better than waking an irrelevant thread
> > up. I also
> > hope that keeps the code simple; it's not worthwhile adding code
> to
> > optimize this.
> >
> >
> > At least pthread_cond_{init,destroy} and
> > pthread_mutex_{init,destroy} don't make any syscalls, so yeah it's
> > probably an acceptable overhead.
> >
> >
> > I've just experimented with QemuEvents created on-demand and ran into
> > some weird deadlocks, which then made me sit down and think about it
> > some more. I've come to the conclusion that creating (and crucially,
> > destroying) QemuEvents on demand in this way is not safe.
> >
> > Specifically, you must not call qemu_event_destroy() - which
> > transitively destroys the mutex and condition variable - unless you can
> > guarantee that the qemu_event_set() call on that event object has
> completed.
> >
> > In qemu_event_set, the event object's value is atomically set to EV_SET.
> > If the previous value was EV_BUSY, qemu_futex_wake() is called. All of
> > this is outside any mutex, however, so apart from memory coherence
> > (there are barriers) this can race with the waiting thread.
> > qemu_event_wait() reads the event's value. If EV_FREE, it's atomically
> > set to EV_BUSY. Then the mutex is locked, the value is checked again,
> > and if it's still EV_BUSY, it waits for the condition variable,
> > otherwise the mutex is immediately unlocked again. If the trigger
> > thread's qemu_event_set() flip to EV_SET occurs between the waiting
> > thread's two atomic reads of the value, the waiting thread will never
> > wait for the condition variable, but the trigger thread WILL try to
> > acquire the mutex and signal the condition variable in
> > qemu_futex_wake(), by which  time the waiting thread may 

Re: [PATCH v4 02/15] hw/display/apple-gfx: Introduce ParavirtualizedGraphics.Framework support

2024-10-28 Thread Phil Dennis-Jordan
On Mon, 28 Oct 2024 at 10:00, Phil Dennis-Jordan  wrote:

>
> >  >
>> >  > Hmm. I think if we were to use that, we would need to create a
>> new
>> >  > QemuEvent for every job and destroy it afterward, which seems
>> > expensive.
>> >  > We can't rule out multiple concurrent jobs being submitted, and
>> the
>> >  > QemuEvent system only supports a single producer as far as I can
>> > tell.
>> >  >
>> >  > You can probably sort of hack around it with just one QemuEvent
>> by
>> >  > putting the qemu_event_wait into a loop and turning the job.done
>> > flag
>> >  > into an atomic (because it would now need to be checked outside
>> the
>> >  > lock) but this all seems unnecessarily complicated considering
>> the
>> >  > QemuEvent uses the same mechanism QemuCond/QemuMutex internally
>> > on macOS
>> >  > (the only platform relevant here), except we can use it as
>> > intended with
>> >  > QemuCond/QemuMutex rather than having to work against the
>> > abstraction.
>> >
>> > I don't think it's going to be used concurrently. It would be
>> difficult
>> > to reason even for the framework if it performs memory
>> > unmapping/mapping/reading operations concurrently.
>> >
>> >
>> > I've just performed a very quick test by wrapping the job submission/
>> > wait in the 2 mapMemory callbacks and the 1 readMemory callback with
>> > atomic counters and logging whenever a counter went above 1.
>> >
>> >   * Overall, concurrent callbacks across all types were common (many
>> per
>> > second when the VM is busy). It's not exactly a "thundering herd" (I
>> > never saw >2) but it's probably not a bad idea to use a separate
>> > condition variable for each job type. (task map, surface map, memory
>> read)
>> >   * While I did not observe any concurrent memory mapping operations
>> > *within* a type of memory map (2 task mappings or 2 surface mappings) I
>> > did see very occasional concurrent memory *read* callbacks. These
>> would,
>> > as far as I can tell, not be safe with QemuEvents, unless we placed the
>> > event inside the job struct and init/destroyed it on every callback
>> > (which seems like excessive overhead).
>>
>> I think we can tolerate that overhead. init/destroy essentially sets the
>> fields in the data structure and I estimate its total size is about 100
>> bytes. It is probably better than waking an irrelevant thread up. I also
>> hope that keeps the code simple; it's not worthwhile adding code to
>> optimize this.
>>
>
> At least pthread_cond_{init,destroy} and pthread_mutex_{init,destroy}
> don't make any syscalls, so yeah it's probably an acceptable overhead.
>

I've just experimented with QemuEvents created on-demand and ran into some
weird deadlocks, which then made me sit down and think about it some more.
I've come to the conclusion that creating (and crucially, destroying)
QemuEvents on demand in this way is not safe.

Specifically, you must not call qemu_event_destroy() - which transitively
destroys the mutex and condition variable - unless you can guarantee that
the qemu_event_set() call on that event object has completed.

In qemu_event_set, the event object's value is atomically set to EV_SET. If
the previous value was EV_BUSY, qemu_futex_wake() is called. All of this is
outside any mutex, however, so apart from memory coherence (there are
barriers) this can race with the waiting thread. qemu_event_wait() reads
the event's value. If EV_FREE, it's atomically set to EV_BUSY. Then the
mutex is locked, the value is checked again, and if it's still EV_BUSY, it
waits for the condition variable, otherwise the mutex is immediately
unlocked again. If the trigger thread's qemu_event_set() flip to EV_SET
occurs between the waiting thread's two atomic reads of the value, the
waiting thread will never wait for the condition variable, but the trigger
thread WILL try to acquire the mutex and signal the condition variable in
qemu_futex_wake(), by which time the waiting thread may have advanced
outside of qemu_event_wait().

This is all fine usually, BUT if you destroy the QemuEvent immediately
after the qemu_event_wait() call, qemu_futex_wake() may try to lock a mutex
that has been destroyed, or signal a condition variable which has been
destroyed. I don't see a reasonable way of making this safe other than
using long-lived mutexes and condition variables. And anyway, we have much,
MUCH bigger contention/performance issues coming from almost everything
being covered by the BQL. (If waking these callbacks can even be considered
an issue: I haven't seen it show up in profiling, whereas BQL contention
very much does.)
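
Written out as a timeline, the interleaving described above looks roughly like
this (a sketch of the non-futex code path as described in this thread, not
verified line-by-line against util/qemu-thread-posix.c):

/*
 * Waiter (PVG callback thread)          Trigger (BH in AIO context)
 * ----------------------------          ---------------------------
 * qemu_event_wait(&job->done):
 *   load value            == EV_FREE
 *   cmpxchg EV_FREE -> EV_BUSY
 *                                       qemu_event_set(&job->done):
 *                                         xchg value -> EV_SET, old == EV_BUSY
 *   lock internal mutex
 *   re-check value        == EV_SET
 *   unlock mutex, return without waiting
 * qemu_event_destroy(&job->done);
 *   (mutex and cond are now gone)
 *                                         qemu_futex_wake():
 *                                           lock destroyed mutex    <-- UB
 *                                           signal destroyed cond   <-- UB
 */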

I'll submit v5 of this patch set with separate condition variables for each
job type. This should make the occurrence of waking the wrong thread quite
rare, while reasoning about correctness is pretty straightforward. I think
that's good enough.
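
For reference, the per-job-type variant might end up looking something like
the sketch below (the struct layout, shared mutex and helper names are
assumptions, not the actual v5 code; only the QemuMutex/QemuCond calls are the
existing API):

/* Sketch only: one condition variable per job type, sharing one mutex. */
typedef struct AppleGFXJobWaiters {
    QemuMutex mutex;
    QemuCond  task_map_cond;
    QemuCond  surface_map_cond;
    QemuCond  mem_read_cond;
} AppleGFXJobWaiters;

/* Waiting side (PVG callback thread), one call per job type: */
static void await_job(AppleGFXJobWaiters *w, QemuCond *cond, bool *done_flag)
{
    qemu_mutex_lock(&w->mutex);
    while (!*done_flag) {
        qemu_cond_wait(cond, &w->mutex);
    }
    *done_flag = false;                  /* reset for the next job */
    qemu_mutex_unlock(&w->mutex);
}

/* Completion side (BH running in the AIO context): */
static void complete_job(AppleGFXJobWaiters *w, QemuCond *cond, bool *done_flag)
{
    qemu_mutex_lock(&w->mutex);
    *done_flag = true;
    qemu_cond_signal(cond);              /* at most one waiter per job type */
    qemu_mutex_unlock(&w->mutex);
}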


Re: [PATCH v4 02/15] hw/display/apple-gfx: Introduce ParavirtualizedGraphics.Framework support

2024-10-28 Thread Akihiko Odaki

On 2024/10/28 23:13, Phil Dennis-Jordan wrote:



On Mon, 28 Oct 2024 at 15:02, Akihiko Odaki wrote:


On 2024/10/28 22:31, Phil Dennis-Jordan wrote:
 >
 >
 > On Mon, 28 Oct 2024 at 10:00, Phil Dennis-Jordan <p...@philjordan.eu> wrote:
 >
 >
 >          >      >
 >          >      > Hmm. I think if we were to use that, we would
need to
 >         create a new
 >          >      > QemuEvent for every job and destroy it afterward,
 >         which seems
 >          >     expensive.
 >          >      > We can't rule out multiple concurrent jobs being
 >         submitted, and the
 >          >      > QemuEvent system only supports a single producer as
 >         far as I can
 >          >     tell.
 >          >      >
 >          >      > You can probably sort of hack around it with
just one
 >         QemuEvent by
 >          >      > putting the qemu_event_wait into a loop and turning
 >         the job.done
 >          >     flag
 >          >      > into an atomic (because it would now need to be
 >         checked outside the
 >          >      > lock) but this all seems unnecessarily complicated
 >         considering the
 >          >      > QemuEvent uses the same mechanism QemuCond/
QemuMutex
 >         internally
 >          >     on macOS
 >          >      > (the only platform relevant here), except we
can use it as
 >          >     intended with
 >          >      > QemuCond/QemuMutex rather than having to work
against the
 >          >     abstraction.
 >          >
 >          >     I don't think it's going to be used concurrently. It
 >         would be difficult
 >          >     to reason even for the framework if it performs memory
 >          >     unmapping/mapping/reading operations concurrently.
 >          >
 >          >
 >          > I've just performed a very quick test by wrapping the job
 >         submission/
 >          > wait in the 2 mapMemory callbacks and the 1 readMemory
 >         callback with
 >          > atomic counters and logging whenever a counter went
above 1.
 >          >
 >          >   * Overall, concurrent callbacks across all types were
 >         common (many per
 >          > second when the VM is busy). It's not exactly a
"thundering
 >         herd" (I
 >          > never saw >2) but it's probably not a bad idea to use
a separate
 >          > condition variable for each job type. (task map,
surface map,
 >         memory read)
 >          >   * While I did not observe any concurrent memory mapping
 >         operations
 >          > *within* a type of memory map (2 task mappings or 2
surface
 >         mappings) I
 >          > did see very occasional concurrent memory *read*
callbacks.
 >         These would,
 >          > as far as I can tell, not be safe with QemuEvents,
unless we
 >         placed the
 >          > event inside the job struct and init/destroyed it on every
 >         callback
 >          > (which seems like excessive overhead).
 >
 >         I think we can tolerate that overhead. init/destroy
essentially
 >         sets the
 >         fields in the data structure and I estimate its total size is
 >         about 100
 >         bytes. It is probably better than waking an irrelevant thread
 >         up. I also
 >         hope that keeps the code simple; it's not worthwhile
adding code to
 >         optimize this.
 >
 >
 >     At least pthread_cond_{init,destroy} and
 >     pthread_mutex_{init,destroy} don't make any syscalls, so yeah
it's
 >     probably an acceptable overhead.
 >
 >
 > I've just experimented with QemuEvents created on-demand and ran
into
 > some weird deadlocks, which then made me sit down and think about it
 > some more. I've come to the conclusion that creating (and crucially,
 > destroying) QemuEvents on demand in this way is not safe.
 >
 > Specifically, you must not call qemu_event_destroy() - which
 > transitively destroys the mutex and condition variable - unless
you can
 > guarantee that the qemu_event_set() call on that event object has
completed.
 >
 > In qemu_event_set, the event object's value is atomically set to
EV_SET.
 > If the previous value was EV_BUSY, qemu_futex_wake() is called.
All of
 > this is outside any mutex, however, so apart from memory coherence
 > (there are barriers) this can race with the waiting thread.
 > qemu_event_wait() reads the event's value. If EV_FREE, it's
atomically
 > set to EV_BUSY. Then the mutex is l

Re: [PATCH v4 02/15] hw/display/apple-gfx: Introduce ParavirtualizedGraphics.Framework support

2024-10-28 Thread Phil Dennis-Jordan
On Mon, 28 Oct 2024 at 17:06, Akihiko Odaki 
wrote:

> On 2024/10/28 23:13, Phil Dennis-Jordan wrote:
> >
> >
> > On Mon, 28 Oct 2024 at 15:02, Akihiko Odaki wrote:
> >
> > On 2024/10/28 22:31, Phil Dennis-Jordan wrote:
> >  >
> >  >
> >  > On Mon, 28 Oct 2024 at 10:00, Phil Dennis-Jordan <p...@philjordan.eu> wrote:
> >  >
> >  >
> >  >  >  >
> >  >  >  > Hmm. I think if we were to use that, we would
> > need to
> >  > create a new
> >  >  >  > QemuEvent for every job and destroy it
> afterward,
> >  > which seems
> >  >  > expensive.
> >  >  >  > We can't rule out multiple concurrent jobs being
> >  > submitted, and the
> >  >  >  > QemuEvent system only supports a single
> producer as
> >  > far as I can
> >  >  > tell.
> >  >  >  >
> >  >  >  > You can probably sort of hack around it with
> > just one
> >  > QemuEvent by
> >  >  >  > putting the qemu_event_wait into a loop and
> turning
> >  > the job.done
> >  >  > flag
> >  >  >  > into an atomic (because it would now need to be
> >  > checked outside the
> >  >  >  > lock) but this all seems unnecessarily
> complicated
> >  > considering the
> >  >  >  > QemuEvent uses the same mechanism QemuCond/
> > QemuMutex
> >  > internally
> >  >  > on macOS
> >  >  >  > (the only platform relevant here), except we
> > can use it as
> >  >  > intended with
> >  >  >  > QemuCond/QemuMutex rather than having to work
> > against the
> >  >  > abstraction.
> >  >  >
> >  >  > I don't think it's going to be used concurrently.
> It
> >  > would be difficult
> >  >  > to reason even for the framework if it performs
> memory
> >  >  > unmapping/mapping/reading operations concurrently.
> >  >  >
> >  >  >
> >  >  > I've just performed a very quick test by wrapping the
> job
> >  > submission/
> >  >  > wait in the 2 mapMemory callbacks and the 1 readMemory
> >  > callback with
> >  >  > atomic counters and logging whenever a counter went
> > above 1.
> >  >  >
> >  >  >   * Overall, concurrent callbacks across all types were
> >  > common (many per
> >  >  > second when the VM is busy). It's not exactly a
> > "thundering
> >  > herd" (I
> >  >  > never saw >2) but it's probably not a bad idea to use
> > a separate
> >  >  > condition variable for each job type. (task map,
> > surface map,
> >  > memory read)
> >  >  >   * While I did not observe any concurrent memory
> mapping
> >  > operations
> >  >  > *within* a type of memory map (2 task mappings or 2
> > surface
> >  > mappings) I
> >  >  > did see very occasional concurrent memory *read*
> > callbacks.
> >  > These would,
> >  >  > as far as I can tell, not be safe with QemuEvents,
> > unless we
> >  > placed the
> >  >  > event inside the job struct and init/destroyed it on
> every
> >  > callback
> >  >  > (which seems like excessive overhead).
> >  >
> >  > I think we can tolerate that overhead. init/destroy
> > essentially
> >  > sets the
> >  > fields in the data structure and I estimate its total
> size is
> >  > about 100
> >  > bytes. It is probably better than waking an irrelevant
> thread
> >  > up. I also
> >  > hope that keeps the code simple; it's not worthwhile
> > adding code to
> >  > optimize this.
> >  >
> >  >
> >  > At least pthread_cond_{init,destroy} and
> >  > pthread_mutex_{init,destroy} don't make any syscalls, so yeah
> > it's
> >  > probably an acceptable overhead.
> >  >
> >  >
> >  > I've just experimented with QemuEvents created on-demand and ran
> > into
> >  > some weird deadlocks, which then made me sit down and think about
> it
> >  > some more. I've come to the conclusion that creating (and
> crucially,
> >  > destroying) QemuEvents on demand in this way is not safe.
> >  >
> >  > Specifically, you must not call qemu_event_destroy() - which
> >  > transitively destroys the mutex and condition variable - unless
> > you can
> >