RE: [PATCH RFC V3 00/29] Support of Virtual CPU Hotplug for ARMv8 Arch

Salil Mehta via Wed, 04 Sep 2024 09:04:20 -0700

Hi Alex,

>  From: Alex Bennée <alex.ben...@linaro.org>
>  Sent: Wednesday, September 4, 2024 4:46 PM
>  To: Salil Mehta <salil.me...@huawei.com>
>  
>  Salil Mehta <salil.me...@huawei.com> writes:
>  
>  > Hi Alex,
>  >
>  >>  -----Original Message-----
>  >>  From: Alex Bennée <alex.ben...@linaro.org>
>  >>  Sent: Thursday, August 29, 2024 11:00 AM
>  >>  To: Gustavo Romero <gustavo.rom...@linaro.org>
>  >>
>  >>  Gustavo Romero <gustavo.rom...@linaro.org> writes:
>  >>
>  >>  > Hi Salil,
>  >>  >
>  >>  > On 6/13/24 8:36 PM, Salil Mehta via wrote:
>  >>  <snip>
>  >>  >> (VI) Commands Used
>  >>  >> ==================
>  >>  >> A. Qemu launch commands to init the machine:
>  >>  >>      $ qemu-system-aarch64 --enable-kvm -machine virt,gic-version=3 \
>  >>  >>        -cpu host -smp cpus=4,maxcpus=6 \
>  >>  >>        -m 300M \
>  >>  >>        -kernel Image \
>  >>  >>        -initrd rootfs.cpio.gz \
>  >>  >>        -append "console=ttyAMA0 root=/dev/ram rdinit=/init maxcpus=2 acpi=force" \
>  >>  >>        -nographic \
>  >>  >>        -bios QEMU_EFI.fd
>  >>  >> B. Hot-(un)plug related commands:
>  >>  >>    # Hotplug a host vCPU (accel=kvm):
>  >>  >>      $ device_add host-arm-cpu,id=core4,core-id=4
>  >>  >>    # Hotplug a vCPU (accel=tcg):
>  >>  >>      $ device_add cortex-a57-arm-cpu,id=core4,core-id=4
>  >>  >
>  >>  > Since support for hotplug is disabled on TCG, remove these two lines
>  >>  > in v4 cover letter?
>  >>
>  >>  Why is it disabled for TCG? We should aim for TCG being as close to
>  >>  KVM as possible for developers even if it is not a production solution.
>  >
>  > Agreed in principle. Yes, that would help.
>  >
>  >
>  > Context on why it was disabled, although most of the code to support TCG exists:
>  >
>  > I had reported a crash in the RFC V1 (June 2020) about a TCGContext
>  > counter overflow assertion during repeated hot-(un)plug operations.
>  > Miguel from Oracle was able to reproduce this problem last year in Feb
>  > and also suggested a fix, but he later found in his testing that there
>  > was a problem during migration.
>  >
>  > RFC V1 June 2020:
>  > https://lore.kernel.org/qemu-devel/20200613213629.21984-1-salil.mehta@huawei.com/
>  > Scroll down to:
>  > [...]
>  > THINGS TO DO:
>  >  (*) Migration support
>  >  (*) TCG/Emulation support is not proper right now. Works to a certain extent
>  >      but is not complete, especially the unrealize part, in which there is an
>  >      overflow of tcg contexts. The latter is due to the fact that tcg maintains a
>  >      count of the number of contexts (per thread instance), so as we hotplug the
>  >      vcpus this counter keeps on incrementing. But during hot-unplug the counter
>  >      is not decremented.
>  
>  Right so the translation cache is segmented by vCPU to support parallel JIT
>  operations. The easiest solution would be to ensure we dimension for the
>  maximum number of vCPUs, which it should already, see
>  tcg_init_machine():
>  
>    unsigned max_cpus = ms->smp.max_cpus;
>    ...
>    tcg_init(s->tb_size * MiB, s->splitwx_enabled, max_cpus);



Agreed. We have done that and have a patch for that as well. But it is still
a work-in-progress and I've lost context a bit.

https://github.com/salil-mehta/qemu/commit/107cf5ca7cf3716bc0f8c68e98e1da3939f449ce
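
For anyone following along, the failure mode from the RFC V1 TODO quoted above can be shown with a small standalone sketch. This is not QEMU code and it is not the patch linked above: `ctx_pool`, `ctx_alloc`, `ctx_release` and `run_hotplug_cycles` are hypothetical names, and handing slots back on unplug is only one assumed way of keeping the count bounded by maxcpus.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a pool of per-vCPU-thread translation contexts. */
#define MAX_CTXS 6              /* plays the role of -smp maxcpus=6 */

struct ctx_pool {
    bool in_use[MAX_CTXS];      /* slot occupancy, only consulted when recycling */
    size_t n_allocated;         /* counter that only ever grows */
};

/* Returns a slot index, or -1 where real code would trip an assertion. */
static int ctx_alloc(struct ctx_pool *p, bool recycle)
{
    if (recycle) {
        for (int i = 0; i < MAX_CTXS; i++) {
            if (!p->in_use[i]) {
                p->in_use[i] = true;
                return i;
            }
        }
        return -1;
    }
    /* Counter-only scheme: every new vCPU thread takes a fresh slot. */
    if (p->n_allocated >= MAX_CTXS) {
        return -1;
    }
    p->in_use[p->n_allocated] = true;
    return (int)p->n_allocated++;
}

/* Hot-unplug: only the recycling scheme gives the slot back. */
static void ctx_release(struct ctx_pool *p, int slot, bool recycle)
{
    if (recycle && slot >= 0) {
        p->in_use[slot] = false;
    }
}

/*
 * Boot with 4 vCPUs, then plug and unplug one vCPU `cycles` times.
 * Returns the number of cycles completed before allocation failed.
 */
static int run_hotplug_cycles(int cycles, bool recycle)
{
    struct ctx_pool pool = { 0 };

    for (int i = 0; i < 4; i++) {
        if (ctx_alloc(&pool, recycle) < 0) {
            return 0;
        }
    }
    for (int done = 0; done < cycles; done++) {
        int slot = ctx_alloc(&pool, recycle);
        if (slot < 0) {
            return done;
        }
        ctx_release(&pool, slot, recycle);
    }
    return cycles;
}
```

With the counter-only scheme, `run_hotplug_cycles(100, false)` gives up after 2 cycles even though no more than 5 vCPUs are ever present at once; with slot reuse, `run_hotplug_cycles(100, true)` completes all 100.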

For now, I've very quickly tried to enable and run TCG to regain the context.
I've now hit a different problem during the TCG vCPU unrealization phase: while
pthread_join() waits on the halt condition variable for the MTTCG vCPU thread
to exit, there is a crash somewhere. It looks like a race condition. I will dig
into this further.
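
For reference, the general shape of such a teardown (wake a thread that is sleeping on a condition variable, then join it) can be sketched with plain pthreads. This is only an illustration under assumptions, not QEMU's implementation and not a diagnosis of the crash: `struct vcpu`, `vcpu_thread_fn`, `vcpu_start` and `vcpu_unrealize` are hypothetical names.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical vCPU object; not QEMU's CPUState. */
struct vcpu {
    pthread_t thread;
    pthread_mutex_t lock;
    pthread_cond_t halt_cond;   /* the thread sleeps here while idle */
    bool unplug;                /* set by the unrealize path */
    bool exited;                /* set by the thread just before it returns */
};

static void *vcpu_thread_fn(void *opaque)
{
    struct vcpu *cpu = opaque;

    pthread_mutex_lock(&cpu->lock);
    /* Re-check the predicate in a loop: wakeups may be spurious. */
    while (!cpu->unplug) {
        pthread_cond_wait(&cpu->halt_cond, &cpu->lock);
    }
    cpu->exited = true;
    pthread_mutex_unlock(&cpu->lock);
    return NULL;
}

static int vcpu_start(struct vcpu *cpu)
{
    cpu->unplug = false;
    cpu->exited = false;
    pthread_mutex_init(&cpu->lock, NULL);
    pthread_cond_init(&cpu->halt_cond, NULL);
    return pthread_create(&cpu->thread, NULL, vcpu_thread_fn, cpu);
}

/* Unrealize: request exit under the lock, wake the thread, then join. */
static int vcpu_unrealize(struct vcpu *cpu)
{
    int ret;

    pthread_mutex_lock(&cpu->lock);
    cpu->unplug = true;
    pthread_cond_broadcast(&cpu->halt_cond);
    pthread_mutex_unlock(&cpu->lock);

    /* Nothing the thread still touches may be freed before this returns. */
    ret = pthread_join(cpu->thread, NULL);
    if (ret == 0) {
        pthread_cond_destroy(&cpu->halt_cond);
        pthread_mutex_destroy(&cpu->lock);
    }
    return ret;
}
```

The ordering is the point of the sketch: the exit flag is set under the same mutex the thread holds while checking it, and the mutex and condition variable are destroyed only after pthread_join() has returned.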
 

Best regards
Salil.

>  > @ Feb 2023, [Linaro-open-discussions] Re: Qemu TCG support for
>  > virtual-cpuhotplug/online-policy
>  >
>  > https://op-lists.linaro.org/archives/list/linaro-open-discussions@op-lists.linaro.org/message/GMDFTEZE6WUUI7LZAYOWLXFHAPXLCND5/
>  >
>  > The last status reported by Miguel was that there was a problem with TCG
>  > and he intended to fix it. He was on paternity leave, so I will try to
>  > gather the exact status of TCG today.
>  >
>  > Thanks
>  > Salil
>  >
>  >
>  >>
>  >>  --
>  >>  Alex Bennée
>  >>  Virtualisation Tech Lead @ Linaro
>  
>  --
>  Alex Bennée
>  Virtualisation Tech Lead @ Linaro
