Hi Ali,

On 3/10/25 5:23 PM, Alireza Sanaee wrote:
> Specifying the cache layout in virtual machines is useful for
> applications and operating systems to fetch accurate information about
> the cache structure and make appropriate adjustments. Enforcing correct
> sharing information can lead to better optimizations. This patch enables
> the specification of cache layout through a command line parameter,
> building on a patch set by Intel [1,2,3]. It uses this set as a
Some of the dependencies have been merged, so the series no longer applies.
> foundation.  The device tree and ACPI/PPTT table, and device tree are
> populated based on user-provided information and CPU topology.
This last sentence needs some rewording ("device tree" appears twice).
>
> Example:
>
>
> +----------------+                            +----------------+
> |    Socket 0    |                            |    Socket 1    |
> |    (L3 Cache)  |                            |    (L3 Cache)  |
> +--------+-------+                            +--------+-------+
>          |                                             |
> +--------+--------+                            +--------+--------+
> |   Cluster 0     |                            |   Cluster 0     |
> |   (L2 Cache)    |                            |   (L2 Cache)    |
> +--------+--------+                            +--------+--------+
>          |                                             |
> +--------+--------+  +--------+--------+    +--------+--------+  +--------+----+
> |   Core 0         | |   Core 1        |    |   Core 0        |  |   Core 1    |
> |   (L1i, L1d)     | |   (L1i, L1d)    |    |   (L1i, L1d)    |  |   (L1i, L1d)|
> +--------+--------+  +--------+--------+    +--------+--------+  +--------+----+
>          |                   |                       |                   |
> +--------+              +--------+              +--------+          +--------+
> |Thread 0|              |Thread 1|              |Thread 1|          |Thread 0|
> +--------+              +--------+              +--------+          +--------+
> |Thread 1|              |Thread 0|              |Thread 0|          |Thread 1|
> +--------+              +--------+              +--------+          +--------+
>
>
> The following command will represent the system relying on **ACPI PPTT tables**.
>
> ./qemu-system-aarch64 \
>  -machine virt,smp-cache.0.cache=l1i,smp-cache.0.topology=core,smp-cache.1.cache=l1d,smp-cache.1.topology=core,smp-cache.2.cache=l2,smp-cache.2.topology=cluseter,smp-cache.3.cache=l3,smp-cache.3.topology=socket \
s/cluseter/cluster (the same typo appears in the device tree command below)
>  -cpu max \
>  -m 2048 \
>  -smp sockets=2,clusters=1,cores=2,threads=2 \
>  -kernel ./Image.gz \
>  -append "console=ttyAMA0 root=/dev/ram rdinit=/init acpi=force" \
>  -initrd rootfs.cpio.gz \
>  -bios ./edk2-aarch64-code.fd \
>  -nographic
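As a side note, the long -machine string is easy to mistype. A minimal
sketch of generating it instead, assuming a hypothetical helper named
build_smp_cache (it is not part of QEMU; only the smp-cache.N.cache and
smp-cache.N.topology property names come from your example):

```shell
#!/bin/sh
# Hypothetical helper, not part of QEMU: build the -machine value from
# "cache:topology" pairs so that each property name is typed only once.
build_smp_cache() {
    opts=""
    i=0
    for pair in "$@"; do
        cache=${pair%%:*}   # text before the first ':'
        topo=${pair#*:}     # text after the first ':'
        opts="${opts},smp-cache.${i}.cache=${cache},smp-cache.${i}.topology=${topo}"
        i=$((i + 1))
    done
    printf '%s\n' "virt${opts}"
}

# Same layout as in the example: L1i/L1d per core, L2 per cluster, L3 per socket.
MACHINE=$(build_smp_cache l1i:core l1d:core l2:cluster l3:socket)
echo "$MACHINE"
```

The result can then be passed as -machine "$MACHINE" in either command.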
>
> The following command will represent the system relying on **the device tree**.
>
> ./qemu-system-aarch64 \
>  -machine virt,smp-cache.0.cache=l1i,smp-cache.0.topology=core,smp-cache.1.cache=l1d,smp-cache.1.topology=core,smp-cache.2.cache=l2,smp-cache.2.topology=cluseter,smp-cache.3.cache=l3,smp-cache.3.topology=socket \
>  -cpu max \
>  -m 2048 \
>  -smp sockets=2,clusters=1,cores=2,threads=2 \
>  -kernel ./Image.gz \
>  -append "console=ttyAMA0 root=/dev/ram rdinit=/init acpi=off" \
>  -initrd rootfs.cpio.gz \
>  -nographic
>
> Failure cases:
>     1) There are scenarios where caches exist in systems' registers but
>     left unspecified by users. In this case qemu returns failure.
Can you give more details on 1)? Is it specific to TCG, or does it also
occur with KVM acceleration?
>
>     2) SMT threads cannot share caches which is not very common. More
>     discussions here [4].
>
> Currently only three levels of caches are supported to be specified from
> the command line. However, increasing the value does not require
> significant changes. Further, this patch assumes l2 and l3 unified
> caches and does not allow l(2/3)(i/d). The level terminology is
> thread/core/cluster/socket right now. Hierarchy assumed in this patch:
> Socket level = Cluster level + 1 = Core level + 2 = Thread level + 3;
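To check that I read the hierarchy correctly, here is a minimal sketch of
the ordering described above. The function name topo_level and the choice
of 0 for the thread level are my assumptions, not something taken from the
patch; only the relative offsets come from your description.

```shell
#!/bin/sh
# Hypothetical mapping from topology name to a numeric level.
# Assumption: thread is level 0; the cover letter only fixes the offsets
# (core = thread + 1, cluster = thread + 2, socket = thread + 3).
topo_level() {
    case "$1" in
        thread)  echo 0 ;;
        core)    echo 1 ;;
        cluster) echo 2 ;;
        socket)  echo 3 ;;
        *)       echo "unknown topology level: $1" >&2; return 1 ;;
    esac
}

# Print each level next to its name.
for name in thread core cluster socket; do
    echo "$name -> $(topo_level "$name")"
done
```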
>
> TODO:
>   1) Making the code to work with arbitrary levels
>   2) Separated data and instruction cache at L2 and L3.
>   3) Additional cache controls.  e.g. size of L3 may not want to just
>   match the underlying system, because only some of the associated host
>   CPUs may be bound to this VM.
Does this mean the series is more of an RFC, or do you plan to send
follow-up patches once it is upstream?

Thanks

Eric
>
> [1] https://lore.kernel.org/kvm/20240908125920.1160236-1-zhao1....@intel.com/
> [2] https://lore.kernel.org/qemu-devel/20241101083331.340178-1-zhao1....@intel.com/
> [3] https://lore.kernel.org/qemu-devel/20250110145115.1574345-1-zhao1....@intel.com/
> [4] https://lore.kernel.org/devicetree-spec/20250203120527.3534-1-alireza.san...@huawei.com/
>
> Change Log:
>   v7->v8:
>    * rebase: Merge tag 'pull-nbd-2024-08-26' of https://repo.or.cz/qemu/ericb into staging
>    * I mis-included a file in patch #4 and I removed it in this one.
>
>   v6->v7:
>    * Intel stuff got pulled up, so rebase.
>    * added some discussions on device tree.
>
>   v5->v6:
>    * Minor bug fix.
>    * rebase based on new Intel patchset.
>      - https://lore.kernel.org/qemu-devel/20250110145115.1574345-1-zhao1....@intel.com/
>
>   v4->v5:
>     * Added Reviewed-by tags.
>     * Applied some comments.
>
>   v3->v4:
>     * Device tree added.
>
> Depends-on: Building PPTT with root node and identical implementation flag
> Depends-on: Msg-id: 20250306023342.508-1-alireza.san...@huawei.com
>
> Alireza Sanaee (6):
>   target/arm/tcg: increase cache level for cpu=max
>   arm/virt.c: add cache hierarchy to device tree
>   bios-tables-test: prepare to change ARM ACPI virt PPTT
>   hw/acpi/aml-build.c: add cache hierarchy to pptt table
>   tests/qtest/bios-table-test: testing new ARM ACPI PPTT topology
>   Update the ACPI tables according to the acpi aml_build change, also
>     empty bios-tables-test-allowed-diff.h.
>
>  hw/acpi/aml-build.c                        | 205 +++++++++++-
>  hw/arm/virt-acpi-build.c                   |   8 +-
>  hw/arm/virt.c                              | 350 +++++++++++++++++++++
>  hw/cpu/core.c                              |  92 ++++++
>  hw/loongarch/virt-acpi-build.c             |   2 +-
>  include/hw/acpi/aml-build.h                |   4 +-
>  include/hw/arm/virt.h                      |   4 +
>  include/hw/cpu/core.h                      |  27 ++
>  target/arm/tcg/cpu64.c                     |  13 +
>  tests/data/acpi/aarch64/virt/PPTT.topology | Bin 356 -> 540 bytes
>  tests/qtest/bios-tables-test.c             |   4 +
>  11 files changed, 701 insertions(+), 8 deletions(-)
>