On Thu, Jun 12, 2025 at 02:43:34PM +0100, Jonathan Cameron wrote:
> v15:
>   - Split the address map calculations and mmio setup into separate
>     functions in patch 2, allowing v14 patch 3 to be dropped as now
>     x86 and arm make the same calls.  Note I felt this was a sufficient
>     change to trigger dropping tags. (Zhijian Li)
>   - A few other minor tweaks.
>   - TLB issue mentioned in v14 now fixed upstream so dropped reference
>     in this cover letter.
> 
> Thanks to Itaru Kitayama and Zhijian Li for testing + reviews.
> 
> Updated cover letter
> 
> Back in 2022, this series stalled on the absence of a solution to device
> tree support for PCI Expander Bridges (PXB) and we ended up only having
> x86 support upstream. I've been carrying the arm64 support out of tree
> since then, with occasional nasty surprises (e.g. UNIMP + DT issue seen
> a few weeks ago) and a fair number of fiddly rebases.
> gitlab.com/jic23/qemu cxl-<latest date>.  Will update shortly with this
> series.
> 
> A recent discussion with Peter Maydell indicated that there are various
> other ACPI only features now, so in general he might be more relaxed
> about DT support being necessary. The upcoming vSMMUv3 support would
> run into this problem as well.
> 
> I presented the background to the PXB issue at Linaro connect 2022. In
> short the issue is that PXBs steal MMIO space from the main PCI root
> bridge. The challenge is knowing how much to steal.
> 
> On ACPI platforms, we can rely on EDK2 to perform enumeration and
> configuration of the PCI topology, and QEMU can update the ACPI tables
> after EDK2 has done this, as it can simply read the space used by the
> root ports. On device tree, there is no entity to perform that
> enumeration, so we don't know how to size the stolen region.
> 
> Three approaches were discussed:
> 1) Enumerating in QEMU. Horribly complex, and the last thing we want is a
>    3rd enumeration implementation that ends up out of sync with EDK2 and
>    the kernel (there are frequent issues because of how those existing
>    implementations differ).
> 2) Figure out how to enumerate in the kernel. I never put a huge amount of
>    work into this, but it seemed likely to involve a nasty dance with very
>    specific code similar to what EDK2 is carrying, and it would be very
>    challenging to upstream (given the lack of clarity on real use cases
>    for PXBs and DT).
> 3) Hack it based on the control we have which is bus numbers.
>    No one liked this but it worked :)
> 
> The other little wrinkle would be the need to define full bindings for CXL
> on DT + implement a fairly complex kernel stack, as the equivalent in ACPI
> involves a static table (CEDT), new runtime queries via _DSM and a description
> of various components. Doable, but so far there is no interest on physical
> platforms. Worth noting that for now, the QEMU CXL emulation is all about
> testing and developing the OS stack, not about virtualization (performance
> is terrible except in some very contrived situations!)
> 
> There is only a very simple test in here, because my intent is not to
> duplicate what we have on x86, but just to do a smoke test that everything
> is hooked up.  In general we need much more comprehensive end to end CXL
> tests but that requires a reasonably stable guest software stack. A few
> people have expressed interest in working on that, but we aren't there yet.
> 
> Note that this series has a very different use case to that in the proposed
> SBSA-ref support:
> https://lore.kernel.org/qemu-devel/20250117034343.26356-1-wangyuquan1...@phytium.com.cn/
> 
> SBSA-ref is a good choice if you want a relatively simple mostly fixed
> configuration.  That works well with the limited host system
> discoverability etc as EDK2 can be build against a known configuration.
> 
> My interest with this support in arm/virt is to support host software stack
> development (we have a wide range of contributors, most of whom are working
> on emulation + the kernel support). I care about the weird corners. As such
> I need to be able to bring up variable numbers of host bridges, multiple CXL
> Fixed Memory Windows with varying characteristics (interleave etc), complex
> NUMA topologies with weird performance characteristics etc. We can do that
> on x86 upstream today, or my gitlab tree. Note that we need arm support
> for some arch specific features in the near future (cache flushing).
> Doing kernel development with this need for flexibility on SBSA-ref is not
> currently practical. SBSA-ref CXL support is an excellent thing, just
> not much use to me for this work.
> 
> Also, we are kicking off some work on DCD virtualization, particularly to
> support inter-host shared memory being presented up into a VM. That
> will need upstream support on arm64 as it is built on top of the existing
> CXL emulation to avoid the need for a separate guest software stack.
> 
> Note this is TCG only - it is possible to support limited use with KVM but
> that needs additional patches not yet ready for upstream.  The challenge
> is interleave - and the solution is don't interleave if you want to run
> with KVM.
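
For anyone wanting to reproduce a minimal setup, an invocation along these
lines should be close (a sketch only: the device options are taken from the
documented x86 q35 CXL syntax in docs/system/devices/cxl.rst, and the
`-machine virt,cxl=on` spelling plus the aarch64 firmware/disk paths are my
assumptions for this series, not something confirmed by the cover letter).
It sets up one PXB-CXL host bridge, one root port, one pmem type 3 device
and a single non-interleaved fixed memory window:

```shell
# Hypothetical sketch: single host bridge, single 4G CXL fixed memory
# window, no interleave (so it also stays within what KVM could tolerate).
qemu-system-aarch64 -M virt,cxl=on -cpu max -smp 4 \
    -m 4G,maxmem=8G,slots=8 \
    -object memory-backend-file,id=cxl-mem0,share=on,mem-path=/tmp/cxltest.raw,size=256M \
    -object memory-backend-file,id=cxl-lsa0,share=on,mem-path=/tmp/lsa0.raw,size=256M \
    -device pxb-cxl,bus_nr=12,bus=pcie.0,id=cxl.1 \
    -device cxl-rp,port=0,bus=cxl.1,id=root_port13,chassis=0,slot=2 \
    -device cxl-type3,bus=root_port13,persistent-memdev=cxl-mem0,lsa=cxl-lsa0,id=cxl-pmem0 \
    -M cxl-fmw.0.targets.0=cxl.1,cxl-fmw.0.size=4G
```

Adding more host bridges / windows with interleave is then a matter of more
pxb-cxl instances and cxl-fmw.N entries, per the existing documentation.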

One of the ndctl:cxl tests fails (other tests ran ok):

# meson test cxl-region-sysfs.sh
ninja: Entering directory `/root/ndctl/build'
[1/55] Generating version.h with a custom command
[  706.564783][ T2080] calling  cxl_port_init+0x0/0xfe0 [cxl_port] @ 2080
[  706.566861][ T2080] initcall cxl_port_init+0x0/0xfe0 [cxl_port] returned 0 after 1735 usecs
[  706.586457][ T2080] calling  cxl_acpi_init+0x0/0xfe0 [cxl_acpi] @ 2080
[  706.625690][ T2080] probe of port1 returned 0 after 25381 usecs
[  706.626634][ T2080]  pci0000:bf: host supports CXL
[  706.653573][ T2080] probe of port2 returned 0 after 25631 usecs
[  706.655164][ T2080]  pci0000:35: host supports CXL
[  706.662409][ T2080] probe of ACPI0017:00 returned 0 after 74464 usecs
[  706.663150][ T2080] initcall cxl_acpi_init+0x0/0xfe0 [cxl_acpi] returned 0 after 76306 usecs
[  706.690482][ T2080] calling  cxl_pmem_init+0x0/0xfd0 [cxl_pmem] @ 2080
[  706.695324][ T2080] probe of ndbus0 returned 0 after 1496 usecs
[  706.699217][ T2080] probe of nvdimm-bridge0 returned 0 after 6705 usecs
[  706.702372][ T2080] initcall cxl_pmem_init+0x0/0xfd0 [cxl_pmem] returned 0 after 11576 usecs
[  706.717668][ T2080] calling  cxl_mem_driver_init+0x0/0xfe0 [cxl_mem] @ 2080
[  706.758561][ T2080] probe of port3 returned 0 after 34188 usecs
[  706.767080][ T2080] cxl_nvdimm pmem11: GPF: could not set dirty shutdown state
[  706.779392][ T2080] probe of nmem0 returned 0 after 1083 usecs
[  706.782181][ T2080] probe of pmem11 returned 0 after 15516 usecs
[  706.826941][ T2080] probe of endpoint4 returned 0 after 42630 usecs
[  706.827987][ T2080] probe of mem11 returned 0 after 108475 usecs
[  706.878052][ T2080] probe of port5 returned 0 after 41354 usecs
[  706.938260][ T2080] probe of endpoint6 returned 0 after 46831 usecs
[  706.939223][ T2080] probe of mem12 returned 0 after 105104 usecs
[  706.994611][ T2080] probe of endpoint7 returned 0 after 49337 usecs
[  706.995790][ T2080] probe of mem13 returned 0 after 53632 usecs
[  707.004334][ T2080] cxl_nvdimm pmem14: GPF: could not set dirty shutdown state
[  707.017782][ T2080] probe of nmem1 returned 0 after 1115 usecs
[  707.019324][ T2080] probe of pmem14 returned 0 after 14920 usecs
[  707.072148][ T2080] probe of endpoint8 returned 0 after 50887 usecs
[  707.073367][ T2080] probe of mem14 returned 0 after 71279 usecs
[  707.079062][ T2080] initcall cxl_mem_driver_init+0x0/0xfe0 [cxl_mem] returned 0 after 361073 usecs
[  707.111533][ T2080] calling  cxl_test_init+0x0/0xc88 [cxl_test] @ 2080
[  708.001403][ T2080] platform cxl_host_bridge.0: Unsupported platform config, mixed Virtual Host and Restricted CXL Host hierarchy.
[  708.002032][ T2080] platform cxl_host_bridge.1: Unsupported platform config, mixed Virtual Host and Restricted CXL Host hierarchy.
[  708.010988][ T2080] platform cxl_host_bridge.2: Unsupported platform config, mixed Virtual Host and Restricted CXL Host hierarchy.
[  708.011963][ T2080] platform cxl_host_bridge.3: Unsupported platform config, mixed Virtual Host and Restricted CXL Host hierarchy.
[  708.034604][ T2080] platform cxl_host_bridge.0: Unsupported platform config, mixed Virtual Host and Restricted CXL Host hierarchy.
[  708.056775][ T2080] probe of port10 returned 0 after 20555 usecs
[  708.057814][ T2080] platform cxl_host_bridge.0: host supports CXL
[  708.062226][ T2080] platform cxl_host_bridge.1: Unsupported platform config, mixed Virtual Host and Restricted CXL Host hierarchy.
[  708.081857][ T2080] probe of port11 returned 0 after 18696 usecs
[  708.085821][ T2080] platform cxl_host_bridge.1: host supports CXL
[  708.086496][ T2080] platform cxl_host_bridge.2: Unsupported platform config, mixed Virtual Host and Restricted CXL Host hierarchy.
[  708.538268][ T2080] probe of port12 returned 0 after 450064 usecs
[  708.563248][ T2080] platform cxl_host_bridge.2: host supports CXL
[  708.563875][ T2080] platform cxl_host_bridge.3: host supports CXL (restricted)
[  708.803640][ T2080] probe of ndbus1 returned 0 after 87373 usecs
[  708.817241][ T2080] probe of nvdimm-bridge1 returned 0 after 182992 usecs
[  708.839172][ T2080] probe of cxl_acpi.0 returned 0 after 843615 usecs
[  709.026867][  T503] cxl_mock_mem cxl_mem.0: CXL MCE unsupported
[  709.240435][  T502] cxl_mock_mem cxl_mem.1: CXL MCE unsupported
[  709.263055][  T499] cxl_mock_mem cxl_mem.2: CXL MCE unsupported
[  709.317257][  T503] probe of port13 returned 0 after 57147 usecs
[  709.442524][  T498] cxl_mock_mem cxl_mem.3: CXL MCE unsupported
[  709.495789][  T499] probe of port15 returned 0 after 57022 usecs
[  709.538513][ T1514] cxl_mock_mem cxl_mem.5: CXL MCE unsupported
[  709.553021][  T503] probe of nmem3 returned 0 after 12329 usecs
[  709.555876][  T503] probe of pmem0 returned 0 after 54954 usecs
[  709.567823][  T499] probe of nmem2 returned 0 after 27577 usecs
[  709.569359][  T499] probe of pmem2 returned 0 after 64505 usecs
[  709.603845][  T497] cxl_mock_mem cxl_mem.4: CXL MCE unsupported
[  709.626487][   T12] cxl_mock_mem cxl_mem.6: CXL MCE unsupported
[  709.639194][  T502] probe of port14 returned 0 after 255539 usecs
[  709.662671][  T503] probe of region2 returned 6 after 421 usecs
[  709.664855][  T503] cxl_mock_mem cxl_mem.0: Extended linear cache calculation failed rc:-2
[  709.694975][  T503] probe of endpoint16 returned 0 after 100427 usecs
[  709.698678][  T503] probe of mem0 returned 0 after 516964 usecs
[  709.752821][   T49] cxl_mock_mem cxl_mem.7: CXL MCE unsupported
[  709.782050][  T499] probe of endpoint17 returned 0 after 102692 usecs
[  709.814422][  T499] probe of mem2 returned 0 after 539843 usecs
[  709.859496][  T497] probe of nmem4 returned 0 after 74368 usecs
[  709.860064][  T497] probe of pmem5 returned 0 after 120134 usecs
[  709.862431][  T499] probe of cxl_mem.2 returned 0 after 686512 usecs
[  709.863290][  T503] probe of cxl_mem.0 returned 0 after 892734 usecs
[  709.870924][   T30] cxl_mock_mem cxl_mem.9: CXL MCE unsupported
[  709.876631][ T2080] initcall cxl_test_init+0x0/0xc88 [cxl_test] returned 0 after 2764528 usecs
[  709.886776][  T498] probe of port18 returned 0 after 168122 usecs
[  709.900934][  T500] cxl_mock_mem cxl_mem.8: CXL MCE unsupported
[  709.946498][ T1514] probe of nmem5 returned 0 after 9462 usecs
[  709.947381][ T1514] probe of pmem4 returned 0 after 14445 usecs
[  709.970022][   T12] probe of nmem6 returned 0 after 32603 usecs
[  709.978337][  T498] probe of nmem7 returned 0 after 35875 usecs
[  709.986071][  T498] probe of pmem3 returned 0 after 46583 usecs
[  710.010717][   T12] probe of pmem6 returned 0 after 75369 usecs
[  710.014337][  T501] cxl_mock_mem cxl_rcd.10: CXL MCE unsupported
[  710.040337][  T502] probe of nmem8 returned 0 after 29862 usecs
[  710.059653][  T502] probe of pmem1 returned 0 after 88935 usecs
[  710.079573][   T12] probe of endpoint23 returned 0 after 49198 usecs
[  710.097280][   T30] probe of port21 returned 0 after 120461 usecs
[  710.097393][   T49] probe of nmem9 returned 0 after 26437 usecs
[  710.101820][   T49] probe of pmem7 returned 0 after 36944 usecs
[  710.106202][   T12] probe of mem6 returned 0 after 452293 usecs
[  710.130816][   T12] probe of cxl_mem.6 returned 0 after 669975 usecs
[  710.133959][  T500] probe of nmem10 returned 0 after 12351 usecs
[  710.170179][  T500] probe of pmem9 returned 0 after 50316 usecs
[  710.183178][  T498] probe of endpoint22 returned 0 after 160579 usecs
[  710.208790][  T498] probe of mem3 returned 0 after 760551 usecs
[  710.212463][  T501] probe of endpoint24 returned 0 after 149792 usecs
[  710.236975][  T497] probe of dax2.0 returned 0 after 118646 usecs
[  710.240952][  T497] probe of dax_region2 returned 0 after 166940 usecs
[  710.242545][  T498] probe of cxl_mem.3 returned 0 after 917362 usecs
[  710.256278][   T30] probe of nmem11 returned 0 after 45572 usecs
[  710.257079][   T30] probe of pmem8 returned 0 after 105691 usecs
[  710.259332][  T501] probe of mem10 returned 0 after 218194 usecs
[  710.265545][  T497] probe of region2 returned 0 after 225563 usecs
[  710.269232][ T1514] probe of endpoint20 returned 0 after 320269 usecs
[  710.282299][  T497] probe of endpoint19 returned 0 after 421116 usecs
[  710.283234][  T497] probe of mem5 returned 0 after 642268 usecs
[  710.304734][ T1514] probe of mem4 returned 0 after 762768 usecs
[  710.322940][   T49] probe of endpoint26 returned 0 after 117975 usecs
[  710.324643][   T49] probe of mem7 returned 0 after 558032 usecs
[  710.336434][  T501] probe of cxl_rcd.10 returned 0 after 414904 usecs
[  710.339624][  T497] probe of cxl_mem.4 returned 0 after 957535 usecs
[  710.400496][ T1514] probe of cxl_mem.5 returned 0 after 996364 usecs
[  710.450320][   T49] probe of cxl_mem.7 returned 0 after 842769 usecs
[  710.648851][  T500] probe of endpoint25 returned 0 after 477479 usecs
[  710.659334][  T500] probe of mem9 returned 0 after 690185 usecs
[  710.706030][  T500] probe of cxl_mem.8 returned 0 after 917194 usecs
[  711.162896][   T30] probe of endpoint28 returned 0 after 493899 usecs
[  711.227336][   T30] probe of mem8 returned 0 after 1278462 usecs
[  711.325568][  T502] probe of endpoint27 returned 0 after 929403 usecs
[  711.356687][  T502] probe of mem1 returned 0 after 2101415 usecs
[  711.531055][   T30] probe of cxl_mem.9 returned 0 after 1707787 usecs
[  711.554073][  T502] probe of cxl_mem.1 returned 0 after 2425696 usecs
[  724.421245][ T2077] probe of region5 returned 6 after 262 usecs
1/1 ndctl:cxl / cxl-region-sysfs.sh        FAIL            18.22s   exit status 1
>>> TEST_PATH=/root/ndctl/build/test
>>> ASAN_OPTIONS=halt_on_error=1:abort_on_error=1:print_summary=1
>>> LD_LIBRARY_PATH=/root/ndctl/build/daxctl/lib:/root/ndctl/build/cxl/lib:/root/ndctl/build/ndctl/lib
>>> DAXCTL=/root/ndctl/build/daxctl/daxctl DATA_PATH=/root/ndctl/test
>>> NDCTL=/root/ndctl/build/ndctl/ndctl MESON_TEST_ITERATION=1
>>> UBSAN_OPTIONS=halt_on_error=1:abort_on_error=1:print_summary=1:print_stacktrace=1
>>> MSAN_OPTIONS=halt_on_error=1:abort_on_error=1:print_summary=1:print_stacktrace=1
>>> MALLOC_PERTURB_=123 /bin/bash /root/ndctl/test/cxl-region-sysfs.sh

The kernel (with the cxl_test kernel module) is built off of cxl/next,
which has Jonathan's fix for the cxl_test issue seen on arm64.
Could the experts take a look at this issue?

Thanks,
Itaru.

> 
> Jonathan Cameron (4):
>   hw/cxl-host: Add an index field to CXLFixedMemoryWindow
>   hw/cxl: Make the CXL fixed memory windows devices.
>   hw/arm/virt: Basic CXL enablement on pci_expander_bridge instances
>     pxb-cxl
>   qtest/cxl: Add aarch64 virt test for CXL
> 
>  include/hw/arm/virt.h     |   4 +
>  include/hw/cxl/cxl.h      |   5 +-
>  include/hw/cxl/cxl_host.h |   5 +-
>  hw/acpi/cxl.c             |  76 +++++++++--------
>  hw/arm/virt-acpi-build.c  |  34 ++++++++
>  hw/arm/virt.c             |  29 +++++++
>  hw/cxl/cxl-host-stubs.c   |   7 +-
>  hw/cxl/cxl-host.c         | 170 +++++++++++++++++++++++++++++++-------
>  hw/i386/pc.c              |  50 +++++------
>  tests/qtest/cxl-test.c    |  59 ++++++++++---
>  tests/qtest/meson.build   |   1 +
>  11 files changed, 330 insertions(+), 110 deletions(-)
> 
> -- 
> 2.48.1
> 
