> Before making additional modifications, tidy up this misleading indentation.
Thanks for fixing it.
Reviewed-by: Ankit Agrawal
>>
>> [1] ACPI Spec 6.3, Section 5.2.16.6
>> [2] ACPI Spec 6.3, Table 5.80
>>
>> Cc: Jonathan Cameron
>> Cc: Alex Williamson
>> Cc: Cedric Le Goater
>> Signed-off-by: Ankit Agrawal
>
> I guess we gloss over the bisection breakage due to
uch.
>>
>> Link: https://www.nvidia.com/en-in/technologies/multi-instance-gpu [1]
>> Cc: Jonathan Cameron
>> Cc: Alex Williamson
>> Cc: Markus Armbruster
>> Acked-by: Markus Armbruster
>> Signed-off-by: Ankit Agrawal
>
> Hi Ankit,
>
>
>> > [1] ACPI Spec 6.3, Section 5.2.16.6
>> > [2] ACPI Spec 6.3, Table 5.80
>> >
>> > Signed-off-by: Ankit Agrawal
>>
>> One thing I forgot.
> And another :)
>
> It might be nice to also support x86 from the start (apparently people still
> care about that old architecture)
>
>>
>> Jonathan, Alex, do you know how we may add tests that are dependent
>> on the vfio-pci device?
>>
>
> As Jonathan notes, we've decoupled this from vfio-pci; the pci-dev= arg
> can point to any PCI device. For example, any emulated PCI NIC could
> be a stand-in for the vfio-pci device used in
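A rough sketch of what that could look like on the command line, using an
emulated NIC in place of the passthrough device (the virtio-net-pci choice,
the node IDs, and the pci-dev=/host-nodes= property names quoted elsewhere in
this thread are assumptions, not final syntax):

-numa node,nodeid=2 -numa node,nodeid=3 \
-device virtio-net-pci,bus=pcie.0,addr=04.0,id=dev0 \
-object acpi-generic-initiator,id=gi0,pci-dev=dev0,host-nodes=2-3

That removes any dependency on a host= device being present on the machine
running the tests.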
>> >> Jonathan, Alex, do you know how we may add tests that are dependent
>> >> on the vfio-pci device?
>> >
>> > There are none.
>> >
>> > This would require a host device always available for passthrough and
>> > there is no simple solution for this problem. Such tests would need to
>> > run in a
>>> Please add a test. tests/qtest/bios-tables-test.c
>>> + relevant table dumps.
>>
>> Here I need to add a test that creates a vfio-pci device and NUMA
>> nodes and links them using the acpi-generic-initiator object. One thing
>> here is that the -device vfio-pci needs a host= argument. I
>> probably
> One thing I forgot.
>
> Please add a test. tests/qtest/bios-tables-test.c
> + relevant table dumps.
Here I need to add a test that creates a vfio-pci device and NUMA
nodes and links them using the acpi-generic-initiator object. One thing
here is that the -device vfio-pci needs a host= argument. I
probably
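One way around the host= problem is to use an emulated PCI device in the
test. A minimal sketch of such a test for tests/qtest/bios-tables-test.c
follows; test_data, test_acpi_one() and free_test_data() are the existing
helpers in that file, while the variant name, firmware paths and the GI
object properties here are assumptions for illustration:

static void test_acpi_virt_tcg_acpi_generic_initiator(void)
{
    test_data data = {
        .machine = "virt",
        .variant = ".acpigeninitiator",   /* assumed variant suffix */
        .tcg_only = true,
        .uefi_fl1 = "pc-bios/edk2-aarch64-code.fd",
        .uefi_fl2 = "pc-bios/edk2-arm-vars.fd",
        .cd = "tests/data/uefi-boot-images/bios-tables-test.aarch64.iso.qcow2",
        .ram_start = 0x40000000ULL,
        .scan_len = 128ULL * 1024 * 1024,
    };

    /* An emulated NIC stands in for the vfio-pci device */
    test_acpi_one("-cpu cortex-a57"
                  " -numa node,nodeid=0 -numa node,nodeid=1"
                  " -device virtio-net-pci,bus=pcie.0,addr=04.0,id=dev0"
                  " -object acpi-generic-initiator,id=gi0,pci-dev=dev0,host-nodes=1",
                  &data);
    free_test_data(&data);
}

It would then be registered from main() with qtest_add_func() under the
aarch64 branch, alongside the other virt machine tests.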
>> As for your suggestion of using acpi-dev as the arg to take both
>> pci-dev and acpi-dev: would that mean sending a pure PCI device
>> (not the corner case you mentioned) through the acpi-dev argument
>> as well? Not sure if that would be appropriate.
>
> Ah, looking back at it, my description was unclear.
>>
>> One more thing. Right now we can't use Generic Initiators as
>> HMAT initiators. That also wants fixing given that's their
>> normal use case rather than what you are using them for, so it
>> should 'work'.
>
> Something along the lines of this will do the job.
Thanks! Will incorporate the p
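For context, what using a GI node as an HMAT initiator would look like on the
command line, once it is allowed, is something along these lines (a sketch
using the existing -machine hmat=on and -numa hmat-lb syntax; node 2 is the
GI-only node, the GI object properties follow what is quoted elsewhere in
this thread, and the latency number is made up):

-machine virt,hmat=on \
-object memory-backend-ram,id=ram-node0,size=1G \
-numa node,nodeid=0,cpus=0-3,memdev=ram-node0 \
-numa node,nodeid=2 \
-object acpi-generic-initiator,id=gi0,pci-dev=dev0,host-nodes=2 \
-numa hmat-lb,initiator=2,target=0,hierarchy=memory,data-type=access-latency,latency=20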
>> >>> Jonathan, you pointed out interface design issues in your review of v2.
>> >> Are you fully satisfied with the interface in v3?
>> >>
>> >> Yes. I'm fine with the interface in this version (though it's v7, so I'm
>> >> lost
>> >> on v2 vs v3!)
>> >
>> > Looks like I can't count to 7!
>> >
>> > One thing I forgot.
>> >
>> > Please add a test. tests/qtest/bios-tables-test.c
>>
>> IIUC, we need to add a test for aarch64 to test the interface with the
>> acpi-generic-initiator object.
>>
>> > + relevant table dumps.
>>
>> Sorry, it isn't clear where you want me to add this. In the git commit
>> message?
>> ---
>> hw/acpi/acpi-generic-initiator.c | 84
>> hw/arm/virt-acpi-build.c | 3 +
>> include/hw/acpi/acpi-generic-initiator.h | 26
> A few more comments.
>
> Maybe _ rather than -, as is more common for ACPI include naming.
Ack. Will change.
> One thing I forgot.
>
> Please add a test. tests/qtest/bios-tables-test.c
IIUC, we need to add a test for aarch64 to test the interface with the
acpi-generic-initiator object.
> + relevant table dumps.
Sorry, it isn't clear where you want me to add this. In the git commit
message?
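For reference, the usual expected-blob flow for bios-tables-test
(paraphrasing the comment at the top of tests/qtest/bios-tables-test.c, so
treat the details as approximate) is:
1. List the new expected file names (e.g. tests/data/acpi/virt/SRAT.acpigeninitiator,
   if the test uses an ".acpigeninitiator" variant) in
   tests/qtest/bios-tables-test-allowed-diff.h and commit empty placeholder files.
2. Add the test case itself.
3. Run the test, regenerate the binaries with tests/data/acpi/rebuild-expected-aml.sh,
   commit the updated blobs, and empty bios-tables-test-allowed-diff.h again.
The disassembled SRAT dump requested elsewhere in the thread is separate; that
one goes in the cover letter / commit message rather than in the test itself.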
>>> Jonathan, you pointed out interface design issues in your review of v2.
>> Are you fully satisfied with the interface in v3?
>>
>> Yes. I'm fine with the interface in this version (though it's v7, so I'm lost
>> on v2 vs v3!)
>
> Looks like I can't count to 7!
>
> With NUMA capitalized in the
>> diff --git a/include/hw/acpi/acpi-generic-initiator.h b/include/hw/acpi/acpi-generic-initiator.h
>> new file mode 100644
>> index 00..2f183b029a
>> --- /dev/null
>> +++ b/include/hw/acpi/acpi-generic-initiator.h
>
>> +typedef struct AcpiGenericInitiatorClass {
>> + ObjectClass
>> [1] ACPI Spec 6.3, Section 5.2.16.6
>> [2] ACPI Spec 6.3, Table 5.80
>>
>> Signed-off-by: Ankit Agrawal
> Hi Ankit,
>
> As the code stands the use of a list seems overkill.
Yeah, I will try out your suggestion.
> Otherwise looks good to me. I need Generic Ports support for CXL
>> >
>> > I'd find it helpful to see the resulting chunk of SRAT for these examples
>> > (disassembled) in this cover letter and the patches (where there are more
>> > examples).
>>
>> Ack. I'll document the resulting SRAT table as well.
>
> Still didn't happen so this is dropped for now.
Hi Mich
>> >
>> > Given that, an alternative proposal that I think would work
>> > for you would be to add a 'placeholder' memory node definition
>> > in SRAT (so allow 0 size explicitly - might need a new SRAT
>> > entry to avoid backwards compat issues).
>>
>> Putting all the PCI/GI/... complexity aside,
>> > However, I'll leave it up to those more familiar with the QEMU numa
>> > control interface design to comment on whether this approach is preferable
>> > to making the gi part of the numa node entry or doing it like hmat.
>>
>> > -numa srat-gi,node-id=10,gi-pci-dev=dev1
>>
>> The current way o
>> +##
>> +# @AcpiGenericInitiatorProperties:
>> +#
>> +# Properties for acpi-generic-initiator objects.
>> +#
>> +# @pci-dev: PCI device ID to be associated with the node
>> +#
>> +# @host-nodes: numa node list associated with the PCI device.
>
> NUMA
>
> Suggest "list of NUMA nodes associated with the PCI device"
Had a discussion with RH folks; summary follows:
1. To align with the current spec description pointed out by Jonathan, we first do
a separate object instance per GI node as suggested by Jonathan, i.e.
an acpi-generic-initiator would only link one node to the device. To
associate a set
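Under that scheme, associating one device with a set of nodes would simply
repeat the object, something like the following (a sketch; the single 'node'
property name is an assumption based on the one-node-per-object point above):

-object acpi-generic-initiator,id=gi0,pci-dev=dev0,node=2 \
-object acpi-generic-initiator,id=gi1,pci-dev=dev0,node=3 \
-object acpi-generic-initiator,id=gi2,pci-dev=dev0,node=4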
>> However, I'll leave it up to those more familiar with the QEMU numa
>> control interface design to comment on whether this approach is preferable
>> to making the gi part of the numa node entry or doing it like hmat.
>> -numa srat-gi,node-id=10,gi-pci-dev=dev1
>
> The current way of acpi-generic
Thanks Jonathan for the review.
> As per reply to the cover letter I definitely want to see SRAT table dumps
> in here though so we can easily see what this is actually building.
Ack.
> I worry that some OS might make the assumption that it's one GI node
> per PCI device though. The language in
>>
>> -numa node,nodeid=2 -numa node,nodeid=3 -numa node,nodeid=4 \
>> -numa node,nodeid=5 -numa node,nodeid=6 -numa node,nodeid=7 \
>> -numa node,nodeid=8 -numa node,nodeid=9 \
>> -device vfio-pci-nohotplug,host=0009:01:00.0,bus=pcie.0,addr=04.0,rombar=0,id=dev0 \
>> -object acpi-generic-init
Thanks Markus for the review.
>> Introduce a new acpi-generic-initiator object to allow the host admin to provide
>> the device and the corresponding NUMA nodes. QEMU maintains this association and
>> uses this object to build the requisite GI Affinity Structure.
>
> Pardon my ignorance... What makes this
>> +    for (l = gi->nodelist; l; l = l->next) {
>> +        PCIDeviceHandle dev_handle = {0};
>> +        PCIDevice *pci_dev = PCI_DEVICE(o);
>> +        dev_handle.bdf = PCI_BUILD_BDF(pci_bus_num(pci_get_bus(pci_dev)),
>> +                                       pci_dev->devfn);
>> An admin can provide the range of nodes using numa-node-start and
>> numa-node-count and link it to a device by providing its id. The following
>> sample creates 8 nodes and links them to the device dev0:
>>
>> -numa node,nodeid=2 \
>> -numa node,nodeid=3 \
>> -numa node,
>> -device vfio-pci-nohotplug,host=0009:01:00.0,bus=pcie.0,addr=04.0,rombar=0,id=dev0 \
>> -object nvidia-acpi-generic-initiator,id=gi0,device=dev0,numa-node-start=2,numa-node-count=8
>
> Why didn't we just implement start and count in the base object (or a
> list)? It seems l
>> +static void build_srat_generic_initiator_affinity(GArray *table_data, int node,
>> +                                                   PCIDeviceHandle *handle,
>> +                                                   GenericAffinityFlags flags)
>> +{
>> +    build_append_int_noprefix(table_
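The quoted builder is cut off above. For reference, a minimal sketch of the
32-byte Generic Initiator Affinity Structure from ACPI 6.3 Table 5.80, using
the signature quoted above (build_append_int_noprefix() is the existing
helper in hw/acpi/aml-build.c; the PCIDeviceHandle field names are assumed,
not taken verbatim from the patch):

static void
build_srat_generic_initiator_affinity(GArray *table_data, int node,
                                      PCIDeviceHandle *handle,
                                      GenericAffinityFlags flags)
{
    int i;

    build_append_int_noprefix(table_data, 5, 1);    /* Type: Generic Initiator */
    build_append_int_noprefix(table_data, 32, 1);   /* Length */
    build_append_int_noprefix(table_data, 0, 1);    /* Reserved */
    build_append_int_noprefix(table_data, 1, 1);    /* Device Handle Type: PCI */
    build_append_int_noprefix(table_data, node, 4); /* Proximity Domain */
    /* Device Handle: PCI segment and BDF, remaining 12 bytes reserved */
    build_append_int_noprefix(table_data, handle->segment, 2);
    build_append_int_noprefix(table_data, handle->bdf, 2);
    for (i = 0; i < 12; i++) {
        build_append_int_noprefix(table_data, 0, 1);
    }
    build_append_int_noprefix(table_data, flags, 4); /* Flags */
    build_append_int_noprefix(table_data, 0, 4);     /* Reserved */
}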
>> +static void acpi_generic_initiator_set_node(Object *obj, Visitor *v,
>> +                                            const char *name, void *opaque,
>> +                                            Error **errp)
>> +{
>> +    AcpiGenericInitiator *gi = ACPI_GENERIC_INITIATOR(obj);
>> +    uint32
> >
> > Based on the suggestions here, can we consider something like the
> > following?
> > 1. Introduce a new -numa subparam 'devnode', which tells QEMU to mark
> > the node with MEM_AFFINITY_HOTPLUGGABLE in the SRAT's memory affinity
> > structure to make it hotpluggable.
>
> Is that "devnode=on
>> Ok. I'd failed to register that the bare metal code was doing this, though
>> with hindsight I guess that is obvious. Though without more info or
>> a bare metal example being given, it's also possible the BIOS was doing
>> enumeration etc (like CXL does for all < 2.0 devices) and hence was
>> buil
> With an ACPI spec clarification agreed, I'm fine with
> using this for all the cases that have come up in this thread.
> Or a good argument that this is valid under the existing ACPI spec.
Hi Jonathan,
I looked at Section 5.2.16 in the ACPI spec, but could not see
any mention of whether s
Good idea. Fundamentally the device should not be creating NUMA
nodes; the VM should be configured with NUMA nodes and the device
memory associated with those nodes.
>>>
>>> +1. That would also make it fly with DIMMs and virtio-mem, where you
>>> would want NUMA-less nodes as well
> >>
> >> Typically, a command line for a virt machine with NUMA nodes would
> >> look like :
> >>
> >> -object memory-backend-ram,id=ram-node0,size=1G \
> >> -numa node,nodeid=0,memdev=ram-node0 \
> >> -object memory-backend-ram,id=ram-node1,size=1G \
> >> -numa node,nodeid=1,cpus=
Hi Jonathan
> > +    if (pcidev->pdev.has_coherent_memory) {
> > +        uint64_t start_node = object_property_get_uint(obj,
> > +                                  "dev_mem_pxm_start", &error_abort);
> > +        uint64_t node_count = object_property_get_uint(obj,
> > +
> Also, good to say why multiple nodes per device are needed.
This is to support the GPU's MIG (Multi-Instance GPU) feature
(https://www.nvidia.com/en-in/technologies/multi-instance-gpu/), which
allows partitioning of the GPU device resources (including device memory) into
several isolated instances