On Mon, 06 Mar 2023 16:50:29 +0000 David Woodhouse <dw...@infradead.org> wrote:
> On Mon, 2023-03-06 at 23:39 +0700, Bui Quang Minh wrote:
> > On 3/6/23 22:51, David Woodhouse wrote:
> > > On Mon, 2023-03-06 at 11:43 +0100, Igor Mammedov wrote:
> > > > > However, there are still problems while trying to extending support to
> > > > > APIC ID larger than 255 because there are many places assume APIC ID is
> > > > > 8-bit long.
> > > >
> > > > that's what I was concerned about (i.e. just enabling x2apic without fixing
> > > > with all code that just assumes 8bit apicid).
> > >
> > > Even before you extend the physical APIC IDs past 254 or 255, there's
> > > still the issue that 255 isn't a broadcast in X2APIC mode. And that
> > > *logical* addressing will use more than 8 bits even when the physical
> > > IDs are lower.
> > >
> > > > > One of that is interrupt remapping which returns 32-bit
> > > > > destination ID but uses MSI (which has 8-bit destination) to send to
> > > > > APIC. I will look more into this.
> > >
> > > The translated 'output' from the interrupt remapping doesn't "use MSI",
> > > in the sense of a write transaction which happens to go to addresses in
> > > the 0x00000000FEExxxxx range. The *input* to interrupt remapping comes
> > > from that intercept.
> > >
> > > The output is already "known" to be an MSI message, with a full 64-bit
> > > address and 32-bit data, and the KVM API puts the high 24 bits of the
> > > target APIC ID into the high word of the address.
> > >
> > > If you look at the "generic" x86_iommu_irq_to_msi_message() in
> > > hw/intc/x86-iommu.c you'll see it's also using the same trick:
> > >
> > >     msg.__addr_hi = irq->dest & 0xffffff00;
> >
> > Yeah, I see that trick too, with this hunk and temporarily disable
> > broadcast destination id 0xff in physical mode, I am able to boot Linux
> > with 255 CPUs (I can't see how to use few CPUs but configure with high
> > APIC ID)
>
> I never worked out how to explicitly assign high APIC IDs but you can
> at least spread them out by explicitly setting the topology to
> something weird like sockets=17,cores=3,threads=3 so that some APIC IDs
> get skipped.
>
> Of course, that doesn't let you exercise the interesting corner case of
> physical APIC ID 0xff though. I wonder if there's a way of doing it
> such that only CPU#0 and CPU#255 are *online* at boot, even if the rest
> theoretically exist?

You can have arbitrary (within -smp limits) vCPUs at startup time by using
  -device foo-cpu-type,topo-ids-here
(modulo the ones auto-created on behalf of the -smp X value).

The possible vCPUs for a given -M/-smp/-cpu combination can be listed with
the hotpluggable-cpus HMP command or its QMP counterpart.

> > @@ -814,7 +816,12 @@ static void apic_send_msi(MSIMessage *msi)
> >  {
> >      uint64_t addr = msi->address;
> >      uint32_t data = msi->data;
> > -    uint8_t dest = (addr & MSI_ADDR_DEST_ID_MASK) >> MSI_ADDR_DEST_ID_SHIFT;
> > +    uint32_t dest = (addr & MSI_ADDR_DEST_ID_MASK) >> MSI_ADDR_DEST_ID_SHIFT;
> > +    /*
> > +     * The higher 3 bytes of destination id is stored in higher word of
> > +     * msi address. See x86_iommu_irq_to_msi_message()
> > +     */
> > +    dest = dest | (addr >> 32);
> >
> > I am reading the manual now, looks like broadcast destination id in
> > x2APIC is 0xffff_ffff in both physical and logic mode.
>
> Yep, that looks about right.
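
For reference, the address-high trick and the broadcast difference discussed
above can be sketched standalone roughly as follows. This is only an
illustration, not the QEMU code: the helper names are made up, the two macros
are defined locally on the assumption that destination ID bits 7:0 live in
address bits 19:12, and the 0xffffffff broadcast value is the one read from
the manual above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed locally for this sketch: destination ID bits 7:0 sit in
 * bits 19:12 of the low word of the MSI address. */
#define MSI_ADDR_DEST_ID_SHIFT 12
#define MSI_ADDR_DEST_ID_MASK  0x000ff000u

/* Hypothetical helper: pack a 32-bit destination ID into a 64-bit MSI
 * address, low byte in the low word, upper 24 bits in the high word
 * (the same idea as msg.__addr_hi = irq->dest & 0xffffff00). */
static inline uint64_t msi_addr_encode(uint32_t dest)
{
    uint64_t lo = 0xfee00000u | ((dest & 0xffu) << MSI_ADDR_DEST_ID_SHIFT);
    uint64_t hi = dest & 0xffffff00u;

    return (hi << 32) | lo;
}

/* Hypothetical helper: recover the 32-bit destination ID. Unlike the hunk
 * above, the high word is masked so stray low bits there cannot leak in. */
static inline uint32_t msi_addr_decode(uint64_t addr)
{
    uint32_t dest = (uint32_t)((addr & MSI_ADDR_DEST_ID_MASK) >>
                               MSI_ADDR_DEST_ID_SHIFT);

    return dest | ((uint32_t)(addr >> 32) & 0xffffff00u);
}

/* Hypothetical helper: 0xff is broadcast only in xAPIC mode; in x2APIC
 * mode the broadcast ID is assumed to be 0xffffffff. */
static inline bool apic_dest_is_broadcast(uint32_t dest, bool x2apic)
{
    return x2apic ? dest == 0xffffffffu : dest == 0xffu;
}
```

With these, msi_addr_decode(msi_addr_encode(0x1ff)) gives back 0x1ff, whereas
the old uint8_t dest would have truncated it to 0xff, and
apic_dest_is_broadcast(0xff, true) is false, so APIC ID 255 is an ordinary
CPU once x2APIC is enabled.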