On 2011-05-18 18:33, Anthony Liguori wrote:
> On 05/18/2011 10:23 AM, Avi Kivity wrote:
>>> The tricky part is wiring this up efficiently for TCG, ie. in QEMU's
>>> softmmu. I played with passing the issuing CPUState (or NULL for
>>> devices) down the MMIO handler chain. Not totally beautiful as
>>> decentralized dispatching was still required, but at least only
>>> moderately invasive. Maybe your API allows for cleaning up the
>>> management and dispatching part, need to rethink...
>>
>> My suggestion is opposite - have a different MemoryRegion for each (e.g.
>> CPUState::memory). Then the TLBs will resolve to a different ram_addr_t
>> for the same physical address, for the local APIC range.
>
> I don't understand the different ram_addr_t part.
>
> The TLB should dispatch to a per-CPU dispatch table. The per-CPU should
> dispatch almost everything to a global dispatch table.
>
> The global dispatch table is the chipset (Northbridge/Southbridge).
>
> The chipset can then dispatch to individual busses which can then
> further dispatch as appropriate.
>
> Overlapping regions can be handled differently at each level. For
> instance, if a PCI device registers an IO region to the same location as
> the APIC, the APIC always wins because the PCI bus will never see the
> access.
>
> You cannot do this properly with a single dispatch table because the
> behavior depends on where in the hierarchy the I/O is being handled.

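
For reference, the layering described above can be sketched roughly as
follows. This is only an illustration in C under simplifying assumptions:
every type and function name (Region, DispatchTable, dispatch, ...) is
hypothetical and not part of QEMU, the hierarchy is flattened into a linear
chain instead of a tree, and the PCI BAR placement is made up so that it
overlaps the local APIC page at its default base 0xFEE00000.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical handler identifiers; a real design would use callbacks. */
typedef enum {
    HANDLER_NONE,
    HANDLER_LAPIC,
    HANDLER_RAM,
    HANDLER_PCI_DEV
} HandlerId;

/* One address range claimed at a given level of the hierarchy. */
typedef struct {
    uint64_t start;
    uint64_t size;
    HandlerId handler;
} Region;

/* One level of dispatch; 'next' is the level that sees unclaimed accesses. */
typedef struct DispatchTable {
    const Region *regions;
    size_t count;
    const struct DispatchTable *next;
} DispatchTable;

/* Walk the levels in order; the first level claiming the address wins,
 * so lower levels never see an access that was handled further up. */
static HandlerId dispatch(const DispatchTable *t, uint64_t addr)
{
    for (; t != NULL; t = t->next) {
        for (size_t i = 0; i < t->count; i++) {
            const Region *r = &t->regions[i];
            assert(r->size > 0); /* an empty region is a table bug */
            if (addr >= r->start && addr - r->start < r->size) {
                return r->handler;
            }
        }
    }
    return HANDLER_NONE;
}

/* PCI bus level: a made-up BAR that overlaps the APIC page. */
static const Region pci_regions[] = {
    { 0xFEE00000u, 0x10000u, HANDLER_PCI_DEV },
};
static const DispatchTable pci_bus = { pci_regions, 1, NULL };

/* Chipset (global) level: RAM, everything else goes to the PCI bus. */
static const Region chipset_regions[] = {
    { 0x00000000u, 0x80000000u, HANDLER_RAM },
};
static const DispatchTable chipset = { chipset_regions, 1, &pci_bus };

/* Per-CPU level: only the local APIC page, the rest falls through. */
static const Region cpu0_regions[] = {
    { 0xFEE00000u, 0x1000u, HANDLER_LAPIC },
};
static const DispatchTable cpu0 = { cpu0_regions, 1, &chipset };
```

With these tables, dispatch(&cpu0, 0xFEE00010) resolves to the local APIC
although the PCI BAR covers the same address, while an access that starts
at the chipset level (e.g. device-issued) for the same address reaches the
PCI device. Each level that does not claim an address costs one more table
walk per lookup.
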
Ah, now I remember why I did not follow that path: not invasiveness, but
performance concerns. I assume TLB refills account for a noticeable share
of TCG run time, and adding another lookup layer, probably for every
target, will have a measurable cost. I was wondering whether that is worth
the admittedly cleaner design.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux