On Fri, Feb 24, 2012 at 06:14:11PM +0530, Cherry G. Mathew wrote:
> On 24 February 2012 15:33, Manuel Bouyer <bou...@antioche.eu.org> wrote:
> > On Fri, Feb 24, 2012 at 03:00:03PM +0530, Cherry G. Mathew wrote:
> >> On 22 February 2012 18:31, Manuel Bouyer <bou...@antioche.eu.org> wrote:
> >> > On Wed, Feb 22, 2012 at 06:05:21PM +0530, Cherry G. Mathew wrote:
> >> >>
> >> >> I meant we could make it work, (it would already for amd64/xen since
> >> >> cpu_init_msrs() is called from cpu_hatch()) since xen has its own cpu.c
> >> >
> >> > i don't know if we can do the same for i386.
> >>
> >> It wasn't fun, but I managed to do it.
> >>
> >> btw, do you see a gdt page leaked between machdep.c:initgdt() and
> >> gdt.c:gdt_init() ?
> >
> > I can't see initgdt(), did you remove it ?
>
> No, it's right there:
> http://nxr.netbsd.org/xref/src/sys/arch/i386/i386/machdep.c#1096

OK, I was looking in amd64 code. Yes, it's quite possible that we waste a
page here.

>
> >
> >>
> >> > Also xpq_cpu() is time-critical; I guess a function pointer call is
> >> > faster than a test.
> >>
> >> Well, as a bonus of the early %gs/%fs setup now, I'm thinking of
> >> pruning the xpq_queue_update_xxx() in favour of pmap_set_xxx(). Also,
> >> I'll be revisiting the atomicity guarantees (eg: pmap_pte_cas()) of these
> >> functions, once we only start using them.
> >
> > AFAIK they're already all used by pmap ?
> >
>
> Mostly, but not everywhere.

There are places where they're not used on purpose (e.g. because we know
taking the lock or raising the IPL is not needed).

>
> > What I want to look at is *why* they're used. In some cases it's only
> > to collect PG_M/PG_D bits, and Xen has another, maybe more efficient
> > mechanism for that. This may allow us to batch more MMU updates.
> >
> > Also, I want to look at using more multicalls. This may decrease the
> > number of hypercalls significantly.
> >
>
> I wonder if there's a way to align that with pmap(9) assumptions.
> Quoting the manpage:
>
> " In order to cope with hardware architectures that make the invalidation
> of virtual address mappings expensive (e.g., TLB invalidations, TLB
> shootdown operations for multiple processors), the pmap module is allowed
> to delay mapping invalidation or protection operations until such time as
> they are actually necessary. The functions that are allowed to delay
> such actions are pmap_enter(), pmap_remove(), pmap_protect(),
> pmap_kenter_pa(), and pmap_kremove(). Callers of these functions must
> use the pmap_update() function to notify the pmap module that the
> mappings need to be made correct. Since the pmap module is provided with
> information as to which processors are using a given physical map, the
> pmap module may use whatever optimizations it has available to reduce the
> expense of virtual-to-physical mapping synchronization.
> "
> Since the XPQ can be regarded as a kind of TLB, I'm guessing we can
> attempt to marry the two apis ?

This is more or less what I had in mind. But, for the cases where we need
atomic operations, pmap_update() is not appropriate ...

-- 
Manuel Bouyer <bou...@antioche.eu.org>
     NetBSD: 26 ans d'experience feront toujours la difference
--
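
The deferred-update contract quoted from pmap(9), and the idea of treating
the update queue like a TLB, can be illustrated with a small toy model in C.
This is only a sketch under stated assumptions: every name in it
(toy_enter(), toy_update(), toy_page_table, flush_count, and the two size
constants) is hypothetical and is not taken from the NetBSD pmap or Xen xpq
code. Updates are queued by toy_enter() and only become visible when
toy_update() applies them as one batch, with flush_count standing in for the
number of hypercalls that would be issued.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy model; not the NetBSD pmap or Xen xpq implementation. */
#define PTE_COUNT 16   /* size of the toy page table (assumption) */
#define QUEUE_MAX 4    /* queued updates held before a forced flush (assumption) */

struct pte_update {
    size_t   index;    /* which toy PTE to change */
    uint64_t value;    /* new contents for that PTE */
};

static uint64_t toy_page_table[PTE_COUNT];
static struct pte_update pending[QUEUE_MAX];
static size_t npending;
static unsigned flush_count;   /* stands in for the number of hypercalls */

/* Apply every queued update in one batch, then empty the queue. */
static void toy_update(void)
{
    if (npending == 0)
        return;                /* nothing queued, so no batch is issued */
    for (size_t i = 0; i < npending; i++)
        toy_page_table[pending[i].index] = pending[i].value;
    npending = 0;
    flush_count++;
}

/* Queue an update; the page table is untouched until toy_update() runs. */
static void toy_enter(size_t index, uint64_t value)
{
    assert(index < PTE_COUNT);
    if (npending == QUEUE_MAX)
        toy_update();          /* queue is full: flush early */
    pending[npending].index = index;
    pending[npending].value = value;
    npending++;
}
```

In this model a caller queues several changes with toy_enter() and then
calls toy_update() once, so N changes cost one batch instead of N. An
operation that needs its result immediately, such as a compare-and-swap on a
PTE, does not fit this deferred path, which is the limitation noted above for
pmap_update().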