On 24 February 2012 18:29, Manuel Bouyer <bou...@antioche.eu.org> wrote:
> On Fri, Feb 24, 2012 at 06:14:11PM +0530, Cherry G. Mathew wrote:
>> On 24 February 2012 15:33, Manuel Bouyer <bou...@antioche.eu.org> wrote:
>> > On Fri, Feb 24, 2012 at 03:00:03PM +0530, Cherry G. Mathew wrote:
>> >> On 22 February 2012 18:31, Manuel Bouyer <bou...@antioche.eu.org> wrote:
>> >> > On Wed, Feb 22, 2012 at 06:05:21PM +0530, Cherry G. Mathew wrote:
>> >> >>
>> >> >> I meant we could make it work, (it would already for amd64/xen since
>> >> >> cpu_init_msrs() is called from cpu_hatch()) since xen has its own cpu.c
>> >> >
>> >> > i don't know if we can do the same for i386.
>> >>
>> >> It wasn't fun, but I managed to do it.
>> >>
>> >> btw, do you see a gdt page leaked between machdep.c:initgdt() and
>> >> gdt.c:gdt_init() ?
>> >
>> > I can't see initgdt(), did you remove it ?
>>
>> No, it's right there:
>> http://nxr.netbsd.org/xref/src/sys/arch/i386/i386/machdep.c#1096
>
> OK, I was looking in amd64 code.
> Yes, it's quite possible that we waste a page here.
>
>> >> >
>> >>
>> >> > Also xpq_cpu() is time-critical; I guess a function pointer call is
>> >> > faster than a test.
>> >>
>> >> Well, as a bonus of the early %gs/%fs setup now, I'm thinking of
>> >> pruning the xpq_queue_update_xxx() in favour of pmap_set_xxx(). Also,
>> >> I'll be revisiting the atomicity guarantees (e.g. pmap_pte_cas()) of
>> >> these functions, once we only start using them.
>> >
>> > AFAIK they're already all used by pmap ?
>> >
>>
>> Mostly, but not everywhere.
>
> there are places where they're not used on purpose (e.g. because we
> know taking the lock or raising the IPL is not needed).
>

I've made a few changes to pmap.c where it looks harmless to do so, in
favour of consistency:

ftp://ftp.netbsd.org/pub/NetBSD/misc/cherry/tmp/xen-set-pte.diff

>>
>> > What I want to look at is *why* they're used. In some cases it's only
>> > to collect PG_M/PG_D bits, and Xen has another, maybe more efficient
>> > mechanism for that. This may allow us to batch more MMU updates.
>> >
>> > Also, I want to look at using more multicalls. This may decrease the
>> > number of hypercalls significantly.
>> >
>>
>> I wonder if there's a way to align that with pmap(9) assumptions.
>> Quoting the manpage:
>>
>> "In order to cope with hardware architectures that make the invalidation
>> of virtual address mappings expensive (e.g., TLB invalidations, TLB
>> shootdown operations for multiple processors), the pmap module is
>> allowed to delay mapping invalidation or protection operations until
>> such time as they are actually necessary.  The functions that are
>> allowed to delay such actions are pmap_enter(), pmap_remove(),
>> pmap_protect(), pmap_kenter_pa(), and pmap_kremove().  Callers of these
>> functions must use the pmap_update() function to notify the pmap module
>> that the mappings need to be made correct.  Since the pmap module is
>> provided with information as to which processors are using a given
>> physical map, the pmap module may use whatever optimizations it has
>> available to reduce the expense of virtual-to-physical mapping
>> synchronization."
>>
>> Since the XPQ can be regarded as a kind of TLB, I'm guessing we can
>> attempt to marry the two apis ?
>
> This is more or less what I had in mind. But, for the cases where
> we need atomic operations, pmap_update() is not appropriate ...
>

Very cool,

Cheers,
--
~Cherry
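
To make the "XPQ as a kind of TLB" idea concrete, here is a rough userland
sketch of how deferred updates and a pmap_update()-style sync point could
fit together. Everything in it is an assumption for illustration: the
sketch_* names, the queue length, and the flush counter (a stand-in for
the number of hypercalls) are all hypothetical, and none of it is the
actual xpq or pmap code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout; these are not NetBSD or Xen symbols. */
#define SKETCH_QUEUE_LEN 8

typedef uint64_t sketch_pte_t;

struct sketch_update {
	sketch_pte_t *ptep;	/* where to write */
	sketch_pte_t  val;	/* what to write */
};

struct sketch_queue {
	struct sketch_update ent[SKETCH_QUEUE_LEN];
	size_t   count;		/* pending entries */
	unsigned flushes;	/* stand-in for hypercalls issued */
};

/* Apply all pending updates in one batch; models a single hypercall. */
static void
sketch_flush(struct sketch_queue *q)
{
	if (q->count == 0)
		return;
	for (size_t i = 0; i < q->count; i++)
		*q->ent[i].ptep = q->ent[i].val;
	q->count = 0;
	q->flushes++;
}

/* Deferred path: queue the update, flushing first only if the queue is full. */
static void
sketch_queue_pte(struct sketch_queue *q, sketch_pte_t *ptep, sketch_pte_t val)
{
	if (q->count == SKETCH_QUEUE_LEN)
		sketch_flush(q);
	assert(q->count < SKETCH_QUEUE_LEN);
	q->ent[q->count].ptep = ptep;
	q->ent[q->count].val = val;
	q->count++;
}

/* pmap_update()-like sync point: afterwards every queued update is visible. */
static void
sketch_update(struct sketch_queue *q)
{
	sketch_flush(q);
}

/*
 * Immediate path, for callers that cannot wait for the sync point
 * (the "atomic operations" case): drain what is pending so ordering
 * is preserved, then write directly and count it as its own call.
 */
static void
sketch_set_pte_now(struct sketch_queue *q, sketch_pte_t *ptep, sketch_pte_t val)
{
	sketch_flush(q);
	*ptep = val;
	q->flushes++;
}
```

In this sketch, callers that tolerate delay use sketch_queue_pte() and
rely on sketch_update() to make the mappings correct, while the cases
that need the result at once go through sketch_set_pte_now() and pay
for a separate flush. That split is only one possible reading of the
discussion above.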