On Tue, 2008-07-15 at 16:58 -0500, Kumar Gala wrote:
> Ben,
>
> I've been giving some thought to the new software managed TLBs and SMP
> issues.  I was wondering if you had any insights on how we should deal
> with the following issues:

As discussed on IRC (might interest others...)

> * tlb invalidates -- need to ensure we don't have multiple tlbsync's
> on the bus.  I'm thinking for e500/fsl we will move to IPI based
> invalidate broadcast and do invalidates locally
> (http://patchwork.ozlabs.org/linuxppc/patch?id=19657 )

Well, you can just have all your invalidations wrapped in a spinlock.

The "trick", of course, is for full-mm invalidates such as page table
teardown or fork: you want to avoid doing a lock/unlock & IPI for every
PTE. A way to do it is to do some batching, though it isn't trivial.

Without support for TLB invalidate all or by PID, what you can maybe do
is manually perform an invalidate by PID with a tlbre/tlbwe loop. Check
the worst case scenario of walking your entire TLB vs. small processes
that carry only a handful of PTEs...

You can use the batch interface to 'count' things on page table teardown
and decide, based on a threshold of invalidated PTEs, which approach is
more likely to be useful, but you can't really use the batch interface
for fork.

> * 64-bit PTEs and reader vs writer hazards.  How do we ensure that the
> TLB miss handler samples a consistent view of the pte.  pte_updates
> seem ok since we only update the flag word.  However set_pte_at seems
> like it could be problematic.

eieio on the writer and a data dependency on the reader.

segher suggested a nice way to do it on the reader side, by doing a subf
of the value from the pointer and then a lwzx using that value as an
offset, i.e. something like this, with r3 containing the PTE pointer:

	lwz	r10,4(r3)
	subf	r4,r10,r3	<-- you can use r3,r10,r3 if clobber is safe
	lwzx	r11,r10,r4	<-- in which case you use r3 here too

That ensures that the top half is loaded after the bottom half, which is
what you want if you do the set_pte_at() this way:

	stw	r11,0(r3)	<-- write top half first
	eieio			<-- maintain order to coherency domain
	stw	r10,4(r3)	<-- write bottom half last

In fact, in the reader case, while at it, you can interleave that with
the testing of the present bit. Assuming _PAGE_PRESENT is in the low
bits and you can clobber r3, you get something like:

	lwz	r10,4(r3)
	<-- can't do much here unless you can do unrelated things -->
	andi.	r0,r10,_PAGE_PRESENT
	subf	r3,r10,r3
	beq	page_fault
	lwzx	r11,r10,r3

Cheers,
Ben.

_______________________________________________
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev
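
To make the "count and decide on a threshold" idea for page table
teardown a bit more concrete, here is a minimal sketch in C. It is not
the kernel's batch interface: struct tlb_batch, the helper names and the
FLUSH_ALL_THRESHOLD value are all made up for illustration, and the real
break-even point would have to be measured against the cost of walking
the whole TLB.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical break-even point; the right value needs measuring. */
#define FLUSH_ALL_THRESHOLD 16

/* Hypothetical per-teardown accounting, not the kernel's own batch type. */
struct tlb_batch {
	size_t nr_ptes;		/* PTEs invalidated so far in this teardown */
};

static void tlb_batch_init(struct tlb_batch *b)
{
	assert(b != NULL);
	b->nr_ptes = 0;
}

/* Called once for each PTE cleared during the teardown. */
static void tlb_batch_add(struct tlb_batch *b)
{
	assert(b != NULL);
	b->nr_ptes++;
}

/*
 * Returns true when one walk of the whole TLB (invalidate by PID) is
 * likely cheaper than invalidating page by page, false otherwise.
 */
static bool tlb_batch_use_pid_walk(const struct tlb_batch *b)
{
	assert(b != NULL);
	return b->nr_ptes > FLUSH_ALL_THRESHOLD;
}
```

The teardown path would call tlb_batch_add() per cleared PTE and consult
tlb_batch_use_pid_walk() once at the end, so the lock and IPI are taken
once per batch rather than once per PTE.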
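
The writer/reader ordering for the two PTE halves can also be modelled
in portable C, as a rough analogue of the assembly above. This sketch
swaps in C11 fences: a release fence stands in for eieio on the writer,
and an acquire fence stands in for the subf/lwzx address dependency on
the reader, since C has no direct way to express that trick. struct
pte64, PTE_PRESENT and the function names are assumptions for
illustration only.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PTE_PRESENT 0x1u	/* hypothetical stand-in for _PAGE_PRESENT */

struct pte64 {
	_Atomic uint32_t hi;	/* top half, offset 0 */
	_Atomic uint32_t lo;	/* bottom half, offset 4, carries the flags */
};

/* Writer: top half first, barrier, then the bottom half with the flags. */
static void pte_set(struct pte64 *p, uint32_t hi, uint32_t lo)
{
	assert(p != NULL);
	atomic_store_explicit(&p->hi, hi, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);	/* plays the role of eieio */
	atomic_store_explicit(&p->lo, lo, memory_order_relaxed);
}

/* Reader: bottom half first, test the present bit, then the top half. */
static bool pte_read(struct pte64 *p, uint64_t *out)
{
	assert(p != NULL && out != NULL);
	uint32_t lo = atomic_load_explicit(&p->lo, memory_order_relaxed);

	if (!(lo & PTE_PRESENT))
		return false;		/* the page_fault path */
	/* Orders the second load after the first, like the dependency. */
	atomic_thread_fence(memory_order_acquire);
	uint32_t hi = atomic_load_explicit(&p->hi, memory_order_relaxed);

	*out = ((uint64_t)hi << 32) | lo;
	return true;
}
```

As in the assembly, the only guarantee is that a reader which observes
the newly written bottom half also observes the matching top half.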