On Tue, 2006-12-19 at 16:23 -0800, Linus Torvalds wrote: > > On Wed, 20 Dec 2006, Peter Zijlstra wrote: > > On Mon, 2006-12-18 at 12:14 -0800, Linus Torvalds wrote: > > > OR: > > > > > > - page_mkclean_one() is simply buggy. > > > > GOLD! > > Ok. I was looking at that, and I wondered.. > > However, if that works, then I _think_ the correct sequence is the > following.. > > The rule should be: > - we flush the tlb _after_ we have cleared it, but _before_ we insert the > new entry. > > But I dunno. These things are damn subtle. Does this patch fix it for you?
I will try, but I had a look around the different architectures implementation of ptep_clear_flush_dirty() and saw that not all do the actual flush. So if we go down this road perhaps we should introduce another per arch function that does the potential flush. like flush_tlb_on_clear_dirty() or something like that. Then we could write: entry = ptep_get_and_clear(mm, address, ptep) flush_tlb_on_clear_dirty(vma, address); entry = pte_mkclean(entry); entry = pte_wrprotect(entry); set_pte_at(mm, address, ptep, entry); > I actually suspect we should do this as an arch-specific macro, and > totally replace the current "ptep_clear_flush_dirty()" with one that does > "ptep_clear_flush_dirty_and_set_wp()". > > Because what I'd _really_ prefer to do on x86 (and probably on most other > sane architectures) is to do > > - atomically replace the pte with the EXACT SAME ONE, but one that > has the writable bit clear. > > bit_clear(_PAGE_BIT_RW, &(ptep)->pte_low); > > - flush the TLB, making sure that all CPU's will no longer write to it: > > flush_tlb_page(vma, address); > > - finally, just fetch-and-clear the dirty bit (and since it's no longer > writable, nobody should be settign it any more) > > ret = bit_clear(__PAGE_BIT_DIRTY, &(ptep)->pte_low); > > and now we should be all done. Hmm, should we not flush after clearing the dirty bit? That is, why does ptep_clear_flush_dirty() need a flush after clearing that bit? does it leak through in the tlb copy? Also, what is this page_test_and_clear_dirty() business, that seems to be exclusively s390 btw. However they do seem to need this. > But the "ptep_get_and_clear() + flush_tlb_page()" sequence should > hopefully also work. Yeah, probably, not optimally so on some archs that don't actually need the flush though. And as above, I wonder about s390. (added our s390 friends to the CC list) - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/