Apologies for the late response, I wanted to check the code again.
On 10/07/2010 10:03 AM, Alan Cox wrote:
> At a high-level, I agree with much of what you say. In particular, if
> pmap_enter() is applied to a virtual address that is already mapped by a large
> page, the reported panic could result. However, barring bugs, for example, in
> memory allocation by the upper levels of the kernel, the panic inducing
> situation shouldn't occur.

Calls to malloc() for items larger than a page take a turn through UMA and eventually end up in kmem_malloc() via its page_alloc() routine. In turn, kmem_malloc() gets the pages from vm_page_alloc(), parks them in the kmem_object and maps them into the kernel_pmap in a loop, calling pmap_enter() for each page. The assigned VAs are pulled from kmem_map. Pages acquired through vm_page_alloc() may be backed by a superpage reservation and are thus eligible for auto-promotion.

Calls to free() initially take a similar route, ending up in kmem_free() via UMA's page_free() routine. From there the call path is vm_map_remove(), vm_map_delete(), then pmap_remove(). This logic indicates that, from the kernel/VM perspective, the malloc()/free() pair maps and unmaps pages as needed. However, the pmap layer never unmaps these pages as far as I can tell. The call path is pmap_remove(), pmap_remove_pte(), then pmap_unuse_pt(), which skips the removal because the VA >= VM_MAXUSER_ADDRESS.

Without superpages this works fine, because pmap_enter() allows replacement of already mapped pages, but it KASSERTs when asked to replace a mapping that falls within a superpage. I showed one way for this to happen in my first post. The problem is that the safety checks for promotion are tricked into promoting before all pages of an allocation are pmap_entered(). A toy model of this sequence is attached at the end of this mail.

> At a lower-level, it appears that you are misinterpreting what pmap_unuse_pt()
> does. It is not pmap_unuse_pt()'s responsibility to clear page table entries,
> either for user-space page tables or the kernel page table. When a region of
> the kernel virtual address space is deallocated, we do clear the kernel page
> table entries. See, for example, pmap_remove_pte() (or pmap_qremove()).

I probably messed up the terminology in my post, mixing kernel_map and kmem_map. Also, I'm ignoring for the moment where the page table pages come from, just observing that for VAs above user space the PTE entries are not cleared by calling pmap_remove(). pmap_qremove() does clear them, but it's not used by free().

> This special handling of the kernel page table does create another special
> case when we destroy a large page mapping within the kernel address space. We
> have to reinsert into the kernel page table the old page table page (for small
> page mappings) that was sitting idle while the kernel page table held a large
> page mapping in its place. At the same time, all of the page table entries in
> this page must be invalidated. This is handled by pmap_remove_pde().

I never understood why invalidation is required following promotions/demotions. The mapping does not change and the attributes are the same, so unless the hardware can't cope with co-existing 4K and 2M TLB entries, invalidation seems unnecessary. (The A and D bits may need to be handled, but that might still be quicker than sending IPIs to invalidate.)

> I'm curious to know more about what you were doing when you encountered this
> panic. Were you also using ISO images containing large MFS roots?

I saw it during sysctls I wrote to dump certain large kernel structures. The MO was to allocate a really big formatting buffer using malloc().
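Here is the toy model promised above: a minimal userland C sketch of the sequence I suspect. To be clear, enter(), remove_va() and try_promote() are made-up stand-ins for pmap_enter(), pmap_remove() and the promotion check, NPTE models the 512 4K PTEs covered by one 2MB page, and the assert() plays the role of the KASSERT in pmap_enter(); none of this is the real pmap code.

/*
 * Toy userland model of the suspected failure mode.  Not the
 * real pmap code; all names here are hypothetical stand-ins.
 */
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define NPTE 512                /* 4K PTEs per 2MB superpage */

static bool pte[NPTE];          /* valid bits of one page table page */
static bool promoted;           /* region mapped by a 2MB PDE? */

static void
try_promote(void)
{
        for (int i = 0; i < NPTE; i++)
                if (!pte[i])
                        return;         /* not fully populated yet */
        promoted = true;
        printf("promoted to a 2MB mapping\n");
}

static void
enter(int i)
{
        /* Stands in for the KASSERT in pmap_enter(): replacing a
         * mapping inside a live superpage is a panic. */
        assert(!promoted && "pmap_enter: attempted pmap_enter on 2MB page");
        pte[i] = true;          /* replacing a stale 4K mapping is fine */
        try_promote();
}

static void
remove_va(int i)
{
        /* Models my observation: for kernel VAs the PTE appears
         * to stay behind, so leave pte[i] set on purpose. */
        (void)i;
}

int
main(void)
{
        int i;

        /* A big allocation maps slots 4..511; slots 0..3 stay
         * invalid, so no promotion happens yet. */
        for (i = 4; i < NPTE; i++)
                enter(i);
        /* free() runs, but the stale PTEs remain valid. */
        for (i = 4; i < NPTE; i++)
                remove_va(i);
        /* A later 8-page allocation is given slots 0..7.  After
         * slot 3 the page table page looks fully valid (4..7 are
         * stale), so promotion fires before the allocation has
         * been fully pmap_entered()... */
        for (i = 0; i < 8; i++)
                enter(i);       /* ...and slot 4 trips the assert */
        return (0);
}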
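For completeness, in the same toy model the special case Alan describes for destroying a kernel large page mapping would look roughly like this; again, remove_pde() is a made-up counterpart of pmap_remove_pde(), not the real routine.

/* Made-up counterpart of pmap_remove_pde(): drop the 2MB
 * mapping, put the old 4K page table page back in place and
 * invalidate every entry in it before it is reused.  In the
 * real kernel this is also where the TLB shootdown happens. */
static void
remove_pde(void)
{
        promoted = false;               /* destroy the 2MB PDE */
        for (int i = 0; i < NPTE; i++)
                pte[i] = false;         /* invalidate all 4K PTEs */
}

If the free() path cleared the PTEs the way this teardown does, the stale entries in the sketch above would be gone and the premature promotion could not happen, which matches my reading of the problem.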
Truth be told, I did not see this on 2MB superpages; mine were much smaller and therefore more prone to this.

Sincerely,
Kurt A