[combining two messages and adding kib and alc to cc per Oliver Pinter]

>> the CPU's CR4 on entry to the kernel.

> It is %cr3.

Oops, well, easily fixed. :-)
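(For the record, %cr3 holds the physical address of the current L4
page table, which is why switching pmaps boils down to something like
the fragment below.  I am writing this from memory rather than quoting
pmap.c, so treat it as a sketch:

	/* Point %cr3 at the new pmap's L4 page (sketch, not verbatim). */
	cr3 = DMAP_TO_PHYS((vm_offset_t)pmap->pm_pml4);
	td->td_pcb->pcb_cr3 = cr3;
	load_cr3(cr3);

%cr4, by contrast, just holds feature-enable bits -- PSE, PGE, and
friends.)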
>> (If we used bcopy() to copy the kernel pmap's NKPML4E and NDMPML4E
>> entries into the new pmap, the L3 pages would not have to be
>> physically contiguous, but the KVA ones would still all have to
>> exist.  It's free to allocate physically contiguous pages here
>> anyway though.)

> I do not see how the physical continuity of the allocated page table
> pages is relevant there.

Not in create_pagetables(), no, but later in pmap_pinit(), which has
loops to set pmap->pm_pml4[x] for the kernel and direct-map entries.

And:

> Copying the L4 or L3 PTEs would cause serious complications.

Perhaps what I wrote was a little fuzzy.  Here's the pmap_pinit() code
I was referring to, as modified (the original version has only the
second loop -- it assumes NKPML4E is always 1, so it just sets
pml4[KPML4I]):

	pmap->pm_pml4 = (pml4_entry_t *)PHYS_TO_DMAP(VM_PAGE_TO_PHYS(pml4pg));

	if ((pml4pg->flags & PG_ZERO) == 0)
		pagezero(pmap->pm_pml4);

	for (i = 0; i < NKPML4E; i++) {
		pmap->pm_pml4[KPML4BASE + i] = (KPDPphys +
		    (i << PAGE_SHIFT)) | PG_RW | PG_V | PG_U;
	}
	for (i = 0; i < NDMPML4E; i++) {
		pmap->pm_pml4[DMPML4I + i] = (DMPDPphys +
		    (i << PAGE_SHIFT)) | PG_RW | PG_V | PG_U;
	}

	/* install self-referential address mapping entry(s) */

These require that KPDPphys and DMPDPphys both point to the first of
n physically-contiguous pages.  But suppose we did this instead (this
is deliberately simple for illustration, and furthermore I am assuming
here that vmspace0 never acquires any user-level L4 entries):

	pmap->pm_pml4 = (pml4_entry_t *)PHYS_TO_DMAP(VM_PAGE_TO_PHYS(pml4pg));

	/* Clear any junk and wire in kernel global address entries. */
	bcopy(vmspace0.vm_pmap.pm_pml4, pmap->pm_pml4, PAGE_SIZE);

	/* install self-referential address mapping entry(s) */

Now whatever we set up in create_pagetables() is simply copied to new
(user) pmaps, so we could go totally wild if we wanted. :-)

>> So, the last NKPML4E slots in KPML4phys point to the following
>> page tables, which use all of L3, L2, and L1 style PTEs.  (Note
>> that we did not need any L1 PTEs for the direct map, which always
>> uses 2MB or 1GB super-pages.)

> This is not quite true.  In the initial state, indeed all PTEs for
> direct map are superpages, either 1G or 2M.  But Intel states that a
> situation when the physical page has mappings with different caching
> modes causes undefined behaviour.  As a result, if a page is remapped
> with non-write-back caching attributes, the direct map has to demote
> the superpage and adjust the mapping attribute of the page frame for
> the page.

Yes, this particular bit of description was restricted to the setup
work in create_pagetables().  (Perhaps I should take out "always", or
substitute "initially"?)  Also, I think I left out a description of
the loop where some KPDphys entries are overwritten with 2MB mappings.
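For reference, that loop is the chunk of create_pagetables() that
re-points the low KPDphys entries at 2MB superpage mappings --
approximately the following, though I am reciting it from memory, so
check it against the real pmap.c before trusting the details:

	/*
	 * Map from zero to the end of the early allocations using
	 * 2MB pages.  This replaces some of the KPDphys entries set
	 * up above, so the corresponding KPTphys L1 pages go unused
	 * until/unless the 2MB mappings are later demoted.
	 */
	for (i = 0; (i << PDRSHIFT) < *firstaddr; i++)
		((pd_entry_t *)KPDphys)[i] = (i << PDRSHIFT) |
		    PG_RW | PG_V | PG_PS | PG_G;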
>> +AMD64_HUGE		opt_global.h

> Is this option needed ?  The SMAP is already parsed at the time of
> pmap_bootstrap() call, so you could determine the amount of physical
> memory and size the KVA map accordingly ?

Mostly I was afraid of the consequences on VM_MIN_KERNEL_ADDRESS,
which is #included so widely, and any complaints people might have
about:

 - wasting NKPML4E-2 (i.e., 14) pages on small amd64 systems (for the
   new empty L3 pages in KPDPphys that will likely not be used);

 - "wasting" yet another page, because dynamic memory will start at
   the first new L3 page (via KPML4BASE) instead of just using the
   KPML4I'th one, since VM_MIN_KERNEL_ADDRESS is now at -8TB instead
   of -.5TB -- with VM_MIN_KERNEL_ADDRESS at -.5TB, all KVAs use the
   single KPML4I'th slot;

 - wasting 30 more pages because NDMPML4E grew from 2 to 32; and

 - adding a loop to set up NKPML4E entries in every pmap, instead of
   the single "shove KPDPphys into one slot" code that used to be
   there, and making the pmap_pinit() loop run 32 times instead of
   just 2 for the direct map.

Adding these up, the option chews up 45 pages, or 180 kbytes, when
compared to the current setup (1 TB direct map, .5 TB kernel VM).
180 kbytes is pretty trivial if you're planning to have a couple of
terabytes of RAM, but on a tiny machine ... of course, if it's that
tiny, you could run it as i386, in 32-bit mode. :-)

If we copied the kernel's L4 table to new pmaps -- or even just put
in a new "ndmpdpphys" variable -- we could avoid allocating any pages
for DMPDPphys that we know won't actually be used.  That would fix
the "30 extra pages" item above, and even regain one page on many
amd64 setups (those with <= 512 GB).  We'd be down to just 14 extra
pages = 56 kbytes, plus the new loop in pmap_pinit().

Here's prototype code for sizing DMPDPphys, for illustration.

Old:

	DMPDPphys = allocpages(firstaddr, NDMPML4E);

New:

	ndmpdpphys = howmany(ndmpdp, NPML4EPG);
	if (ndmpdpphys > NDMPML4E)
		panic("something or other");	/* or shrink to fit? */
	DMPDPphys = allocpages(firstaddr, ndmpdpphys);

Then, instead of connecting NDMPML4E pages, connect ndmpdpphys of
them.  Would that break anything?  The direct-mapped VA range is
known in advance; if you get a bad value, it will "look" direct-mapped
but now not have an L3 page under it, whereas before it would always
have an L3 page, just no L2 page.  (Offhand, I think this would only
affect pmap_enter(), and calling that for an invalid physical address
would be bad anyway.)

[I'm also not sure whether we might be able to tweak the KPTphys
usage slightly to eliminate whole pages full of L1 PTEs.  E.g., if
the GENERIC kernel occupies about 15 MB, we can map it with 7 2MB
big-page entries in KPDphys, then just one "regular" PTE page and 256
"regular" PTEs in the first actually-used page of KPTphys.  (This
would recover another 7 pages in this particular example.)  But this
would at least affect pmap_init()'s loop over nkpt entries to
initialize the vm_page array entries that describe the KPTphys area,
so I did not attempt it.]

Chris