On 29.02.16 17:52, Stephen Warren wrote: > On 02/27/2016 05:09 AM, Alexander Graf wrote: >> >> >> On 26.02.16 20:25, Stephen Warren wrote: >>> On 02/25/2016 09:36 AM, Alexander Graf wrote: >>>> >>>> >>>> On 24.02.16 19:14, Stephen Warren wrote: >>>>> On 02/24/2016 05:11 AM, Alexander Graf wrote: >>>>>> The idea to generate our pages tables from an array of memory ranges >>>>>> is very sound. However, instead of hard coding the code to create up >>>>>> to 2 levels of 64k granule page tables, we really should just create >>>>>> normal 4k page tables that allow us to set caching attributes on 2M >>>>>> or 4k level later on. >>>>>> >>>>>> So this patch moves the full_va mapping code to 4k page size and >>>>>> makes it fully flexible to dynamically create as many levels as >>>>>> necessary for a map (including dynamic 1G/2M pages). It also adds >>>>>> support to dynamically split a large map into smaller ones when >>>>>> some code wants to set dcache attributes. >>>>>> >>>>>> With all this in place, there is very little reason to create your >>>>>> own page tables in board specific files. >>>>>> >>>>>> Signed-off-by: Alexander Graf <ag...@suse.de> >>>>> >>>>>> +/* >>>>>> + * This is a recursively called function to count the number of >>>>>> + * page tables we need to cover a particular PTE range. If you >>>>>> + * call this with level = -1 you basically get the full 48 bit >>>>>> + * coverage. >>>>>> + */ >>>>>> +static int count_required_pts(u64 addr, int level, u64 maxaddr) >>>>> >>>>> I think this looks correct now. Nits below if a respin is needed for >>>>> other reasons. >>>>> >>>>>> +{ >>>>>> + int levelshift = level2shift(level); >>>>>> + u64 levelsize = 1ULL << levelshift; >>>>>> + u64 levelmask = levelsize - 1; >>>>>> + u64 levelend = addr + levelsize; >>>>>> + int r = 0; >>>>>> + int i; >>>>>> + bool is_level = false; >>>>> >>>>> I might suggest renaming that is_level_needed. It's not obvious to me >>>>> exactly what the name "is_level" is intended to represent; the name >>>>> seems to represent whether something (unspecified) is a level or not. >>>> >>>> We're basically asking the function whether a PTE for address <addr> at >>>> level <level> would be an inval/block/level PTE. is_level marks it as a >>>> level pte. >>>> >>>> I could maybe rename this into pte_type and create an enum that is >>>> either PTE_INVAL, PTE_BLOCK or PTE_LEVEL. >>>> >>>>> >>>>>> + >>>>>> + for (i = 0; i < ARRAY_SIZE(mem_map); i++) { >>>>>> + struct mm_region *map = &mem_map[i]; >>>>>> + u64 start = map->base; >>>>>> + u64 end = start + map->size; >>>>>> + >>>>>> + /* Check if the PTE would overlap with the map */ >>>>>> + if (max(addr, start) <= min(levelend, end)) { >>>>>> + start = max(addr, start); >>>>>> + end = min(levelend, end); >>>>>> + >>>>>> + /* We need a sub-pt for this level */ >>>>>> + if ((start & levelmask) || (end & levelmask)) { >>>>>> + is_level = true; >>>>>> + break; >>>>>> + } >>>>>> + >>>>>> + /* Lv0 can not do block PTEs, so do levels here too */ >>>>>> + if (level <= 0) { >>>>>> + is_level = true; >>>>>> + break; >>>>>> + } >>>>>> + } >>>>>> + } >>>>>> + >>>>>> + /* >>>>>> + * Block PTEs at this level are already covered by the parent >>>>>> page >>>>>> + * table, so we only need to count sub page tables. >>>>>> + */ >>>>>> + if (is_level) { >>>>>> + int sublevel = level + 1; >>>>>> + u64 sublevelsize = 1ULL << level2shift(sublevel); >>>>>> + >>>>>> + /* Account for the new sub page table ... */ >>>>>> + r = 1; >>>>> >>>>> "Account for the page table at /this/ level"? This may represent the >>>>> top-level (0/1) page table, not just sub page tables. The sub-level >>>>> accounting is via the recursive call to count_required_pts() I >>>>> believe. >>>> >>>> I think the easiest way to visualize what's going on is if we start >>>> with >>>> level -1. >>>> >>>> We basically ask the function at level -1 whether a PTE at level -1 (48 >>>> bits) would fit into a block PTE at level -1 (so the PTE spans all 48 >>>> bits) or whether we need to create a new level entry plus page table >>>> for it. >>> >>> Hmm, I had overlooked that the call to count_required_pts() passed >>> "start_level - 1" not "start_level". Now that I see that, your >>> explanation makes more sense to me. >>> >>> It'd be nice if some/all of this explanation were comments in the code >>> to aid readers. >>> >>> That said, I would have expected a slightly more direct calculation; why >>> not: >>> >>> int start_level = 0; // or 1 if small VA space >>> total_pts = count_required_pts(start_level); >> >> We need to account for the count twice, to have an (immutable) copy of >> our page tables around when we split them. > > I think that's just a hard-coded "* 2" there, unless I'm missing your > point?
Yes :). > >>> total_pts += 4; // "random" number to account for later splits >>> >>> int count_required_pts(int level, ...) { >> >> ... includes the address, as I have it today, I guess? >> >>> int npts = 1; // the table at this level >>> for each pte slot at this level: >>> if a mem_map entry starts/stops within the pte slot: >>> npts += count_required_pts(level + 1, ...); >> >> This means we would count some levels multiple times. Imagine two >> entries from >> >> 1G - 1.25G >> 1.25G - 1.5G >> >> With your pseudo-code, we would count for both cases while checking for >> 1G page table entries. Hence we need to jump out of the loop here and do >> the counting outside. > > I don't think so; "if a mem_map entry starts/stops within the pte slot" > is a single decision for that PT entry at the current level, not a loop > that recurses once per memory block. Hence, you'd only ever recurse > once, so there's no double-counting. Ah, "if a" means another loop over all mem_maps, eventually saying "yes, I am overlapping". Either way, I am not fully convinced the code would end up much more readable than it is today, but I'll be happy to get persuaded otherwise :). And yes, that's definitely something we can still work on down the road in future follow-up patches. Alex _______________________________________________ U-Boot mailing list U-Boot@lists.denx.de http://lists.denx.de/mailman/listinfo/u-boot