2010/2/26 Anton Maksimenkov <[email protected]>: > The idea might be like this. We don't use uvm_map_hint(). In > uvm_map_findspace() we get entry with biggest space. Then we generate > random from range 0...MIN(space-length, 256M). Then we search first > entry with space >= length+that_random. It surely exists, at least the > entry with biggest space. And at the end in uvm_map_findspace() we set .. > *result = entry->end + that_random;
The devil in details. To achieve that simplicity we need two (or just first one, see below) additional trees - uvm_tree_space_exec and uvm_tree_space_data: - uvm_tree_space_exec will contain entries with entry->start less than I386_MAX_EXE_ADDR. It will be defined only on i386 for now. - uvm_tree_space_data will contain entries with entry->start more or equal to MAXDSIZ (or BRKSIZ on vax, just as uvm_map_hint() do). For kernel virtual space it may be used to contain all it's entries. Then we can incorporate uvm_map_hint()'s job into uvm_map_findspace(). We can extract vm_prot_t prot = UVM_PROTECTION(flags). When uvm_map_findspace() will search for address it will not use "hint" (except for the UVM_FLAG_FIXED case). It will check "prot", as uvm_map_hint() do. On i386, if (prot & VM_PROT_EXECUTE) then it will search for "length" space in uvm_tree_space_exec. If not, then it will search in uvm_tree_space_data. But not for kernel virtual space - for this purpose the "full" uvm_tree_space will be used (see the diff in first mail). Or we can keep all kernel entries in uvm_tree_space_data, and do not use redundant uvm_tree_space at all. We need to decide how we will do randomization (but not for kernel virtual space for now; and not for vax). Previous version generates a "hint" and tree/list need to be scanned for entry with entry->end more that "hint". It is extra work, I try to find how to eliminate it. I try to find a way to do a "one shot search". We can generate random "that_random" in range 0...256M (or any) or, for (prot & VM_PROT_EXECUTE), in range 0...I386_MAX_EXE_ADDR/2. "that_random" will be used as "offset" for va. Then we search for entry with space >= length + that_random. If found, we set *result = entry->end + that_random; end return it. It is possible that this algorithm may start to fail earlier than current algorithm (in the end, it is all random). It is because current algorithm just generate a "hint" and don't care if the address will contain some offset from previous entry's end or not. New algorithm will always do random offset, as you see. I want to ask - what if we limit random range by 0...MIN(256M, (maxspace_data - length))? The maxspace_data is the maximum space in uvm_tree_space_data. IT can be found from entry with the biggest space or from (entry->start - MAXDSIZ) of the first entry with "start" > MAXDSIZ, if it that space will be bigger. And so 0...MIN(I386_MAX_EXE_ADDR/2, (maxspace_exec-length)) for uvm_tree_space_exec. Yes, it will reduce randomness when address space becomes more and more fragmented and there will be no hole more than 256M + length. But it still be a random! Generated "that_random" may become very small - we get some entry with small space, or bigger - we get some other with bigger space, and so on. Probably it is almost the same value of randomity as current algorithm provide. Of course, if requested length will be less than maxspace. If requested length will always be as much as maxspace the algorithm will always return an entry with maxspace. Only if there are more than one entry with same maxspace we can randomly choose one of them. But current version will do almost the same. If random "hint" will be less than entry with maxspace it will surely get this entry (entry with maxspace). If "hint" will be more than this entry (last such entry, if there are more than one with equal maxspace) - allocation just fail. There is a difference. Current algorithm may fail even if there IS entry with enough space (if that space will be layed lower than random "hint"). New algorithm will always successfully find this space. And if space > length it will always put random gap before allocated VA. Yes, that gap may become very small (or even zero) if requested length is as big as space. But current algorithm _may_ place same or less gap, or much more likely it will not place any gap (just set hint=entry->end) or fail. I might be not 100% correct somewhere, but let's discuss it. What do you think? -- antonvm
