On Sat, Jan 13, 2018 at 12:02:26PM +0800, Baoquan He wrote:
>On 01/12/18 at 01:52pm, Luiz Capitulino wrote:
>> On Fri, 12 Jan 2018 10:47:53 +0800
>> Chao Fan <fanc.f...@cn.fujitsu.com> wrote:
>> 
>> > On Fri, Jan 12, 2018 at 10:31:52AM +0800, Baoquan He wrote:
>> > >On 01/11/18 at 10:04am, Kees Cook wrote:  
>> > >> On Thu, Jan 11, 2018 at 1:00 AM, Baoquan He <b...@redhat.com> wrote:  
>> > >> > Hi Luiz,
>> > >> >
>> > >> > On 01/04/18 at 11:21am, Luiz Capitulino wrote:  
>> > >> >> Having a generic kaslr parameter to control where the kernel is extracted
>> > >> >> is one solution for this problem.
>> > >> >>
>> > >> >> The general problem statement is that KASLR may break some kernel features
>> > >> >> depending on where the kernel is extracted. Two examples are hot-plugged
>> > >> >> memory (this series) and 1GB HugeTLB pages.
>> > >> >>
>> > >> >> The 1GB HugeTLB page issue is not specific to KVM guests. It just happens
>> > >> >> that there's a bunch of people running guests with up to 5GB of memory and
>> > >> >> with that amount of memory you have one or two 1GB pages and it is easier for
>> > >> >> KASLR to extract the kernel into a 1GB region and split a 1GB page. So,
>> > >> >> you may not get any 1GB pages at all when this happens. However, I can also
>> > >> >> reproduce this on bare-metal with lots of memory where I can lose a 1GB
>> > >> >> page from time to time.
>> > >> >>
>> > >> >> Having a kaslr_range= parameter solves both issues, but two major drawbacks
>> > >> >> are that it breaks existing setups and I guess users will have a very hard
>> > >> >> time choosing good ranges.
>> > >> >>
>> > >> >> Another idea would be to have a CONFIG_KASLR_RANGES, where each arch
>> > >> >> could have a list of ranges known to contain holes and/or immovable
>> > >> >> memory and only extract the kernel into those ranges.  
>> > >> >
>> > >> > If we add CONFIG_KASLR_RANGES, then a distro like RHEL will have this range
>> > >> > always, whether people need hugetlb or not.
>> > >> >
>> > >> > So in this case, what range do we need to avoid? Only [1G, 2G]?  
>> > >> 
>> > >> Any ranges like that that need to be avoided should be known at build
>> > >> time, so they should simply be added to the mem_avoid list that is
>> > >> already present in the KASLR code...  
>> > >
>> > >It seems KASLR doesn't have a solution that allows the user to specify
>> > >ranges to avoid for the kernel text KASLR stage only. memmap="!#$" can add
>> > >ranges to mem_avoid, but it also keeps them from being added to e820.
>> > >  
>> > 
>> > How about adding a new option, like "huge_page=nn@ss", which fills the
>> > regions into mem_avoid? This parameter would only be parsed during the
>> > KASLR stage; the subsequent memmap handling would not be executed.
>> 
>> If we add a new option, I think we should try to make it general enough
>> to satisfy both hugepages and the memory hotplug problem. Otherwise
>> we'll end up adding a new option for each feature KASLR breaks...
>
>Yes, this is my concern. We can take advantage of this opportunity to
>make it.
>
>> 
>> However, in the case of the 1GB page problem, I'm starting to think
>> that it may be possible to know which 1GB areas are already fragmented
>> and extract the kernel to one of those areas. I don't know if this would
>> help the memory hotplug issue though.

Hi Luiz,

Before this patchset, I tried parsing the ACPI SRAT table to get the
detailed memory information and then filter out the movable regions.
But that code was too heavy, so I changed my approach as Baoquan said.

>
>This is also the thing Chao is trying to solve. Since users may not
>know how to find those hotpluggable memory regions, Chao is trying to add a
>sysfs interface to export them, as extracted from ACPI SRAT.
>I wonder if hugetlb can do something similar.
>
>And the hugetlb issue only exists on systems with 4G of memory, right?
>On large memory systems there is no such problem.
>

Hi Baoquan,

I have been wondering about this problem too.
Since mem_avoid can only hold a limited number of regions, I asked Luiz
by email how many 1G huge pages a system needs.
He said the free areas may vary depending on the amount of memory, devices, etc.

So as I understand it, if a machine has 6G of memory with 2 suitable
positions for 1G huge pages, and the system needs 2 huge pages, the bug
will also happen.

On the other hand, if there is a large amount of memory, there will be many
suitable regions and KASLR will break only one of them, so we don't need to
care about this bug. But I wonder where the boundary between these two
situations is. What is the threshold for this issue?
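
To make the two cases above concrete, here is a rough sketch in Python
(not kernel code). It assumes a 1G huge page needs a fully free,
1G-aligned chunk, which is a simplification. The function names and the
memory map are made up for illustration only.

```python
GB = 1 << 30
MB = 1 << 20

def gb_pages(regions):
    """Start addresses of every fully free, 1G-aligned 1G chunk.

    regions: list of (start, end) usable ranges, end exclusive.
    """
    pages = []
    for start, end in regions:
        addr = (start + GB - 1) // GB * GB   # round start up to 1G
        while addr + GB <= end:
            pages.append(addr)
            addr += GB
    return pages

def surviving_pages(regions, kernel_start, kernel_size):
    """1G chunks that do not overlap the extracted kernel image."""
    kernel_end = kernel_start + kernel_size
    return [p for p in gb_pages(regions)
            if kernel_end <= p or kernel_start >= p + GB]

# Hypothetical map with two suitable 1G positions: [1G, 2G) and [2G, 3G).
regions = [(1 * MB, 3 * GB)]

# Kernel placed below 1G: both positions survive.
low = surviving_pages(regions, 16 * MB, 64 * MB)

# Kernel placed inside [1G, 2G): only one position survives, so a
# system that needs 2 huge pages cannot get them.
high = surviving_pages(regions, 1 * GB + 16 * MB, 64 * MB)
```

With N suitable positions, a single kernel image smaller than 1G breaks at
most two of them (two only if it straddles a 1G boundary), so in this
simplified model the question is how N compares with the number of huge
pages the system needs.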

Thanks,
Chao Fan

>Thanks
>Baoquan
>
>

