On 10/23/2017 10:52 AM, C.Wehrmeyer wrote:
> On 2017-10-23 18:57, Michal Hocko wrote:
>> On Mon 23-10-17 18:46:59, C.Wehrmeyer wrote:
>>> On 23-10-17 18:13, Michal Hocko wrote:
>>>> On Mon 23-10-17 16:00:13, C.Wehrmeyer wrote:
>>>>> And just to be very sure I've added:
>>>>>
>>>>> if (madvise(buf1,ALLOC_SIZE_1,MADV_HUGEPAGE)) {
>>>>>           errno_tmp = errno;
>>>>>           fprintf(stderr,"madvise: %u\n",errno_tmp);
>>>>>           goto out;
>>>>> }
>>>>>
>>>>> /*Make sure the mapping is actually used*/
>>>>> memset(buf1,'!',ALLOC_SIZE_1);
>>>>
>>>> Is the buffer aligned to 2MB?
>>>
>>> When I omit MAP_HUGETLB for the flags that mmap receives - no.
>>>
>>> #define ALLOC_SIZE_1 (2 * 1024 * 1024)
>>> [...]
>>> buf1 = mmap (
>>>          NULL,
>>>          ALLOC_SIZE_1,
>>>          prot, /*PROT_READ | PROT_WRITE*/
>>>          flags /*MAP_PRIVATE | MAP_ANONYMOUS*/,
>>>          -1,
>>>          0
>>> );
>>>
>>> In such a case buf1 usually contains addresses which are aligned to 4 KiBs,
>>> such as 0x7f07d76e9000. 2-MiB-aligned addresses, such as 0x7f89f5e00000, are
>>> only produced with MAP_HUGETLB - which, if I understood the documentation
>>> correctly, is not the point of THPs as they are supposed to be transparent.
>>
>> yes. You can use posix_memalign
> 
> Useless. We don't use the memory allocation structures of malloc/free, and 
> yet that's exactly what this function requires us to do. The reason why we 
> use mmap and mremap is to get rid of userspace-crap in the first place.
> 
>> or you can mmap a larger block and
>> munmap the initial unaligned part.
> 
> And how is that supposed to be transparent? When I hear "transparent" I think 
> of a mechanism which I can put under a system so that it benefits from it, 
> while the system does not notice or at least does not need to be aware of it. 
> The system also does not need to be changed for it.
> 
> This approach is even more un-transparent than providing a flag to mmap in 
> order to make hugepages work correctly.

Well, at least this has a built-in fallback mechanism: if no huge page can be
allocated, the range is simply backed by base pages.  When using hugetlb(fs)
pages, you would need to handle the case where mremap fails due to a lack of
configured huge pages.
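
To illustrate the kind of handling I mean, here is a rough sketch, not taken
from your allocator.  The helper name map_hugetlb_or_fallback() and the
used_hugetlb out-parameter are made up for the example.  It shows the mmap
case rather than mremap, on the assumption that the same pattern applies: try
the hugetlb pool first and retry with ordinary pages when that fails.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical helper (name and interface are assumptions for this sketch).
 * Try to map `len` bytes from the hugetlb pool; if that fails, typically
 * with ENOMEM because no huge pages are configured or free, fall back to a
 * normal anonymous mapping. `len` is assumed to be a multiple of the huge
 * page size. Returns NULL if both attempts fail.
 */
static void *map_hugetlb_or_fallback(size_t len, int *used_hugetlb)
{
    int flags = MAP_PRIVATE | MAP_ANONYMOUS;
    void *p;

    /* First attempt: explicit huge pages from the preconfigured pool. */
    p = mmap(NULL, len, PROT_READ | PROT_WRITE, flags | MAP_HUGETLB, -1, 0);
    if (p != MAP_FAILED) {
        *used_hugetlb = 1;
        return p;
    }

    /* Fallback: base pages (which THP may or may not back later). */
    p = mmap(NULL, len, PROT_READ | PROT_WRITE, flags, -1, 0);
    *used_hugetlb = 0;
    return p == MAP_FAILED ? NULL : p;
}
```

The caller has to carry this second code path itself, which is the extra
burden compared to THP.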

I assume your allocator will be for somewhat general application usage.  Yet,
for the best reliability with hugetlb pages, the user/admin will need to know
at boot time how many huge pages will be needed and set that up.
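
For reference, the "mmap a larger block and munmap the unaligned part"
suggestion quoted above could look roughly like the following.  This is only
a sketch: the helper name mmap_thp_aligned() and the HPAGE_SIZE constant are
made up, a 2 MiB THP size is assumed (as on x86-64), and len is assumed to be
a multiple of the base page size.  madvise() is treated as advisory, so a
kernel without THP still returns usable memory.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

#define HPAGE_SIZE (2UL * 1024 * 1024) /* assumed THP size, not queried */

/*
 * Hypothetical helper: return `len` bytes of anonymous memory whose start
 * address is aligned to HPAGE_SIZE, or NULL on failure.
 */
static void *mmap_thp_aligned(size_t len)
{
    /* Over-allocate by one huge page so an aligned start must exist. */
    size_t span = len + HPAGE_SIZE;
    char *raw = mmap(NULL, span, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (raw == MAP_FAILED)
        return NULL;

    uintptr_t start = (uintptr_t)raw;
    uintptr_t aligned = (start + HPAGE_SIZE - 1) & ~(HPAGE_SIZE - 1);
    size_t head = aligned - start;      /* unaligned part in front */
    size_t tail = span - head - len;    /* leftover behind the block */

    /* Trim both ends so only the aligned block stays mapped. */
    if (head)
        munmap(raw, head);
    if (tail)
        munmap((char *)aligned + len, tail);

    /* Hint only: failure (e.g. THP not built in) is deliberately ignored. */
    madvise((void *)aligned, len, MADV_HUGEPAGE);
    return (void *)aligned;
}
```

With ALLOC_SIZE_1 from the quoted code, buf1 = mmap_thp_aligned(ALLOC_SIZE_1)
would replace the plain mmap() call, and the memset() afterwards gives the
kernel a chance to back the range with a huge page.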

-- 
Mike Kravetz
