Re: [PATCH v2 0/9] slab: Introduce dedicated bucket allocator
On 2024/03/05 18:10, Kees Cook wrote:
> Hi,
>
> Repeating the commit logs for patch 4 here:
>
> Dedicated caches are available for fixed size allocations via
> kmem_cache_alloc(), but for dynamically sized allocations there is only
> the global kmalloc API's set of buckets available. This means it isn't
> possible to separate specific sets of dynamically sized allocations into
> a separate collection of caches.
>
> This leads to a use-after-free exploitation weakness in the Linux
> kernel since many heap memory spraying/grooming attacks depend on using
> userspace-controllable dynamically sized allocations to collide with
> fixed size allocations that end up in the same cache.
>
> While CONFIG_RANDOM_KMALLOC_CACHES provides a probabilistic defense
> against these kinds of "type confusion" attacks, including for fixed
> same-size heap objects, we can create a complementary deterministic
> defense for dynamically sized allocations.
>
> In order to isolate user-controllable sized allocations from system
> allocations, introduce kmem_buckets_create(), which behaves like
> kmem_cache_create(). (The next patch will introduce kmem_buckets_alloc(),
> which behaves like kmem_cache_alloc().)

So can I say the vision here is to make all the kernel interfaces that
handle user space input use separate caches? That would in effect create
a "grey zone" between kernel space (trusted) and user space (untrusted)
memory. I've also thought that hardening on this "border" could be more
efficient and targeted than a mitigation that applies globally, e.g.
CONFIG_RANDOM_KMALLOC_CACHES.
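To make the question above concrete, here is a minimal sketch of how a driver that copies user-controlled, variably sized input might adopt the proposed API. The signatures are only assumed from the commit log's description (kmem_buckets_create() behaving like kmem_cache_create(), kmem_buckets_alloc() like kmalloc()); the foo_* names and argument choices are hypothetical, not taken from the series:

/* Sketch only: kmem_buckets_create()/kmem_buckets_alloc() signatures are
 * assumed from the commit log, and all foo_* names are made up. */
static kmem_buckets *foo_msg_buckets;

static int __init foo_init(void)
{
	/* A dedicated set of size buckets, isolated from the global kmalloc caches. */
	foo_msg_buckets = kmem_buckets_create("foo_msg", 0, 0, 0, NULL);
	return foo_msg_buckets ? 0 : -ENOMEM;
}

static void *foo_copy_msg_from_user(const void __user *src, size_t len)
{
	/* "len" is user-controlled, so this allocation lands in foo's own buckets. */
	void *buf = kmem_buckets_alloc(foo_msg_buckets, len, GFP_KERNEL);

	if (!buf)
		return ERR_PTR(-ENOMEM);
	if (copy_from_user(buf, src, len)) {
		kfree(buf);
		return ERR_PTR(-EFAULT);
	}
	return buf;
}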
Re: [PATCH v2 0/9] slab: Introduce dedicated bucket allocator
On 2024/03/08 4:31, Kees Cook wrote:
> On Wed, Mar 06, 2024 at 09:47:36AM +0800, GONG, Ruiqi wrote:
>> On 2024/03/05 18:10, Kees Cook wrote:
>>> [...]
>>> In order to isolate user-controllable sized allocations from system
>>> allocations, introduce kmem_buckets_create(), which behaves like
>>> kmem_cache_create(). (The next patch will introduce
>>> kmem_buckets_alloc(), which behaves like kmem_cache_alloc().)
>>
>> So can I say the vision here is to make all the kernel interfaces that
>> handle user space input use separate caches? That would in effect create
>> a "grey zone" between kernel space (trusted) and user space (untrusted)
>> memory. I've also thought that hardening on this "border" could be more
>> efficient and targeted than a mitigation that applies globally, e.g.
>> CONFIG_RANDOM_KMALLOC_CACHES.
>
> I think it ends up having a similar effect, yes. The more copies that
> move to memdup_user(), the more coverage is created. The main point is to
> just not share caches between different kinds of allocations. The most
> abused version of this is the userspace size-controllable allocations,
> which this targets.

I agree. Currently, if we want a stricter separation between user-space
controllable memory and the rest of kernel memory, fixed size allocations
can technically be converted to dedicated caches (i.e.
kmem_cache_create()), but for dynamically sized allocations I can't think
of any existing solution. With the APIs provided by this patch set, we
now have something that works.

> ... The existing caches (which could still be used for
> type confusion attacks when the sizes are sufficiently similar) have a
> good chance of being mitigated by CONFIG_RANDOM_KMALLOC_CACHES already,
> so this proposed change is just complementary, IMO.

Maybe in the future we could require that all user-kernel interfaces that
make use of slab caches use either kmem_cache_create() or
kmem_buckets_create()? ;)

> -Kees
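For comparison, the fixed-size half of the separation discussed above is already available today: a subsystem can give a structure reachable from userspace its own cache via kmem_cache_create(). A minimal sketch, where struct foo_req, the cache name, and the flag choice are placeholders rather than anything from this thread:

/* Illustration of the existing fixed-size option; foo_* names are hypothetical. */
struct foo_req {
	u32 cmd;
	u32 len;
	u64 cookie;
};

static struct kmem_cache *foo_req_cache;

static int __init foo_req_init(void)
{
	/* One object size, one dedicated cache, never shared with kmalloc users. */
	foo_req_cache = kmem_cache_create("foo_req", sizeof(struct foo_req),
					  0, SLAB_HWCACHE_ALIGN, NULL);
	return foo_req_cache ? 0 : -ENOMEM;
}

static struct foo_req *foo_req_alloc(void)
{
	return kmem_cache_alloc(foo_req_cache, GFP_KERNEL);
}

static void foo_req_free(struct foo_req *req)
{
	kmem_cache_free(foo_req_cache, req);
}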
[PATCH v2 1/2] slab: Adjust placement of __kvmalloc_node_noprof
Move __kvmalloc_node_noprof (and also kvfree* for consistency) into
mm/slub.c so that it can directly invoke __do_kmalloc_node, which is
needed for the next patch. Move kmalloc_gfp_adjust to slab.h since now
its two callers are in different .c files.

No functional changes intended.

Signed-off-by: GONG Ruiqi
---
 include/linux/slab.h |  22 +
 mm/slub.c            |  90 ++
 mm/util.c            | 112 ---
 3 files changed, 112 insertions(+), 112 deletions(-)

diff --git a/include/linux/slab.h b/include/linux/slab.h
index 09eedaecf120..0bf4cbf306fe 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -1101,4 +1101,26 @@ size_t kmalloc_size_roundup(size_t size);
 void __init kmem_cache_init_late(void);
 void __init kvfree_rcu_init(void);
 
+static inline gfp_t kmalloc_gfp_adjust(gfp_t flags, size_t size)
+{
+	/*
+	 * We want to attempt a large physically contiguous block first because
+	 * it is less likely to fragment multiple larger blocks and therefore
+	 * contribute to a long term fragmentation less than vmalloc fallback.
+	 * However make sure that larger requests are not too disruptive - no
+	 * OOM killer and no allocation failure warnings as we have a fallback.
+	 */
+	if (size > PAGE_SIZE) {
+		flags |= __GFP_NOWARN;
+
+		if (!(flags & __GFP_RETRY_MAYFAIL))
+			flags |= __GFP_NORETRY;
+
+		/* nofail semantic is implemented by the vmalloc fallback */
+		flags &= ~__GFP_NOFAIL;
+	}
+
+	return flags;
+}
+
 #endif	/* _LINUX_SLAB_H */
diff --git a/mm/slub.c b/mm/slub.c
index 1f50129dcfb3..0830894bb92c 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -4878,6 +4878,96 @@ void *krealloc_noprof(const void *p, size_t new_size, gfp_t flags)
 }
 EXPORT_SYMBOL(krealloc_noprof);
 
+/**
+ * __kvmalloc_node - attempt to allocate physically contiguous memory, but upon
+ * failure, fall back to non-contiguous (vmalloc) allocation.
+ * @size: size of the request.
+ * @b: which set of kmalloc buckets to allocate from.
+ * @flags: gfp mask for the allocation - must be compatible (superset) with GFP_KERNEL.
+ * @node: numa node to allocate from
+ *
+ * Uses kmalloc to get the memory but if the allocation fails then falls back
+ * to the vmalloc allocator. Use kvfree for freeing the memory.
+ *
+ * GFP_NOWAIT and GFP_ATOMIC are not supported, neither is the __GFP_NORETRY modifier.
+ * __GFP_RETRY_MAYFAIL is supported, and it should be used only if kmalloc is
+ * preferable to the vmalloc fallback, due to visible performance drawbacks.
+ *
+ * Return: pointer to the allocated memory of %NULL in case of failure
+ */
+void *__kvmalloc_node_noprof(DECL_BUCKET_PARAMS(size, b), gfp_t flags, int node)
+{
+	void *ret;
+
+	/*
+	 * It doesn't really make sense to fallback to vmalloc for sub page
+	 * requests
+	 */
+	ret = __kmalloc_node_noprof(PASS_BUCKET_PARAMS(size, b),
+				    kmalloc_gfp_adjust(flags, size),
+				    node);
+	if (ret || size <= PAGE_SIZE)
+		return ret;
+
+	/* non-sleeping allocations are not supported by vmalloc */
+	if (!gfpflags_allow_blocking(flags))
+		return NULL;
+
+	/* Don't even allow crazy sizes */
+	if (unlikely(size > INT_MAX)) {
+		WARN_ON_ONCE(!(flags & __GFP_NOWARN));
+		return NULL;
+	}
+
+	/*
+	 * kvmalloc() can always use VM_ALLOW_HUGE_VMAP,
+	 * since the callers already cannot assume anything
+	 * about the resulting pointer, and cannot play
+	 * protection games.
+	 */
+	return __vmalloc_node_range_noprof(size, 1, VMALLOC_START, VMALLOC_END,
+			flags, PAGE_KERNEL, VM_ALLOW_HUGE_VMAP,
+			node, __builtin_return_address(0));
+}
+EXPORT_SYMBOL(__kvmalloc_node_noprof);
+
+/**
+ * kvfree() - Free memory.
+ * @addr: Pointer to allocated memory.
+ *
+ * kvfree frees memory allocated by any of vmalloc(), kmalloc() or kvmalloc().
+ * It is slightly more efficient to use kfree() or vfree() if you are certain
+ * that you know which one to use.
+ *
+ * Context: Either preemptible task context or not-NMI interrupt.
+ */
+void kvfree(const void *addr)
+{
+	if (is_vmalloc_addr(addr))
+		vfree(addr);
+	else
+		kfree(addr);
+}
+EXPORT_SYMBOL(kvfree);
+
+/**
+ * kvfree_sensitive - Free a data object containing sensitive information.
+ * @addr: address of the data object to be freed.
+ * @len: length of the data object.
+ *
+ * Use the special memzero_explicit() function to clear the content of a
+ * kvmalloc'ed object containing sensitive data to make sure that the
+ * compiler won't optimize out the data clearing.
+ */
+void kvfree_sensitive(const void
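For readers less familiar with the kvmalloc family being moved here: the pattern it supports is "try kmalloc with the adjusted gfp flags first, fall back to vmalloc for large requests", and kvfree() accepts either backing allocation. A minimal usage sketch, where struct foo and its fields are placeholders rather than anything from this patch:

/* Hypothetical caller of kvmalloc_array()/kvfree(); foo_* names are made up. */
static int foo_resize_table(struct foo *f, unsigned int nr)
{
	struct foo_entry *table;

	/* May be backed by kmalloc or by vmalloc, depending on size and fragmentation. */
	table = kvmalloc_array(nr, sizeof(*table), GFP_KERNEL);
	if (!table)
		return -ENOMEM;

	kvfree(f->table);	/* correctly frees either backing type */
	f->table = table;
	f->nr = nr;
	return 0;
}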
[PATCH v2 2/2] slab: Achieve better kmalloc caches randomization in kvmalloc
As revealed by this writeup[1], due to the fact that __kmalloc_node
(now renamed to __kmalloc_node_noprof) is an exported symbol and will
never get inlined, using it in kvmalloc_node (now
__kvmalloc_node_noprof) makes the RET_IP inside it always point to the
same address:

    upper_caller
      kvmalloc
        kvmalloc_node
          kvmalloc_node_noprof
            __kvmalloc_node_noprof        <-- all macros all the way down here
              __kmalloc_node_noprof
                __do_kmalloc_node(.., _RET_IP_)
                  ...                     <-- _RET_IP_ points to here

That literally means all kmalloc invoked via kvmalloc would use the
same seed for cache randomization (CONFIG_RANDOM_KMALLOC_CACHES), which
makes this hardening non-functional.

The root cause of this problem, IMHO, is that RET_IP alone cannot
identify the actual allocation site when kmalloc is called inside
wrappers or helper functions, and I believe there could be similar
cases in other functions. Nevertheless, I haven't thought of any good
general solution, so for now let's solve this specific case first.

For __kvmalloc_node_noprof, replace __kmalloc_node_noprof and call
__do_kmalloc_node directly instead, so that RET_IP can take the return
address of kvmalloc and differentiate each kvmalloc invocation:

    upper_caller
      kvmalloc
        kvmalloc_node
          kvmalloc_node_noprof
            __kvmalloc_node_noprof        <-- all macros all the way down here
              __do_kmalloc_node(.., _RET_IP_)
                ...                       <-- _RET_IP_ points to here

Thanks to Tamás Koczka for the report and discussion!

Link: https://github.com/google/security-research/pull/83/files#diff-1604319b55a48c39a210ee52034ed7ff5b9cdc3d704d2d9e34eb230d19fae235R200 [1]
Reported-by: Tamás Koczka
Signed-off-by: GONG Ruiqi
---
 mm/slub.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/mm/slub.c b/mm/slub.c
index 0830894bb92c..46e884b77dca 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -4903,9 +4903,9 @@ void *__kvmalloc_node_noprof(DECL_BUCKET_PARAMS(size, b), gfp_t flags, int node)
 	 * It doesn't really make sense to fallback to vmalloc for sub page
 	 * requests
 	 */
-	ret = __kmalloc_node_noprof(PASS_BUCKET_PARAMS(size, b),
-				    kmalloc_gfp_adjust(flags, size),
-				    node);
+	ret = __do_kmalloc_node(size, PASS_BUCKET_PARAM(b),
+				kmalloc_gfp_adjust(flags, size),
+				node, _RET_IP_);
 	if (ret || size <= PAGE_SIZE)
 		return ret;
 
-- 
2.25.1
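The mechanism described in the commit log can be demonstrated outside the kernel. The stand-alone C sketch below is an illustration only, not kernel code: a non-inlined entry point that takes the return address one level too deep records the same "caller" for every distinct user, which is exactly what happens when kvmalloc funnels through the out-of-line __kmalloc_node_noprof:

/* Illustration only (userspace): why _RET_IP_ taken inside a non-inlined
 * wrapper collapses all call sites into one. */
#include <stdio.h>

#define RET_IP ((unsigned long)__builtin_return_address(0))

static void pick_cache(unsigned long caller)
{
	/* stands in for __do_kmalloc_node() feeding cache randomization */
	printf("caller cookie: %#lx\n", caller);
}

static __attribute__((noinline)) void kmalloc_like(void)
{
	/* RET_IP here is always the single call site inside kvmalloc_like() */
	pick_cache(RET_IP);
}

static __attribute__((noinline)) void kvmalloc_like(void)
{
	kmalloc_like();
	/* the patch effectively changes this to: pick_cache(RET_IP); */
}

int main(void)
{
	kvmalloc_like();	/* user A */
	kvmalloc_like();	/* user B: prints the same cookie as user A */
	return 0;
}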
[PATCH v2 0/2] Refine kmalloc caches randomization in kvmalloc
Hi,

v2: change the implementation as Vlastimil suggested
v1: https://lore.kernel.org/all/20250122074817.991060-1-gongrui...@huawei.com/

Tamás reported [1] that kmalloc cache randomization doesn't actually
work for kmalloc calls issued via kvmalloc. For more details, see the
commit log of patch 2.

The fix requires a direct call from __kvmalloc_node_noprof to
__do_kmalloc_node, a static function in a different .c file. Compared
to v1, this version achieves that by simply moving
__kvmalloc_node_noprof to mm/slub.c, as suggested by Vlastimil [2].

Link: https://github.com/google/security-research/pull/83/files#diff-1604319b55a48c39a210ee52034ed7ff5b9cdc3d704d2d9e34eb230d19fae235R200 [1]
Link: https://lore.kernel.org/all/62044279-0c56-4185-97f7-7afac65ff...@suse.cz/ [2]

GONG Ruiqi (2):
  slab: Adjust placement of __kvmalloc_node_noprof
  slab: Achieve better kmalloc caches randomization in kvmalloc

 include/linux/slab.h |  22 +
 mm/slub.c            |  90 ++
 mm/util.c            | 112 ---
 3 files changed, 112 insertions(+), 112 deletions(-)

-- 
2.25.1
[PATCH] mm/slab: Achieve better kmalloc caches randomization in kvmalloc
As revealed by this writeup[1], due to the fact that __kmalloc_node
(now renamed to __kmalloc_node_noprof) is an exported symbol and will
never get inlined, using it in kvmalloc_node (now
__kvmalloc_node_noprof) makes the RET_IP inside it always point to the
same address:

    upper_caller
      kvmalloc
        kvmalloc_node
          kvmalloc_node_noprof
            __kvmalloc_node_noprof        <-- all macros all the way down here
              __kmalloc_node_noprof
                __do_kmalloc_node(.., _RET_IP_)
                  ...                     <-- _RET_IP_ points to here

That literally means all kmalloc invoked via kvmalloc would use the
same seed for cache randomization (CONFIG_RANDOM_KMALLOC_CACHES), which
makes this hardening non-functional.

The root cause of this problem, IMHO, is that RET_IP alone cannot
identify the actual allocation site when kmalloc is called inside
wrappers or helper functions, and I believe there could be similar
cases in other functions. Nevertheless, I haven't thought of any good
general solution, so for now let's solve this specific case first.

For __kvmalloc_node_noprof, replace __kmalloc_node_noprof with an
inline version, so that RET_IP can take the return address of kvmalloc
and differentiate each kvmalloc invocation:

    upper_caller
      kvmalloc
        kvmalloc_node
          kvmalloc_node_noprof
            __kvmalloc_node_noprof        <-- all macros all the way down here
              __kmalloc_node_inline(.., _RET_IP_)
                ...                       <-- _RET_IP_ points to here

Thanks to Tamás Koczka for the report and discussion!

Links: [1] https://github.com/google/security-research/pull/83/files#diff-1604319b55a48c39a210ee52034ed7ff5b9cdc3d704d2d9e34eb230d19fae235R200

Signed-off-by: GONG Ruiqi
---
 include/linux/slab.h | 3 +++
 mm/slub.c            | 7 +++
 mm/util.c            | 4 ++--
 3 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/include/linux/slab.h b/include/linux/slab.h
index 10a971c2bde3..e03ca4a95511 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -834,6 +834,9 @@ void *__kmalloc_large_noprof(size_t size, gfp_t flags)
 void *__kmalloc_large_node_noprof(size_t size, gfp_t flags, int node)
 				__assume_page_alignment __alloc_size(1);
 
+void *__kmalloc_node_inline(size_t size, kmem_buckets *b, gfp_t flags,
+			    int node, unsigned long caller);
+
 /**
  * kmalloc - allocate kernel memory
  * @size: how many bytes of memory are required.
diff --git a/mm/slub.c b/mm/slub.c
index c2151c9fee22..ec75070345c6 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -4319,6 +4319,13 @@ void *__kmalloc_node_track_caller_noprof(DECL_BUCKET_PARAMS(size, b), gfp_t flag
 }
 EXPORT_SYMBOL(__kmalloc_node_track_caller_noprof);
 
+__always_inline void *__kmalloc_node_inline(size_t size, kmem_buckets *b,
+					    gfp_t flags, int node,
+					    unsigned long caller)
+{
+	return __do_kmalloc_node(size, b, flags, node, caller);
+}
+
 void *__kmalloc_cache_noprof(struct kmem_cache *s, gfp_t gfpflags, size_t size)
 {
 	void *ret = slab_alloc_node(s, NULL, gfpflags, NUMA_NO_NODE,
diff --git a/mm/util.c b/mm/util.c
index 60aa40f612b8..3910d1d1f595 100644
--- a/mm/util.c
+++ b/mm/util.c
@@ -642,9 +642,9 @@ void *__kvmalloc_node_noprof(DECL_BUCKET_PARAMS(size, b), gfp_t flags, int node)
 	 * It doesn't really make sense to fallback to vmalloc for sub page
 	 * requests
 	 */
-	ret = __kmalloc_node_noprof(PASS_BUCKET_PARAMS(size, b),
+	ret = __kmalloc_node_inline(size, PASS_BUCKET_PARAM(b),
 				    kmalloc_gfp_adjust(flags, size),
-				    node);
+				    node, _RET_IP_);
 	if (ret || size <= PAGE_SIZE)
 		return ret;
 
-- 
2.25.1
Re: [PATCH] mm/slab: Achieve better kmalloc caches randomization in kvmalloc
On 2025/01/24 23:19, Vlastimil Babka wrote:
> On 1/22/25 17:02, Christoph Lameter (Ampere) wrote:
>> On Wed, 22 Jan 2025, GONG Ruiqi wrote:
>>
>>> +void *__kmalloc_node_inline(size_t size, kmem_buckets *b, gfp_t flags,
>>> +			    int node, unsigned long caller);
>>> +
>>
>> Huh? Is this inline? Where is the body of the function?
>>
>>> diff --git a/mm/slub.c b/mm/slub.c
>>> index c2151c9fee22..ec75070345c6 100644
>>> --- a/mm/slub.c
>>> +++ b/mm/slub.c
>>> @@ -4319,6 +4319,13 @@ void *__kmalloc_node_track_caller_noprof(DECL_BUCKET_PARAMS(size, b), gfp_t flag
>>>  }
>>>  EXPORT_SYMBOL(__kmalloc_node_track_caller_noprof);
>>>
>>> +__always_inline void *__kmalloc_node_inline(size_t size, kmem_buckets *b,
>>> +					    gfp_t flags, int node,
>>> +					    unsigned long caller)
>>> +{
>>> +	return __do_kmalloc_node(size, b, flags, node, caller);
>>> +}
>>> +
>>
>> inline functions need to be defined in the header file AFAICT.
>
> Yeah, this could possibly inline only with LTO (dunno if it does). But the
> real difference is passing __kvmalloc_node_noprof()'s _RET_IP_ as caller.
>
> Maybe instead of this new wrapper we could just move
> __kvmalloc_node_noprof() to slub.c and access __do_kmalloc_node() directly.
> For consistency also kvfree() and whatever necessary dependencies. The
> placement in util.c is kinda weird anyway and IIRC we already moved
> krealloc() due to needing deeper involvement with slab internals. The
> vmalloc part of kvmalloc/kvfree is kinda a self-contained fallback that can
> be just called from slub.c as well as from util.c.

Thanks for the advice! I will send a v2 based on moving
__kvmalloc_node_noprof() and kvfree() to slub.c as soon as possible.

BR,
Ruiqi
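To spell out the point raised here: an inline (or __always_inline) definition only helps callers that can see the function body, i.e. callers in the same translation unit or callers that include a header containing the body. A compressed illustration with made-up file and function names, not the actual kernel headers:

/* foo.h -- hypothetical names, illustration only */
void *helper_outofline(size_t size, unsigned long caller);	/* body lives in foo.c:
								 * callers in other .c files
								 * cannot inline it without
								 * LTO, even if foo.c marks
								 * it __always_inline */

static inline void *helper_header(size_t size, unsigned long caller)
{
	/* body visible to every includer, so it can be inlined at each call site */
	return helper_outofline(size, caller);
}

/* bar.c */
#include "foo.h"

void *bar_alloc(size_t size)
{
	/* Inlined here, so the recorded caller is bar_alloc()'s own return address. */
	return helper_header(size, (unsigned long)__builtin_return_address(0));
}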
[PATCH v3 2/2] slab: Achieve better kmalloc caches randomization in kvmalloc
As revealed by this writeup[1], due to the fact that __kmalloc_node
(now renamed to __kmalloc_node_noprof) is an exported symbol and will
never get inlined, using it in kvmalloc_node (now
__kvmalloc_node_noprof) makes the RET_IP inside it always point to the
same address:

    upper_caller
      kvmalloc
        kvmalloc_node
          kvmalloc_node_noprof
            __kvmalloc_node_noprof        <-- all macros all the way down here
              __kmalloc_node_noprof
                __do_kmalloc_node(.., _RET_IP_)
                  ...                     <-- _RET_IP_ points to here

That literally means all kmalloc invoked via kvmalloc would use the
same seed for cache randomization (CONFIG_RANDOM_KMALLOC_CACHES), which
makes this hardening non-functional.

The root cause of this problem, IMHO, is that RET_IP alone cannot
identify the actual allocation site when kmalloc is called inside
non-inlined wrappers or helper functions, and I believe there could be
similar cases in other functions. Nevertheless, I haven't thought of
any good general solution, so for now let's solve this specific case
first.

For __kvmalloc_node_noprof, replace __kmalloc_node_noprof and call
__do_kmalloc_node directly instead, so that RET_IP can take the return
address of kvmalloc and differentiate each kvmalloc invocation:

    upper_caller
      kvmalloc
        kvmalloc_node
          kvmalloc_node_noprof
            __kvmalloc_node_noprof        <-- all macros all the way down here
              __do_kmalloc_node(.., _RET_IP_)
                ...                       <-- _RET_IP_ points to here

Thanks to Tamás Koczka for the report and discussion!

Link: https://github.com/google/security-research/blob/908d59b573960dc0b90adda6f16f7017aca08609/pocs/linux/kernelctf/CVE-2024-27397_mitigation/docs/exploit.md?plain=1#L259 [1]
Reported-by: Tamás Koczka
Signed-off-by: GONG Ruiqi
---
 mm/slub.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/mm/slub.c b/mm/slub.c
index abc982d68feb..1f7d1d260eeb 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -4925,9 +4925,9 @@ void *__kvmalloc_node_noprof(DECL_BUCKET_PARAMS(size, b), gfp_t flags, int node)
 	 * It doesn't really make sense to fallback to vmalloc for sub page
 	 * requests
 	 */
-	ret = __kmalloc_node_noprof(PASS_BUCKET_PARAMS(size, b),
-				    kmalloc_gfp_adjust(flags, size),
-				    node);
+	ret = __do_kmalloc_node(size, PASS_BUCKET_PARAM(b),
+				kmalloc_gfp_adjust(flags, size),
+				node, _RET_IP_);
 	if (ret || size <= PAGE_SIZE)
 		return ret;
 
-- 
2.25.1
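For context on why a constant _RET_IP_ defeats the hardening: with CONFIG_RANDOM_KMALLOC_CACHES the kmalloc cache copy is chosen by hashing the caller address together with a per-boot random seed. The snippet below is a simplified sketch of that selection idea, not the exact code in include/linux/slab.h:

#include <linux/hash.h>
#include <linux/log2.h>
#include <linux/types.h>

/*
 * Simplified sketch of cache-copy selection. If every kvmalloc() user arrives
 * here with the same caller value, they all hash to the same copy and the
 * per-call-site randomization is lost.
 */
static inline unsigned int pick_kmalloc_copy(unsigned long caller, u64 seed,
					     unsigned int nr_copies)
{
	return hash_64(caller ^ seed, ilog2(nr_copies));
}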
[PATCH v3 1/2] slab: Adjust placement of __kvmalloc_node_noprof
Move __kvmalloc_node_noprof (as well as kvfree*, kvrealloc_noprof and
kmalloc_gfp_adjust for consistency) into mm/slub.c so that it can
directly invoke __do_kmalloc_node, which is needed for the next patch.

No functional changes intended.

Signed-off-by: GONG Ruiqi
---
 mm/slub.c | 162 ++
 mm/util.c | 162 --
 2 files changed, 162 insertions(+), 162 deletions(-)

diff --git a/mm/slub.c b/mm/slub.c
index 1f50129dcfb3..abc982d68feb 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -4878,6 +4878,168 @@ void *krealloc_noprof(const void *p, size_t new_size, gfp_t flags)
 }
 EXPORT_SYMBOL(krealloc_noprof);
 
+static gfp_t kmalloc_gfp_adjust(gfp_t flags, size_t size)
+{
+	/*
+	 * We want to attempt a large physically contiguous block first because
+	 * it is less likely to fragment multiple larger blocks and therefore
+	 * contribute to a long term fragmentation less than vmalloc fallback.
+	 * However make sure that larger requests are not too disruptive - no
+	 * OOM killer and no allocation failure warnings as we have a fallback.
+	 */
+	if (size > PAGE_SIZE) {
+		flags |= __GFP_NOWARN;
+
+		if (!(flags & __GFP_RETRY_MAYFAIL))
+			flags |= __GFP_NORETRY;
+
+		/* nofail semantic is implemented by the vmalloc fallback */
+		flags &= ~__GFP_NOFAIL;
+	}
+
+	return flags;
+}
+
+/**
+ * __kvmalloc_node - attempt to allocate physically contiguous memory, but upon
+ * failure, fall back to non-contiguous (vmalloc) allocation.
+ * @size: size of the request.
+ * @b: which set of kmalloc buckets to allocate from.
+ * @flags: gfp mask for the allocation - must be compatible (superset) with GFP_KERNEL.
+ * @node: numa node to allocate from
+ *
+ * Uses kmalloc to get the memory but if the allocation fails then falls back
+ * to the vmalloc allocator. Use kvfree for freeing the memory.
+ *
+ * GFP_NOWAIT and GFP_ATOMIC are not supported, neither is the __GFP_NORETRY modifier.
+ * __GFP_RETRY_MAYFAIL is supported, and it should be used only if kmalloc is
+ * preferable to the vmalloc fallback, due to visible performance drawbacks.
+ *
+ * Return: pointer to the allocated memory of %NULL in case of failure
+ */
+void *__kvmalloc_node_noprof(DECL_BUCKET_PARAMS(size, b), gfp_t flags, int node)
+{
+	void *ret;
+
+	/*
+	 * It doesn't really make sense to fallback to vmalloc for sub page
+	 * requests
+	 */
+	ret = __kmalloc_node_noprof(PASS_BUCKET_PARAMS(size, b),
+				    kmalloc_gfp_adjust(flags, size),
+				    node);
+	if (ret || size <= PAGE_SIZE)
+		return ret;
+
+	/* non-sleeping allocations are not supported by vmalloc */
+	if (!gfpflags_allow_blocking(flags))
+		return NULL;
+
+	/* Don't even allow crazy sizes */
+	if (unlikely(size > INT_MAX)) {
+		WARN_ON_ONCE(!(flags & __GFP_NOWARN));
+		return NULL;
+	}
+
+	/*
+	 * kvmalloc() can always use VM_ALLOW_HUGE_VMAP,
+	 * since the callers already cannot assume anything
+	 * about the resulting pointer, and cannot play
+	 * protection games.
+	 */
+	return __vmalloc_node_range_noprof(size, 1, VMALLOC_START, VMALLOC_END,
+			flags, PAGE_KERNEL, VM_ALLOW_HUGE_VMAP,
+			node, __builtin_return_address(0));
+}
+EXPORT_SYMBOL(__kvmalloc_node_noprof);
+
+/**
+ * kvfree() - Free memory.
+ * @addr: Pointer to allocated memory.
+ *
+ * kvfree frees memory allocated by any of vmalloc(), kmalloc() or kvmalloc().
+ * It is slightly more efficient to use kfree() or vfree() if you are certain
+ * that you know which one to use.
+ *
+ * Context: Either preemptible task context or not-NMI interrupt.
+ */
+void kvfree(const void *addr)
+{
+	if (is_vmalloc_addr(addr))
+		vfree(addr);
+	else
+		kfree(addr);
+}
+EXPORT_SYMBOL(kvfree);
+
+/**
+ * kvfree_sensitive - Free a data object containing sensitive information.
+ * @addr: address of the data object to be freed.
+ * @len: length of the data object.
+ *
+ * Use the special memzero_explicit() function to clear the content of a
+ * kvmalloc'ed object containing sensitive data to make sure that the
+ * compiler won't optimize out the data clearing.
+ */
+void kvfree_sensitive(const void *addr, size_t len)
+{
+	if (likely(!ZERO_OR_NULL_PTR(addr))) {
+		memzero_explicit((void *)addr, len);
+		kvfree(addr);
+	}
+}
+EXPORT_SYMBOL(kvfree_sensitive);
+
+/**
+ * kvrealloc - reallocate memory; contents remain unchanged
+ * @p: object to reallocate memory for
+ * @size: the size to reallocate
+ * @flags: the flags for the page level allocator
+ *
+ * If
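As a usage note on the kvfree_sensitive() helper being moved above: it pairs naturally with kvmalloc() for buffers that hold secrets, zeroing the contents with memzero_explicit() before freeing whichever backing allocation was used. A hypothetical example, where foo_use_key() is a placeholder consumer and not part of this patch:

/* Sketch only; foo_use_key() is made up. */
static int foo_load_key(const void __user *ukey, size_t len)
{
	void *key = kvmalloc(len, GFP_KERNEL);
	int err;

	if (!key)
		return -ENOMEM;
	if (copy_from_user(key, ukey, len))
		err = -EFAULT;
	else
		err = foo_use_key(key, len);

	/* zero the key material before freeing, whatever the backing allocator */
	kvfree_sensitive(key, len);
	return err;
}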
[PATCH v3 0/2] Refine kmalloc caches randomization in kvmalloc
Hi,

v3:
- move all the way from kmalloc_gfp_adjust to kvrealloc_noprof into mm/slub.c
- some rewording for commit logs
v2: https://lore.kernel.org/all/20250208014723.1514049-1-gongrui...@huawei.com/
- change the implementation as Vlastimil suggested
v1: https://lore.kernel.org/all/20250122074817.991060-1-gongrui...@huawei.com/

Tamás reported [1] that kmalloc cache randomization doesn't actually
work for kmalloc calls issued via kvmalloc. For more details, see the
commit log of patch 2.

The current solution requires a direct call from __kvmalloc_node_noprof
to __do_kmalloc_node, a static function in a different .c file. As
suggested by Vlastimil [2], it's achieved by simply moving
__kvmalloc_node_noprof from mm/util.c to mm/slub.c, together with some
other functions of the same family.

Link: https://github.com/google/security-research/blob/908d59b573960dc0b90adda6f16f7017aca08609/pocs/linux/kernelctf/CVE-2024-27397_mitigation/docs/exploit.md?plain=1#L259 [1]
Link: https://lore.kernel.org/all/62044279-0c56-4185-97f7-7afac65ff...@suse.cz/ [2]

GONG Ruiqi (2):
  slab: Adjust placement of __kvmalloc_node_noprof
  slab: Achieve better kmalloc caches randomization in kvmalloc

 mm/slub.c | 162 ++
 mm/util.c | 162 --
 2 files changed, 162 insertions(+), 162 deletions(-)

-- 
2.25.1