On 2024/4/9 21:49, Masami Hiramatsu (Google) wrote:
On Tue, 9 Apr 2024 14:20:45 +0800
Zheng Yejian <zhengyeji...@huawei.com> wrote:

On 2024/4/8 20:41, Masami Hiramatsu (Google) wrote:
Hi Zheng,

On Mon, 8 Apr 2024 16:34:03 +0800
Zheng Yejian <zhengyeji...@huawei.com> wrote:

There is once warn in __arm_kprobe_ftrace() on:

   ret = ftrace_set_filter_ip(ops, (unsigned long)p->addr, 0, 0);
   if (WARN_ONCE(..., "Failed to arm kprobe-ftrace at %pS (error %d)\n", ...)
     return ret;

This warning is generated because 'p->addr' is detected to be not a valid
ftrace location in ftrace_set_filter_ip(). The ftrace address check is done
by check_ftrace_location() at the beginning of check_kprobe_address_safe().
At that point, ftrace_location(addr) == addr should return true if the
module is loaded. Then the module is searched twice:
    1. in is_module_text_address(), we find that 'p->addr' is in a module;
    2. in __module_text_address(), we find the module;

If the module has just been unloaded before the second search, then
'*probed_mod' is NULL and we would not go to get the module refcount,
then the return value of check_kprobe_address_safe() would be 0, but
actually we need to return -EINVAL.

OK, so you found a race window in check_kprobe_address_safe().

It does something like below.

check_kprobe_address_safe() {
        ...

        /* Timing [A] */

        if (!(core_kernel_text(p->addr) ||
                is_module_text_address(p->addr)) ||
                ...(other reserved address check)) {
                return -EINVAL;
        }

        /* Timing [B] */

        *probed_mod = __module_text_address(p->addr):
        if (*probe_mod) {
                if (!try_module_get(*probed_mod)) {
                        return -ENOENT;
                }
                ...     
        }
}

So, if p->addr is in a module which is alive at the timing [A], but
unloaded at timing [B], 'p->addr' is passed the
'is_module_text_address(p->addr)' check, but *probed_mod becomes NULL.
Thus the corresponding module is not referenced and kprobe_arm(p) will
access a wrong address (use after free).
This happens either kprobe on ftrace is enabled or not.

Yes, This is the problem. And for this case, check_kprobe_address_safe()
still return 0, and then going on to arm kprobe may cause problems. So
we should make check_kprobe_address_safe() return -EINVAL when refcount
of the module is not got.

Yes,



To fix this problem, we should move the mutex_lock(kprobe_mutex) before
check_kprobe_address_safe() because kprobe_module_callback() also lock it
so it can stop module unloading.

Can you ensure this will fix your problem?

It seems not, the warning in __arm_kprobe_ftrace() still occurs. I
contrived following simple test:

      #!/bin/bash
      sysctl -w kernel.panic_on_warn=1
      while [ True ]; do
          insmod mod.ko    # contain function 'foo'
          rmmod mod.ko
      done &
      while [ True ]; do
          insmod kprobe.ko  # register kprobe on function 'foo'
          rmmod kprobe.ko
      done &

I think holding kprobe_mutex cannot make sure we get the refcount of the
module.

Aah, yes, it cannot, because the kallsyms in a module will be removed
after module->state becomes MODULE_STATE_UNFORMED. Before UNFORMED,
the state is MODULE_STATE_GOING and the kprobe_module_callback() is
called at that point. Thus, the following scenario happens.

              CPU1                                      CPU2

mod->state = MODULE_STATE_GOING
kprobe_module_callback() {
        mutex_lock(&kprobe_mutex)
                loop on kprobe_table
                to disable kprobe in the module.
        mutex_unlock(&kprobe_mutex)
}
                                                register_kprobe(p) {
                                                        
mutex_lock(&kprobe_mutex)
                                                        
check_kprobe_address_safe(p->addr) {
                                                                [A'']
                                                                
is_module_text_address() return true
                                                                until 
mod->state == UNFORMED.
mod->state = MODULE_STATE_UNFORMED
                                                                [B'']
                                                                
__module_text_address() returns NULL.
                                                        }
                                                        p is on the 
kprobe_table.
                                                        
mutex_unlock(&kprobe_mutex)

So, as your fix, if we save the module at [A''] and use it at [B''],
the mod is NOT able to get because mod->state != MODULE_STATE_LIVE.



I think your patch is just optimizing but not fixing the fundamental
problem, which is we don't have an atomic search symbol and get module

Sorry, this patch is a little confusing, but it is not just optimizing :)

As shown below, after my patch, if p->addr is in a module which is alive
at the timing [A'] but unloaded at timing [B'], then *probed_mod must
not be NULL. Then after timing [B'], it will go to try_module_get() and
expected to fail and return -ENOENT. So this is the different.

      check_kprobe_address_safe() {
          ...
          *probed_mod = NULL;
          if (!core_kernel_text((unsigned long) p->addr)) {

              /* Timing [A'] */

              *probed_mod = __module_text_address((unsigned long) p->addr);
              if (!(*probed_mod)) {
                  return -EINVAL;
              }
          }
          ...

          /* Timing [B'] */

          if (*probed_mod) {
              if (!try_module_get(*probed_mod)) {
                  return -ENOENT;
              }
              ...
          }

OK, I got it. Hmm, but this is a bit long story to explain, the
root cause is the delay of module unloading process. So more
precisely, we can explain it as below.

----
When unloading a module, its state is changing MODULE_STATE_LIVE ->
  MODULE_STATE_GOING -> MODULE_STATE_UNFORMED. Each change will take
a time. `is_module_text_address()` and `__module_text_address()`
works with MODULE_STATE_LIVE and MODULE_STATE_GOING.
If we use `is_module_text_address()` and `__module_text_address()`
separately, there is a chance that the first one is succeeded but the
next one is failed because module->state becomes MODULE_STATE_UNFORMED
between those operations.

In `check_kprobe_address_safe()`, if the second `__module_text_address()`
is failed, that is ignored because it expected a kernel_text address.
But it may have failed simply because module->state has been changed
to MODULE_STATE_UNFORMED. In this case, arm_kprobe() will try to modify
non-exist module text address (use-after-free).

To fix this problem, we should not use separated `is_module_text_address()`
and `__module_text_address()`, but use only `__module_text_address()` once
and do `try_module_get(module)` which is only available with
MODULE_STATE_LIVE.
----

Would it be good for you too? The code itself looks good to me now :-)

Yes, of course :)


Thank you!


API. In that case, we should stop a whole module unloading system until
registering a new kprobe on a module. (After registering the kprobe,
the callback can mark it gone and disarm_kprobe does not work anymore.)

diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index 9d9095e81792..94eaefd1bc51 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -1633,11 +1633,11 @@ int register_kprobe(struct kprobe *p)
        p->nmissed = 0;
        INIT_LIST_HEAD(&p->list);
+ mutex_lock(&kprobe_mutex);
+
        ret = check_kprobe_address_safe(p, &probed_mod);
        if (ret)
-               return ret;
-
-       mutex_lock(&kprobe_mutex);
+               goto out;
if (on_func_entry)
                p->flags |= KPROBE_FLAG_ON_FUNC_ENTRY;

----

Thank you,


To fix it, originally we can simply check 'p->addr' is out of text again,
like below. But that would check twice respectively in kernel text and
module text, so finally I reduce them to be once.

    if (!(core_kernel_text((unsigned long) p->addr) ||
        is_module_text_address((unsigned long) p->addr)) || ...) {
        ret = -EINVAL;
        goto out;
    }
    ...
    *probed_mod = __module_text_address((unsigned long) p->addr);
    if (*probed_mod) {
        ...
    } else if (!core_kernel_text((unsigned long) p->addr)) { // check again!
        ret = -EINVAL;
        goto out;
    }

Signed-off-by: Zheng Yejian <zhengyeji...@huawei.com>
---
   kernel/kprobes.c | 18 ++++++++++++------
   1 file changed, 12 insertions(+), 6 deletions(-)

v2:
   - Update commit messages and comments as suggested by Masami.
     Link: 
https://lore.kernel.org/all/20240408115038.b0c85767bf1f249eccc32...@kernel.org/

v1:
   - Link: 
https://lore.kernel.org/all/20240407035904.2556645-1-zhengyeji...@huawei.com/

diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index 9d9095e81792..65adc815fc6e 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -1567,10 +1567,17 @@ static int check_kprobe_address_safe(struct kprobe *p,
        jump_label_lock();
        preempt_disable();
- /* Ensure it is not in reserved area nor out of text */
-       if (!(core_kernel_text((unsigned long) p->addr) ||
-           is_module_text_address((unsigned long) p->addr)) ||
-           in_gate_area_no_mm((unsigned long) p->addr) ||
+       /* Ensure the address is in a text area, and find a module if exists. */
+       *probed_mod = NULL;
+       if (!core_kernel_text((unsigned long) p->addr)) {
+               *probed_mod = __module_text_address((unsigned long) p->addr);
+               if (!(*probed_mod)) {
+                       ret = -EINVAL;
+                       goto out;
+               }
+       }
+       /* Ensure it is not in reserved area. */
+       if (in_gate_area_no_mm((unsigned long) p->addr) ||
            within_kprobe_blacklist((unsigned long) p->addr) ||
            jump_label_text_reserved(p->addr, p->addr) ||
            static_call_text_reserved(p->addr, p->addr) ||
@@ -1580,8 +1587,7 @@ static int check_kprobe_address_safe(struct kprobe *p,
                goto out;
        }
- /* Check if 'p' is probing a module. */
-       *probed_mod = __module_text_address((unsigned long) p->addr);
+       /* Get module refcount and reject __init functions for loaded modules. 
*/
        if (*probed_mod) {
                /*
                 * We must hold a refcount of the probed module while updating
--
2.25.1


--
Thanks
Zheng Yejian







Reply via email to