On 2015/12/16 1:37, Steven Rostedt wrote:
> On Tue, 15 Dec 2015 11:26:41 +0800
> "Zhang, Yanmin" <yanmin_zh...@linux.intel.com> wrote:
>
>>> This seems very hackish, although I can't think of a better way at the
>>> moment. But I would like not to add more code into module.c if
>>> possible, and just use a notifier unless there's a real reason we can't
>>> (as there was when we added the hook in the first place).
>>> We would double-check it again. It's not a good idea to change common
>>> module codes.  
>> I double-checked it. If using notifier, we have to add 2 new module 
>> notification
>> events in enum module_state.
>> For example, MODULE_STATE_INITING, MODULE_STATE_INITING_FAILED.
>> At MODULE_STATE_INITING, call ftrace_module_init(mod);
>> At MODULE_STATE_INITING_FAILED, call ftrace_release_mod (or similar 
>> function);
>> At MODULE_STATE_COMING, call a new function which links the new start_pg
>> to ftrace_pages.
>> Such design can also fix another bug in current kernel that ftrace start_pg
>> of the module is not cleaned up if complete_formation fails.
>> However, we change module common codes. The new events only serve ftrace.
>>
>> If not holding ftrace_mutex across start_pg setup and complete_formation,
>> some other variables are not protected well, such like ftrace_pages, 
>> ftrace_start_up, and so on.
> OK, that was basically the reason the hook was added in the first place.
>
> The reason the modules notifier doesn't work here is that there's too
> many exits from the module code that may leave the mutex held.
>
> I thought about this some more. So the issue is that we add the module
> functions to the ftrace records. Then another task enables function
> tracing before the module is fully set up. As ftrace is enabling
> function tracing, the module sets its text to read-only, outside the
> scope of ftrace setting all text to read-write. When ftrace gets to the
> module code, it fails to do the modifications and you get the
> ftrace_bug() splat.
>
> Sounds right?
>
> Now, I really dislike the taking of the ftrace mutex and returning back
> to module handling. That is very subtle, fragile and prone to bugs. I
> took a step back to find another solution and I think I found one.
>
> Take the below patch and add to it (I'll keep this patch as mine, and
> you can submit another patch to do the following). You don't need to
> resend this patch, just send me a patch that does this:
>
>
> Add the module notifier that calls ftrace_module_enable(mod), removing
> it from ftrace_init_module(). Feel free to monkey with that function to
> be able to be called directly if it can't already.
>
> What my patch does is to create a new ftrace record flag DISABLED,
> which is set to all functions in a module as it is added. Then later on
> (in the module notifier that you will add), the flag is cleared. In
> between, the module functions will be ignored.
>
>
> If the module errors out and the notifier is not called, we don't care.
> The records will simply be removed, and the flag will be meaningless.
>
> I needed to move the code around to still let the module code get
> enabled if tracing was already enabled. I even cleaned that code up as
> I found it has gotten a bit messy.
>
> -- Steve
>
> diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
> index 4736a826baf5..660e7c698f3b 100644
> --- a/include/linux/ftrace.h
> +++ b/include/linux/ftrace.h
> @@ -357,6 +357,7 @@ bool is_ftrace_trampoline(unsigned long addr);
>   *  REGS    - the record wants the function to save regs
>   *  REGS_EN - the function is set up to save regs.
>   *  IPMODIFY - the record allows for the IP address to be changed.
> + *  DISABLED - the record is not ready to be touched yet
>   *
>   * When a new ftrace_ops is registered and wants a function to save
>   * pt_regs, the rec->flag REGS is set. When the function has been
> @@ -371,10 +372,11 @@ enum {
>       FTRACE_FL_TRAMP         = (1UL << 28),
>       FTRACE_FL_TRAMP_EN      = (1UL << 27),
>       FTRACE_FL_IPMODIFY      = (1UL << 26),
> +     FTRACE_FL_DISABLED      = (1UL << 25),
>  };
>  
> -#define FTRACE_REF_MAX_SHIFT 26
> -#define FTRACE_FL_BITS               6
> +#define FTRACE_REF_MAX_SHIFT 25
> +#define FTRACE_FL_BITS               7
>  #define FTRACE_FL_MASKED_BITS        ((1UL << FTRACE_FL_BITS) - 1)
>  #define FTRACE_FL_MASK               (FTRACE_FL_MASKED_BITS << 
> FTRACE_REF_MAX_SHIFT)
>  #define FTRACE_REF_MAX               ((1UL << FTRACE_REF_MAX_SHIFT) - 1)
> diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
> index e290a30f2d0b..e7e15874fb7c 100644
> --- a/kernel/trace/ftrace.c
> +++ b/kernel/trace/ftrace.c
> @@ -2023,6 +2023,9 @@ static int ftrace_check_record(struct dyn_ftrace *rec, 
> int enable, int update)
>  
>       ftrace_bug_type = FTRACE_BUG_UNKNOWN;
>  
> +     if (rec->flags & FTRACE_FL_DISABLED)
> +             return FTRACE_UPDATE_IGNORE;
> +
>       /*
>        * If we are updating calls:
>        *
> @@ -2849,64 +2852,41 @@ ops_references_rec(struct ftrace_ops *ops, struct 
> dyn_ftrace *rec)
>       return 1;
>  }
>  
> -static int referenced_filters(struct dyn_ftrace *rec)
> -{
> -     struct ftrace_ops *ops;
> -     int cnt = 0;
> -
> -     for (ops = ftrace_ops_list; ops != &ftrace_list_end; ops = ops->next) {
> -             if (ops_references_rec(ops, rec))
> -                 cnt++;
> -     }
> -
> -     return cnt;
> -}
> -
>  static int ftrace_update_code(struct module *mod, struct ftrace_page 
> *new_pgs)
>  {
>       struct ftrace_page *pg;
>       struct dyn_ftrace *p;
>       cycle_t start, stop;
>       unsigned long update_cnt = 0;
> -     unsigned long ref = 0;
> -     bool test = false;
> +     unsigned long rec_flags = 0;
>       int i;
>  
> +     start = ftrace_now(raw_smp_processor_id());
> +
>       /*
> -      * When adding a module, we need to check if tracers are
> -      * currently enabled and if they are set to trace all functions.
> -      * If they are, we need to enable the module functions as well
> -      * as update the reference counts for those function records.
> +      * When a module is loaded, this function is called to convert
> +      * the calls to mcount in its text to nops, and also to create
> +      * an entry in the ftrace data. Now, if ftrace is activated
> +      * after this call, but before the module sets its text to
> +      * read-only, the modification of enabling ftrace can fail if
> +      * the read-only is done while ftrace is converting the calls.
> +      * To prevent this, the module's records are set as disabled
> +      * and will be enabled after the call to set the module's text
> +      * to read-only.
>        */
> -     if (mod) {
> -             struct ftrace_ops *ops;
> -
> -             for (ops = ftrace_ops_list;
> -                  ops != &ftrace_list_end; ops = ops->next) {
> -                     if (ops->flags & FTRACE_OPS_FL_ENABLED) {
> -                             if (ops_traces_mod(ops))
> -                                     ref++;
> -                             else
> -                                     test = true;
> -                     }
> -             }
> -     }
> -
> -     start = ftrace_now(raw_smp_processor_id());
> +     if (mod)
> +             rec_flags |= FTRACE_FL_DISABLED;
>  
>       for (pg = new_pgs; pg; pg = pg->next) {
>  
>               for (i = 0; i < pg->index; i++) {
> -                     int cnt = ref;
>  
>                       /* If something went wrong, bail without enabling 
> anything */
>                       if (unlikely(ftrace_disabled))
>                               return -1;
>  
>                       p = &pg->records[i];
> -                     if (test)
> -                             cnt += referenced_filters(p);
> -                     p->flags = cnt;
> +                     p->flags = rec_flags;
>  
>                       /*
>                        * Do the initial record conversion from mcount jump
> @@ -2916,21 +2896,6 @@ static int ftrace_update_code(struct module *mod, 
> struct ftrace_page *new_pgs)
>                               break;
>  
>                       update_cnt++;
> -
> -                     /*
> -                      * If the tracing is enabled, go ahead and enable the 
> record.
> -                      *
> -                      * The reason not to enable the record immediatelly is 
> the
> -                      * inherent check of ftrace_make_nop/ftrace_make_call 
> for
> -                      * correct previous instructions.  Making first the NOP
> -                      * conversion puts the module to the correct state, thus
> -                      * passing the ftrace_make_call check.
> -                      */
> -                     if (ftrace_start_up && cnt) {
> -                             int failed = __ftrace_replace_code(p, 1);
> -                             if (failed)
> -                                     ftrace_bug(failed, p);
> -                     }
>               }
>       }
>  
> @@ -4938,6 +4903,19 @@ static int ftrace_process_locs(struct module *mod,
>  
>  #define next_to_ftrace_page(p) container_of(p, struct ftrace_page, next)
>  
> +static int referenced_filters(struct dyn_ftrace *rec)
> +{
> +     struct ftrace_ops *ops;
> +     int cnt = 0;
> +
> +     for (ops = ftrace_ops_list; ops != &ftrace_list_end; ops = ops->next) {
> +             if (ops_references_rec(ops, rec))
> +                 cnt++;
> +     }
> +
> +     return cnt;
> +}
> +
>  void ftrace_release_mod(struct module *mod)
>  {
>       struct dyn_ftrace *rec;
> @@ -4980,12 +4958,82 @@ void ftrace_release_mod(struct module *mod)
>       mutex_unlock(&ftrace_lock);
>  }
>  
> +static void ftrace_module_enable(struct module *mod)
> +{
> +     struct dyn_ftrace *rec;
> +     struct ftrace_page *pg;
> +
> +     mutex_lock(&ftrace_lock);
> +
> +     if (ftrace_disabled)
> +             goto out_unlock;
> +
> +     /*
> +      * If the tracing is enabled, go ahead and enable the record.
> +      *
> +      * The reason not to enable the record immediatelly is the
> +      * inherent check of ftrace_make_nop/ftrace_make_call for
> +      * correct previous instructions.  Making first the NOP
> +      * conversion puts the module to the correct state, thus
> +      * passing the ftrace_make_call check.
> +      *
> +      * We also delay this to after the module code already set the
> +      * text to read-only, as we now need to set it back to read-write
> +      * so that we can modify the text.
> +      */
> +     if (ftrace_start_up)
> +             ftrace_arch_code_modify_prepare();
> +
> +     do_for_each_ftrace_rec(pg, rec) {
> +             int cnt;
> +             /*
> +              * do_for_each_ftrace_rec() is a double loop.
> +              * module text shares the pg. If a record is
> +              * not part of this module, then skip this pg,
> +              * which the "break" will do.
> +              */
> +             if (!within_module_core(rec->ip, mod))
> +                     break;
> +
> +             cnt = 0;
> +
> +             /*
> +              * When adding a module, we need to check if tracers are
> +              * currently enabled and if they are, and can trace this record,
> +              * we need to enable the module functions as well as update the
> +              * reference counts for those function records.
> +              */
> +             if (ftrace_start_up)
> +                     cnt += referenced_filters(rec);
> +
> +             /* This clears FTRACE_FL_DISABLED */
> +             rec->flags = cnt;
> +
> +             if (ftrace_start_up && cnt) {
> +                     int failed = __ftrace_replace_code(rec, 1);

If we choose to call ftrace_module_enable when receiving module notification
  MODULE_STATE_COMING, TEXT section of the module is already changed to RO.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Reply via email to