On 14:33 Thu 29 Feb , Alex Bennée wrote:
> Luc Michel <luc.mic...@amd.com> writes:
>
> > On 15:09 Tue 27 Feb , Pierrick Bouvier wrote:
> >> On 2/27/24 2:54 PM, Luc Michel wrote:
> >> > Hi Pierrick,
> >> >
> >> > On 13:14 Mon 26 Feb , Pierrick Bouvier wrote:
> >> > > Signed-off-by: Pierrick Bouvier <pierrick.bouv...@linaro.org>
> >> > > ---
> >> > >  contrib/plugins/hotblocks.c | 50 ++++++++++++++++++++++---------------
> >> > >  1 file changed, 30 insertions(+), 20 deletions(-)
> >> > >
> >> > > diff --git a/contrib/plugins/hotblocks.c b/contrib/plugins/hotblocks.c
> >> > > index 4de1b134944..02bc5078bdd 100644
> >> > > --- a/contrib/plugins/hotblocks.c
> >> > > +++ b/contrib/plugins/hotblocks.c
> >> > > @@ -34,8 +34,8 @@ static guint64 limit = 20;
> >> > >   */
> >> > >  typedef struct {
> >> > >      uint64_t start_addr;
> >> > > -    uint64_t exec_count;
> >> > > -    int trans_count;
> >> > > +    struct qemu_plugin_scoreboard *exec_count;
> >> > > +    int trans_count;
> >> > >      unsigned long insns;
> >> > >  } ExecCount;
> >> > >
> >> > > @@ -43,7 +43,17 @@ static gint cmp_exec_count(gconstpointer a, gconstpointer b)
> >> > >  {
> >> > >      ExecCount *ea = (ExecCount *) a;
> >> > >      ExecCount *eb = (ExecCount *) b;
> >> > > -    return ea->exec_count > eb->exec_count ? -1 : 1;
> >> > > +    uint64_t count_a =
> >> > > +        qemu_plugin_u64_sum(qemu_plugin_scoreboard_u64(ea->exec_count));
> >> > > +    uint64_t count_b =
> >> > > +        qemu_plugin_u64_sum(qemu_plugin_scoreboard_u64(eb->exec_count));
> >> > > +    return count_a > count_b ? -1 : 1;
> >> > > +}
> >> > > +
> >> > > +static void exec_count_free(gpointer key, gpointer value, gpointer user_data)
> >> > > +{
> >> > > +    ExecCount *cnt = value;
> >> > > +    qemu_plugin_scoreboard_free(cnt->exec_count);
> >> > >  }
> >> > >
> >> > >  static void plugin_exit(qemu_plugin_id_t id, void *p)
> >> > > @@ -52,7 +62,6 @@ static void plugin_exit(qemu_plugin_id_t id, void *p)
> >> > >      GList *counts, *it;
> >> > >      int i;
> >> > >
> >> > > -    g_mutex_lock(&lock);
> >> >
> >> > I encountered cases before where the vCPUs continue executing while
> >> > plugin_exit is called. This can happen e.g., when QEMU calls exit(3)
> >> > from one CPU thread. Others will continue to run at the same time the
> >> > atexit callbacks are called.
> >> >
> >> > This also means that you can't really free the resources as you do at
> >> > the end of plugin_exit.
> >> >
> >>
> >> Interesting...
> >>
> >> The current documentation [1] mentions it's the right place to free
> >> resources, and from what I saw, existing plugins assume this too (see
> >> contrib/plugins/cache.c for instance).
> >>
> >> There is probably a bug related to the case you mention, and I'll try to
> >> reproduce this and see if we can have a proper fix.
> >>
> >> I'm not sure that removing cleanup code from existing plugins is the
> >> right thing to do at the moment, even though there is an existing issue
> >> with some programs.
> >
> > Yes, absolutely. The problem is on the QEMU side. FWIW I tried to fix one
> > of those exit cases (semihosted exit syscall) some time ago:
> > https://lore.kernel.org/qemu-devel/20220621125916.25257-1-lmic...@kalray.eu/
>
> The plugin exit handler should flush all instrumented code:
>
> /*
>  * Handle exit from linux-user. Unlike the normal atexit() mechanism
>  * we need to handle the clean-up manually as it's possible threads
>  * are still running. We need to remove all callbacks from code
>  * generation, flush the current translations and then we can safely
>  * trigger the exit callbacks.
>  */
> void qemu_plugin_user_exit(void)
> {
>     enum qemu_plugin_event ev;
>     CPUState *cpu;
>
>     /*
>      * Locking order: we must acquire locks in an order that is consistent
>      * with the one in fork_start(). That is:
>      *   - start_exclusive(), which acquires qemu_cpu_list_lock,
>      *     must be called before acquiring plugin.lock.
>      *   - tb_flush(), which acquires mmap_lock(), must be called
>      *     while plugin.lock is not held.
>      */
>     start_exclusive();
>
>     qemu_rec_mutex_lock(&plugin.lock);
>     /* un-register all callbacks except the final AT_EXIT one */
>     for (ev = 0; ev < QEMU_PLUGIN_EV_MAX; ev++) {
>         if (ev != QEMU_PLUGIN_EV_ATEXIT) {
>             struct qemu_plugin_cb *cb, *next;
>
>             QLIST_FOREACH_SAFE_RCU(cb, &plugin.cb_lists[ev], entry, next) {
>                 plugin_unregister_cb__locked(cb->ctx, ev);
>             }
>         }
>     }
>     CPU_FOREACH(cpu) {
>         qemu_plugin_disable_mem_helpers(cpu);
>     }
>     qemu_rec_mutex_unlock(&plugin.lock);
>
>     tb_flush(current_cpu);
>     end_exclusive();
>
>     /* now it's safe to handle the exit case */
>     qemu_plugin_atexit_cb();
> }
>
> So you shouldn't see any other instrumentation activity after the
> end_exclusive. However, you may see another thread hit its exit handler
> before we complete exiting the system.
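
For reference, the plugin-side pattern being discussed (a per-TB scoreboard
allocated at translation time, bumped inline for each vCPU, then summed and
freed from the plugin's atexit callback) looks roughly like the sketch below.
It is untested and purely illustrative: it reuses the API names from the
patch, such as qemu_plugin_scoreboard_new() and
qemu_plugin_register_vcpu_tb_exec_inline_per_vcpu(), but it is not the
actual contrib/plugins/hotblocks.c.

/*
 * Untested sketch, not the real hotblocks plugin: one per-vCPU u64
 * counter per translated block, incremented inline, then summed and
 * freed from the atexit callback (which, per the above, runs after the
 * instrumented code has been flushed).
 */
#include <inttypes.h>
#include <stdint.h>
#include <glib.h>
#include <qemu-plugin.h>

QEMU_PLUGIN_EXPORT int qemu_plugin_version = QEMU_PLUGIN_VERSION;

typedef struct {
    uint64_t start_addr;
    struct qemu_plugin_scoreboard *exec_count; /* one u64 slot per vCPU */
} ExecCount;

static GMutex lock;
static GHashTable *blocks;

static void vcpu_tb_trans(qemu_plugin_id_t id, struct qemu_plugin_tb *tb)
{
    uint64_t pc = qemu_plugin_tb_vaddr(tb);
    ExecCount *cnt;

    g_mutex_lock(&lock);
    cnt = g_hash_table_lookup(blocks, (gconstpointer) pc);
    if (!cnt) {
        cnt = g_new0(ExecCount, 1);
        cnt->start_addr = pc;
        /* per-vCPU storage: no locking needed on the execution path */
        cnt->exec_count = qemu_plugin_scoreboard_new(sizeof(uint64_t));
        g_hash_table_insert(blocks, (gpointer) pc, cnt);
    }
    g_mutex_unlock(&lock);

    /* bump this block's counter for the executing vCPU, inline */
    qemu_plugin_register_vcpu_tb_exec_inline_per_vcpu(
        tb, QEMU_PLUGIN_INLINE_ADD_U64,
        qemu_plugin_scoreboard_u64(cnt->exec_count), 1);
}

static void exec_count_free(gpointer key, gpointer value, gpointer unused)
{
    ExecCount *cnt = value;
    qemu_plugin_scoreboard_free(cnt->exec_count);
    g_free(cnt);
}

static void plugin_exit(qemu_plugin_id_t id, void *p)
{
    /* reached via qemu_plugin_atexit_cb(), i.e. after the tb_flush() above */
    GHashTableIter iter;
    gpointer key, value;

    g_hash_table_iter_init(&iter, blocks);
    while (g_hash_table_iter_next(&iter, &key, &value)) {
        ExecCount *cnt = value;
        uint64_t total =
            qemu_plugin_u64_sum(qemu_plugin_scoreboard_u64(cnt->exec_count));
        g_autofree gchar *line =
            g_strdup_printf("0x%" PRIx64 ": %" PRIu64 "\n",
                            cnt->start_addr, total);
        qemu_plugin_outs(line);
    }
    g_hash_table_foreach(blocks, exec_count_free, NULL);
    g_hash_table_destroy(blocks);
}

QEMU_PLUGIN_EXPORT
int qemu_plugin_install(qemu_plugin_id_t id, const qemu_info_t *info,
                        int argc, char **argv)
{
    blocks = g_hash_table_new(NULL, g_direct_equal);
    qemu_plugin_register_vcpu_tb_trans_cb(id, vcpu_tb_trans);
    qemu_plugin_register_atexit_cb(id, plugin_exit, NULL);
    return 0;
}

The point of the scoreboard is that the execution-time increment needs no
lock; only the translation-time hash table update and the final sum/free in
the atexit callback touch shared state.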
OK I was not aware of this. Anyway, I encountered those issues a long
time ago. It's probably not relevant anymore.

Thanks

--
Luc

>
> --
> Alex Bennée
> Virtualisation Tech Lead @ Linaro