Mahmoud Mandour <ma.mando...@gmail.com> writes:
> On Mon, Jul 19, 2021 at 1:08 PM Alex Bennée <alex.ben...@linaro.org> wrote:
> > Mahmoud Mandour <ma.mando...@gmail.com> writes:
> > > On Mon, Jul 19, 2021 at 11:48 AM Alex Bennée <alex.ben...@linaro.org> wrote:
> > > > Mahmoud Mandour <ma.mando...@gmail.com> writes:
> > > > > Since callbacks may be interleaved because of multithreaded
> > > > > execution, we should not make assumptions about `plugin_exit`
> > > > > either. The problem with `plugin_exit` is that it frees shared
> > > > > data structures (the caches and the `miss_ht` hash table). It
> > > > > should not be assumed that callbacks will not be called after it
> > > > > and hence use already-freed data structures.
> > > >
> > > > What was your test case for this? I wonder if it would be worth
> > > > coming up with one for check-tcg.
> > >
> > > I think just about any ad-hoc multithreaded execution will evoke the
> > > race pretty much consistently.
> >
> > I haven't managed to trigger it with testthread but of course my
> > libcache is trying to defend against it.
>
> https://pastebin.com/X4Xazemh
>
> That's a minimal program that reproduces the bug consistently for me
> (in my observation, a simple program that creates a bunch of threads
> that immediately exit does not produce the bug as frequently).
>
> You can also make hotblocks produce a similar crash (but make sure to
> add a g_hash_table_destroy(hotblocks) at the end of plugin_exit).

Thanks, I'll give that a try.

> > > > From what I remember the race is between preexit_cleanup and the
> > > > eventual _exit/exit_group which nixes all other child threads.
> > > > Maybe we should be triggering a more graceful unload here?
> > >
> > > I think so. This remedies the bug for this particular plugin, and I
> > > think there would be a better solution, of course. However, I just
> > > can't ever get the plugin_exit callback to be called more than once,
> > > so I think that's probably not the problem?
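For the archives, here is a rough sketch of what I understand the pastebin reproducer above to look like (the pastebin contents are not inlined in the thread, so this shape is purely an assumption): worker threads that keep generating memory accesses while main() returns without joining them, so the exit/exit_group path races against still-in-flight plugin callbacks.

```c
/* Hypothetical reproducer sketch -- NOT the actual pastebin program.
 * Build: gcc -O2 -pthread repro.c
 * Run, e.g.: qemu-x86_64 -plugin contrib/plugins/libcache.so ./a.out
 */
#include <pthread.h>
#include <string.h>

#define NTHREADS 8

static void *worker(void *arg)
{
    volatile char buf[4096];
    (void)arg;
    /* plenty of loads/stores to keep the mem-access callback firing */
    for (int i = 0; i < 100000; i++) {
        memset((void *)buf, i & 0xff, sizeof(buf));
    }
    return NULL;
}

/* Spawn the workers and return how many actually started. The caller
 * (main) is expected to return immediately WITHOUT joining them, which
 * is what provokes plugin_exit racing with in-flight callbacks. */
static int spawn_workers(void)
{
    pthread_t tids[NTHREADS];
    int started = 0;

    for (int i = 0; i < NTHREADS; i++) {
        if (pthread_create(&tids[i], NULL, worker, NULL) == 0) {
            started++;
        }
    }
    return started;
}
```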
> > > The problem is that we *use* the data in translation/mem_access/exec
> > > callbacks after plugin_exit has already been called (this can easily
> > > be verified by setting a boolean to true once plugin_exit is called
> > > and then g_assert-ing that this boolean is false in the callbacks).
> >
> > We have mechanisms for safely unloading plugins during running so I
> > think we should be able to do something cleanly here. I'll cook up an
> > RFC.
>
> I'll be waiting for it. Note that, as I think I mentioned in the cover
> letter, plugin_uninstall is probably problematic since it unregisters
> callbacks, but already-fired callbacks will continue executing. So
> calling plugin_uninstall at the end of plugin_exit does not relieve
> the bug...

That's OK - the plugin_uninstall contract explicitly says that callbacks
may still occur until the uninstall callback is called.

> > > > > This is mitigated in this commit by synchronizing the call to
> > > > > `plugin_exit` through locking to ensure exclusive access to data
> > > > > structures, and NULL-ifying those data structures so that
> > > > > subsequent callbacks can check whether the data structures are
> > > > > freed, and if so, immediately exit.
> > > > >
> > > > > It's okay to immediately exit and not account for those
> > > > > callbacks: they won't be accounted for anyway, since
> > > > > `plugin_exit` has already been called once and has reported the
> > > > > statistics.
> > > > >
> > > > > Signed-off-by: Mahmoud Mandour <ma.mando...@gmail.com>
> > > > > ---
> > > > >  contrib/plugins/cache.c | 31 ++++++++++++++++++++++++++++++-
> > > > >  1 file changed, 30 insertions(+), 1 deletion(-)
> > > > >
> > > > > diff --git a/contrib/plugins/cache.c b/contrib/plugins/cache.c
> > > > > index 695fb969dc..a452aba01c 100644
> > > > > --- a/contrib/plugins/cache.c
> > > > > +++ b/contrib/plugins/cache.c
> > > > > @@ -363,6 +363,11 @@ static void vcpu_mem_access(unsigned int vcpu_index, qemu_plugin_meminfo_t info,
> > > > >      effective_addr = hwaddr ? qemu_plugin_hwaddr_phys_addr(hwaddr) : vaddr;
> > > > >
> > > > >      g_mutex_lock(&mtx);
> > > > > +    if (dcache == NULL) {
> > > > > +        g_mutex_unlock(&mtx);
> > > > > +        return;
> > > > > +    }
> > > > > +
> > > > >      if (!access_cache(dcache, effective_addr)) {
> > > > >          insn = (InsnData *) userdata;
> > > > >          insn->dmisses++;
> > > > > @@ -380,6 +385,11 @@ static void vcpu_insn_exec(unsigned int vcpu_index, void *userdata)
> > > > >      g_mutex_lock(&mtx);
> > > > >      insn_addr = ((InsnData *) userdata)->addr;
> > > > >
> > > > > +    if (icache == NULL) {
> > > > > +        g_mutex_unlock(&mtx);
> > > > > +        return;
> > > > > +    }
> > > > > +
> > > > >      if (!access_cache(icache, insn_addr)) {
> > > > >          insn = (InsnData *) userdata;
> > > > >          insn->imisses++;
> > > > > @@ -406,12 +416,24 @@ static void vcpu_tb_trans(qemu_plugin_id_t id, struct qemu_plugin_tb *tb)
> > > > >          effective_addr = (uint64_t) qemu_plugin_insn_vaddr(insn);
> > > > >      }
> > > > >
> > > > > +    g_mutex_lock(&mtx);
> > > > > +
> > > > > +    /*
> > > > > +     * is the plugin_exit callback called? If so, any further callback
> > > > > +     * registration is useless as it won't get accounted for after calling
> > > > > +     * plugin_exit once already, and also will use miss_ht after it's freed
> > > > > +     */
> > > > > +    if (miss_ht == NULL) {
> > > > > +        g_mutex_unlock(&mtx);
> > > > > +        return;
> > > > > +    }
> > > > > +
> > > > >      /*
> > > > >       * Instructions might get translated multiple times, we do not create
> > > > >       * new entries for those instructions. Instead, we fetch the same
> > > > >       * entry from the hash table and register it for the callback again.
> > > > >       */
> > > > > -    g_mutex_lock(&mtx);
> > > > > +
> > > > >      data = g_hash_table_lookup(miss_ht, GUINT_TO_POINTER(effective_addr));
> > > > >      if (data == NULL) {
> > > > >          data = g_new0(InsnData, 1);
> > > > > @@ -527,13 +549,20 @@ static void log_top_insns()
> > > > >
> > > > >  static void plugin_exit(qemu_plugin_id_t id, void *p)
> > > > >  {
> > > > > +    g_mutex_lock(&mtx);
> > > > >      log_stats();
> > > > >      log_top_insns();
> > > > >
> > > > >      cache_free(dcache);
> > > > > +    dcache = NULL;
> > > > > +
> > > > >      cache_free(icache);
> > > > > +    icache = NULL;
> > > > >
> > > > >      g_hash_table_destroy(miss_ht);
> > > > > +    miss_ht = NULL;
> > > > > +
> > > > > +    g_mutex_unlock(&mtx);
> > > > >  }
> > > > >
> > > > >  static void policy_init()

--
Alex Bennée
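For anyone reading along without the full cache.c source, the pattern the patch applies can be shown in a self-contained way: free the shared state under a lock, NULL the pointer, and have every callback re-check the pointer under the same lock before touching it. This is only a minimal sketch using plain pthreads in place of GLib's GMutex; the names cache_t, shared_cache, on_access and plugin_exit_cb are made up for illustration.

```c
#include <pthread.h>
#include <stdlib.h>

/* Illustrative stand-in for the plugin's shared state (hypothetical). */
typedef struct { long accesses; } cache_t;

static pthread_mutex_t mtx = PTHREAD_MUTEX_INITIALIZER;
static cache_t *shared_cache;

static void cache_setup(void)
{
    shared_cache = calloc(1, sizeof(cache_t));
}

/* Mirrors vcpu_mem_access()/vcpu_insn_exec(): if the state was already
 * freed by the exit callback, drop the lock and bail out. Returns the
 * access count so far, or -1 if the exit callback has already run. */
static long on_access(void)
{
    long n;

    pthread_mutex_lock(&mtx);
    if (shared_cache == NULL) {     /* plugin_exit already ran */
        pthread_mutex_unlock(&mtx);
        return -1;
    }
    n = ++shared_cache->accesses;
    pthread_mutex_unlock(&mtx);
    return n;
}

/* Mirrors plugin_exit(): free and NULL-ify under the same lock, so a
 * late callback sees NULL instead of a dangling pointer. */
static void plugin_exit_cb(void)
{
    pthread_mutex_lock(&mtx);
    free(shared_cache);
    shared_cache = NULL;
    pthread_mutex_unlock(&mtx);
}
```

With this in place, a late on_access() arriving after plugin_exit_cb() is a harmless no-op rather than a use-after-free, which is exactly the behavior the patch gives the cache plugin's callbacks.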