> -----Original Message-----
> From: lizhij...@fujitsu.com <lizhij...@fujitsu.com>
> Sent: Thursday, March 31, 2022 9:15 AM
> To: Zhang, Chen <chen.zh...@intel.com>; Jason Wang
> <jasow...@redhat.com>
> Cc: qemu-dev <qemu-devel@nongnu.org>; Like Xu <like...@linux.intel.com>
> Subject: Re: [PATCH 2/4] net/colo: Fix a "double free" crash to clear the
> conn_list
> 
> 
> connection_track_table
> -----+----------
> key1 | conn    |-----------------------------------------------------------+
> -----+----------                                                           |
> key2 | conn    |----------------------------------+                        |
> -----+----------                                  |                        |
> key3 | conn    |---------+                        |                        |
> -----+----------         |                        |                        |
>                           |                        |                        |
>                           |                        |                        |
>      + CompareState ++    |                        |                        |
>      |               |    V                        V                        V
>      +---------------+   +---------------+         +---------------+
>      |conn_list      +--->conn           +--------->conn           |     connx
>      +---------------+   +---------------+         +---------------+
>      |               |     |           |             |          |
>      +---------------+ +---v----+  +---v----+    +---v----+ +---v----+
>                        |primary |  |secondary    |primary | |secondary
>                        |packet  |  |packet  +    |packet  | |packet  +
>                        +--------+  +--------+    +--------+ +--------+
>                            |           |             |          |
>                        +---v----+  +---v----+    +---v----+ +---v----+
>                        |primary |  |secondary    |primary | |secondary
>                        |packet  |  |packet  +    |packet  | |packet  +
>                        +--------+  +--------+    +--------+ +--------+
>                            |           |             |          |
>                        +---v----+  +---v----+    +---v----+ +---v----+
>                        |primary |  |secondary    |primary | |secondary
>                        |packet  |  |packet  +    |packet  | |packet  +
>                        +--------+  +--------+    +--------+ +--------+
> 
> I recalled that we should above relationships between
> connection_track_table conn_list and conn.
> That means both connection_track_table and conn_list reference to the
> same conn instance.
> 
> So before this patch, connection_get() is possible to use-after-free/double
> free conn. where 1st was in
> connection_hashtable_reset() and 2nd was
> 221             while (!g_queue_is_empty(conn_list)) {
> 222                 connection_destroy(g_queue_pop_head(conn_list));
> 223             }
> 
> I also doubt that your current abort was just due to above use-after-
> free/double free.
> If so, looks it's enough we just update to g_queue_clear(conn_list) in the 2nd
> place.

Make sense, but It also means the original patch works here, skip free conn in 
connection_hashtable_reset() and do it in:
221             while (!g_queue_is_empty(conn_list)) {
 222                 connection_destroy(g_queue_pop_head(conn_list));
 223             }.
It also avoid use-after-free/double free conn.
Maybe we can keep the original version to fix it?

Thanks
Chen

> 
> Thanks
> Zhijian
> 
> 
> On 28/03/2022 17:13, Zhang, Chen wrote:
> >
> >> -----Original Message-----
> >> From: lizhij...@fujitsu.com <lizhij...@fujitsu.com>
> >> Sent: Monday, March 21, 2022 11:06 AM
> >> To: Zhang, Chen <chen.zh...@intel.com>; Jason Wang
> >> <jasow...@redhat.com>; lizhij...@fujitsu.com
> >> Cc: qemu-dev <qemu-devel@nongnu.org>; Like Xu
> >> <like...@linux.intel.com>
> >> Subject: Re: [PATCH 2/4] net/colo: Fix a "double free" crash to clear
> >> the conn_list
> >>
> >>
> >>
> >> On 09/03/2022 16:38, Zhang Chen wrote:
> >>> We notice the QEMU may crash when the guest has too many incoming
> >>> network connections with the following log:
> >>>
> >>> 15197@1593578622.668573:colo_proxy_main : colo proxy connection
> >>> hashtable full, clear it
> >>> free(): invalid pointer
> >>> [1]    15195 abort (core dumped)  qemu-system-x86_64 ....
> >>>
> >>> This is because we create the s->connection_track_table with
> >>> g_hash_table_new_full() which is defined as:
> >>>
> >>> GHashTable * g_hash_table_new_full (GHashFunc hash_func,
> >>>                          GEqualFunc key_equal_func,
> >>>                          GDestroyNotify key_destroy_func,
> >>>                          GDestroyNotify value_destroy_func);
> >>>
> >>> The fourth parameter connection_destroy() will be called to free the
> >>> memory allocated for all 'Connection' values in the hashtable when
> >>> we call g_hash_table_remove_all() in the connection_hashtable_reset().
> >>>
> >>> It's unnecessary because we clear the conn_list explicitly later,
> >>> and it's buggy when other agents try to call connection_get() with
> >>> the same connection_track_table.
> >>>
> >>> Signed-off-by: Like Xu <like...@linux.intel.com>
> >>> Signed-off-by: Zhang Chen <chen.zh...@intel.com>
> >>> ---
> >>>    net/colo-compare.c    | 2 +-
> >>>    net/filter-rewriter.c | 2 +-
> >>>    2 files changed, 2 insertions(+), 2 deletions(-)
> >>>
> >>> diff --git a/net/colo-compare.c b/net/colo-compare.c index
> >>> 62554b5b3c..ab054cfd21 100644
> >>> --- a/net/colo-compare.c
> >>> +++ b/net/colo-compare.c
> >>> @@ -1324,7 +1324,7 @@ static void
> >> colo_compare_complete(UserCreatable *uc, Error **errp)
> >>>        s->connection_track_table =
> >> g_hash_table_new_full(connection_key_hash,
> >>>                                                          
> >>> connection_key_equal,
> >>>                                                          g_free,
> >>> -                                                      
> >>> connection_destroy);
> >>> +                                                      NULL);
> >>
> >> 202 /* if not found, create a new connection and add to hash table */
> >> 203 Connection *connection_get(GHashTable *connection_track_table,
> >> 204                            ConnectionKey *key,
> >> 205                            GQueue *conn_list)
> >> 206 {
> >> 207     Connection *conn =
> >> g_hash_table_lookup(connection_track_table,
> >> key);
> >> 208
> >> 209     if (conn == NULL) {
> >> 210         ConnectionKey *new_key = g_memdup(key, sizeof(*key));
> >> 211
> >> 212         conn = connection_new(key);
> >> 213
> >> 214         if (g_hash_table_size(connection_track_table) >
> >> HASHTABLE_MAX_SIZE) {
> >> 215             trace_colo_proxy_main("colo proxy connection hashtable 
> >> full,"
> >> 216                                   " clear it");
> >> 217 connection_hashtable_reset(connection_track_table);
> >>
> >> 197 void connection_hashtable_reset(GHashTable
> >> *connection_track_table)
> >> 198 {
> >> 199 g_hash_table_remove_all(connection_track_table);
> >> 200 }
> >>
> >> IIUC,  above subroutine will do some cleanup explicitly. And before
> >> your patch, connection_hashtable_reset() will release all keys and
> >> their values in this hashtable. But now, you remove all keys and just
> >> one value(conn_list) instead. Does it means other values will be leaked ?
> > Thanks Zhijian. Re-think about it. Yes, I think you are right.
> > It looks need a lock to avoid into connection_get() when triggered the clear
> hashtable operation.
> > What do you think?
> >
> > Thanks
> > Chen
> >
> >
> >>
> >> 218 /*
> >> 219              * clear the conn_list
> >> 220 */
> >> 221             while (!g_queue_is_empty(conn_list)) {
> >> 222 connection_destroy(g_queue_pop_head(conn_list));
> >> 223 }
> >> 224 }
> >> 225
> >> 226         g_hash_table_insert(connection_track_table, new_key,
> >> conn);
> >> 227 }
> >> 228
> >> 229     return conn;
> >> 230 }
> >>
> >>
> >> Thanks
> >> Zhijian
> >>
> >>>        colo_compare_iothread(s);
> >>>
> >>> diff --git a/net/filter-rewriter.c b/net/filter-rewriter.c index
> >>> bf05023dc3..c18c4c2019 100644
> >>> --- a/net/filter-rewriter.c
> >>> +++ b/net/filter-rewriter.c
> >>> @@ -383,7 +383,7 @@ static void colo_rewriter_setup(NetFilterState
> >>> *nf,
> >> Error **errp)
> >>>        s->connection_track_table =
> >> g_hash_table_new_full(connection_key_hash,
> >>>                                                          
> >>> connection_key_equal,
> >>>                                                          g_free,
> >>> -                                                      
> >>> connection_destroy);
> >>> +                                                      NULL);
> >>>        s->incoming_queue =
> >> qemu_new_net_queue(qemu_netfilter_pass_to_next, nf);
> >>>    }
> >>>

Reply via email to