> -----Original Message----- > From: lizhij...@fujitsu.com <lizhij...@fujitsu.com> > Sent: Thursday, March 31, 2022 9:15 AM > To: Zhang, Chen <chen.zh...@intel.com>; Jason Wang > <jasow...@redhat.com> > Cc: qemu-dev <qemu-devel@nongnu.org>; Like Xu <like...@linux.intel.com> > Subject: Re: [PATCH 2/4] net/colo: Fix a "double free" crash to clear the > conn_list > > > connection_track_table > -----+---------- > key1 | conn |-----------------------------------------------------------+ > -----+---------- | > key2 | conn |----------------------------------+ | > -----+---------- | | > key3 | conn |---------+ | | > -----+---------- | | | > | | | > | | | > + CompareState ++ | | | > | | V V V > +---------------+ +---------------+ +---------------+ > |conn_list +--->conn +--------->conn | connx > +---------------+ +---------------+ +---------------+ > | | | | | | > +---------------+ +---v----+ +---v----+ +---v----+ +---v----+ > |primary | |secondary |primary | |secondary > |packet | |packet + |packet | |packet + > +--------+ +--------+ +--------+ +--------+ > | | | | > +---v----+ +---v----+ +---v----+ +---v----+ > |primary | |secondary |primary | |secondary > |packet | |packet + |packet | |packet + > +--------+ +--------+ +--------+ +--------+ > | | | | > +---v----+ +---v----+ +---v----+ +---v----+ > |primary | |secondary |primary | |secondary > |packet | |packet + |packet | |packet + > +--------+ +--------+ +--------+ +--------+ > > I recalled that we should above relationships between > connection_track_table conn_list and conn. > That means both connection_track_table and conn_list reference to the > same conn instance. > > So before this patch, connection_get() is possible to use-after-free/double > free conn. where 1st was in > connection_hashtable_reset() and 2nd was > 221 while (!g_queue_is_empty(conn_list)) { > 222 connection_destroy(g_queue_pop_head(conn_list)); > 223 } > > I also doubt that your current abort was just due to above use-after- > free/double free. > If so, looks it's enough we just update to g_queue_clear(conn_list) in the 2nd > place.
Make sense, but It also means the original patch works here, skip free conn in connection_hashtable_reset() and do it in: 221 while (!g_queue_is_empty(conn_list)) { 222 connection_destroy(g_queue_pop_head(conn_list)); 223 }. It also avoid use-after-free/double free conn. Maybe we can keep the original version to fix it? Thanks Chen > > Thanks > Zhijian > > > On 28/03/2022 17:13, Zhang, Chen wrote: > > > >> -----Original Message----- > >> From: lizhij...@fujitsu.com <lizhij...@fujitsu.com> > >> Sent: Monday, March 21, 2022 11:06 AM > >> To: Zhang, Chen <chen.zh...@intel.com>; Jason Wang > >> <jasow...@redhat.com>; lizhij...@fujitsu.com > >> Cc: qemu-dev <qemu-devel@nongnu.org>; Like Xu > >> <like...@linux.intel.com> > >> Subject: Re: [PATCH 2/4] net/colo: Fix a "double free" crash to clear > >> the conn_list > >> > >> > >> > >> On 09/03/2022 16:38, Zhang Chen wrote: > >>> We notice the QEMU may crash when the guest has too many incoming > >>> network connections with the following log: > >>> > >>> 15197@1593578622.668573:colo_proxy_main : colo proxy connection > >>> hashtable full, clear it > >>> free(): invalid pointer > >>> [1] 15195 abort (core dumped) qemu-system-x86_64 .... > >>> > >>> This is because we create the s->connection_track_table with > >>> g_hash_table_new_full() which is defined as: > >>> > >>> GHashTable * g_hash_table_new_full (GHashFunc hash_func, > >>> GEqualFunc key_equal_func, > >>> GDestroyNotify key_destroy_func, > >>> GDestroyNotify value_destroy_func); > >>> > >>> The fourth parameter connection_destroy() will be called to free the > >>> memory allocated for all 'Connection' values in the hashtable when > >>> we call g_hash_table_remove_all() in the connection_hashtable_reset(). > >>> > >>> It's unnecessary because we clear the conn_list explicitly later, > >>> and it's buggy when other agents try to call connection_get() with > >>> the same connection_track_table. > >>> > >>> Signed-off-by: Like Xu <like...@linux.intel.com> > >>> Signed-off-by: Zhang Chen <chen.zh...@intel.com> > >>> --- > >>> net/colo-compare.c | 2 +- > >>> net/filter-rewriter.c | 2 +- > >>> 2 files changed, 2 insertions(+), 2 deletions(-) > >>> > >>> diff --git a/net/colo-compare.c b/net/colo-compare.c index > >>> 62554b5b3c..ab054cfd21 100644 > >>> --- a/net/colo-compare.c > >>> +++ b/net/colo-compare.c > >>> @@ -1324,7 +1324,7 @@ static void > >> colo_compare_complete(UserCreatable *uc, Error **errp) > >>> s->connection_track_table = > >> g_hash_table_new_full(connection_key_hash, > >>> > >>> connection_key_equal, > >>> g_free, > >>> - > >>> connection_destroy); > >>> + NULL); > >> > >> 202 /* if not found, create a new connection and add to hash table */ > >> 203 Connection *connection_get(GHashTable *connection_track_table, > >> 204 ConnectionKey *key, > >> 205 GQueue *conn_list) > >> 206 { > >> 207 Connection *conn = > >> g_hash_table_lookup(connection_track_table, > >> key); > >> 208 > >> 209 if (conn == NULL) { > >> 210 ConnectionKey *new_key = g_memdup(key, sizeof(*key)); > >> 211 > >> 212 conn = connection_new(key); > >> 213 > >> 214 if (g_hash_table_size(connection_track_table) > > >> HASHTABLE_MAX_SIZE) { > >> 215 trace_colo_proxy_main("colo proxy connection hashtable > >> full," > >> 216 " clear it"); > >> 217 connection_hashtable_reset(connection_track_table); > >> > >> 197 void connection_hashtable_reset(GHashTable > >> *connection_track_table) > >> 198 { > >> 199 g_hash_table_remove_all(connection_track_table); > >> 200 } > >> > >> IIUC, above subroutine will do some cleanup explicitly. And before > >> your patch, connection_hashtable_reset() will release all keys and > >> their values in this hashtable. But now, you remove all keys and just > >> one value(conn_list) instead. Does it means other values will be leaked ? > > Thanks Zhijian. Re-think about it. Yes, I think you are right. > > It looks need a lock to avoid into connection_get() when triggered the clear > hashtable operation. > > What do you think? > > > > Thanks > > Chen > > > > > >> > >> 218 /* > >> 219 * clear the conn_list > >> 220 */ > >> 221 while (!g_queue_is_empty(conn_list)) { > >> 222 connection_destroy(g_queue_pop_head(conn_list)); > >> 223 } > >> 224 } > >> 225 > >> 226 g_hash_table_insert(connection_track_table, new_key, > >> conn); > >> 227 } > >> 228 > >> 229 return conn; > >> 230 } > >> > >> > >> Thanks > >> Zhijian > >> > >>> colo_compare_iothread(s); > >>> > >>> diff --git a/net/filter-rewriter.c b/net/filter-rewriter.c index > >>> bf05023dc3..c18c4c2019 100644 > >>> --- a/net/filter-rewriter.c > >>> +++ b/net/filter-rewriter.c > >>> @@ -383,7 +383,7 @@ static void colo_rewriter_setup(NetFilterState > >>> *nf, > >> Error **errp) > >>> s->connection_track_table = > >> g_hash_table_new_full(connection_key_hash, > >>> > >>> connection_key_equal, > >>> g_free, > >>> - > >>> connection_destroy); > >>> + NULL); > >>> s->incoming_queue = > >> qemu_new_net_queue(qemu_netfilter_pass_to_next, nf); > >>> } > >>>