colo: Fix a "double free" crash to clear the conn_list

lizhij...@fujitsu.com Wed, 30 Mar 2022 18:15:12 -0700

connection_track_table
-----+----------
key1 | conn    |-----------------------------------------------------------+
-----+----------                                                           |
key2 | conn    |----------------------------------+                        |
-----+----------                                  |                        |
key3 | conn    |---------+                        |                        |
-----+----------         |                        |                        |
                          |                        |                        |
                          |                        |                        |
     + CompareState ++    |                        |                        |
     |               |    V                        V                        V
     +---------------+   +---------------+         +---------------+
     |conn_list      +--->conn           +--------->conn           |     connx
     +---------------+   +---------------+         +---------------+
     |               |     |           |             |          |
     +---------------+ +---v----+  +---v----+    +---v----+ +---v----+
                       |primary |  |secondary    |primary | |secondary
                       |packet  |  |packet  +    |packet  | |packet  +
                       +--------+  +--------+    +--------+ +--------+
                           |           |             |          |
                       +---v----+  +---v----+    +---v----+ +---v----+
                       |primary |  |secondary    |primary | |secondary
                       |packet  |  |packet  +    |packet  | |packet  +
                       +--------+  +--------+    +--------+ +--------+
                           |           |             |          |
                       +---v----+  +---v----+    +---v----+ +---v----+
                       |primary |  |secondary    |primary | |secondary
                       |packet  |  |packet  +    |packet  | |packet  +
                       +--------+  +--------+    +--------+ +--------+
  
I recalled that we should above relationships between connection_track_table 
conn_list and conn.
That means both connection_track_table and conn_list reference to the same conn 
instance.


So before this patch, connection_get() is possible to use-after-free/double 
free conn. where 1st was in
connection_hashtable_reset() and 2nd was
221             while (!g_queue_is_empty(conn_list)) {
222                 connection_destroy(g_queue_pop_head(conn_list));
223             }

I also doubt that your current abort was just due to above 
use-after-free/double free.
If so, looks it's enough we just update to g_queue_clear(conn_list) in the 2nd 
place.

Thanks
Zhijian


On 28/03/2022 17:13, Zhang, Chen wrote:
>
>> -----Original Message-----
>> From: lizhij...@fujitsu.com <lizhij...@fujitsu.com>
>> Sent: Monday, March 21, 2022 11:06 AM
>> To: Zhang, Chen <chen.zh...@intel.com>; Jason Wang
>> <jasow...@redhat.com>; lizhij...@fujitsu.com
>> Cc: qemu-dev <qemu-devel@nongnu.org>; Like Xu <like...@linux.intel.com>
>> Subject: Re: [PATCH 2/4] net/colo: Fix a "double free" crash to clear the
>> conn_list
>>
>>
>>
>> On 09/03/2022 16:38, Zhang Chen wrote:
>>> We notice the QEMU may crash when the guest has too many incoming
>>> network connections with the following log:
>>>
>>> 15197@1593578622.668573:colo_proxy_main : colo proxy connection
>>> hashtable full, clear it
>>> free(): invalid pointer
>>> [1]    15195 abort (core dumped)  qemu-system-x86_64 ....
>>>
>>> This is because we create the s->connection_track_table with
>>> g_hash_table_new_full() which is defined as:
>>>
>>> GHashTable * g_hash_table_new_full (GHashFunc hash_func,
>>>                          GEqualFunc key_equal_func,
>>>                          GDestroyNotify key_destroy_func,
>>>                          GDestroyNotify value_destroy_func);
>>>
>>> The fourth parameter connection_destroy() will be called to free the
>>> memory allocated for all 'Connection' values in the hashtable when we
>>> call g_hash_table_remove_all() in the connection_hashtable_reset().
>>>
>>> It's unnecessary because we clear the conn_list explicitly later, and
>>> it's buggy when other agents try to call connection_get() with the
>>> same connection_track_table.
>>>
>>> Signed-off-by: Like Xu <like...@linux.intel.com>
>>> Signed-off-by: Zhang Chen <chen.zh...@intel.com>
>>> ---
>>>    net/colo-compare.c    | 2 +-
>>>    net/filter-rewriter.c | 2 +-
>>>    2 files changed, 2 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/net/colo-compare.c b/net/colo-compare.c index
>>> 62554b5b3c..ab054cfd21 100644
>>> --- a/net/colo-compare.c
>>> +++ b/net/colo-compare.c
>>> @@ -1324,7 +1324,7 @@ static void
>> colo_compare_complete(UserCreatable *uc, Error **errp)
>>>        s->connection_track_table =
>> g_hash_table_new_full(connection_key_hash,
>>>                                                          
>>> connection_key_equal,
>>>                                                          g_free,
>>> -                                                      connection_destroy);
>>> +                                                      NULL);
>>
>> 202 /* if not found, create a new connection and add to hash table */
>> 203 Connection *connection_get(GHashTable *connection_track_table,
>> 204                            ConnectionKey *key,
>> 205                            GQueue *conn_list)
>> 206 {
>> 207     Connection *conn = g_hash_table_lookup(connection_track_table,
>> key);
>> 208
>> 209     if (conn == NULL) {
>> 210         ConnectionKey *new_key = g_memdup(key, sizeof(*key));
>> 211
>> 212         conn = connection_new(key);
>> 213
>> 214         if (g_hash_table_size(connection_track_table) >
>> HASHTABLE_MAX_SIZE) {
>> 215             trace_colo_proxy_main("colo proxy connection hashtable full,"
>> 216                                   " clear it");
>> 217 connection_hashtable_reset(connection_track_table);
>>
>> 197 void connection_hashtable_reset(GHashTable *connection_track_table)
>> 198 {
>> 199 g_hash_table_remove_all(connection_track_table);
>> 200 }
>>
>> IIUC,  above subroutine will do some cleanup explicitly. And before your
>> patch, connection_hashtable_reset() will release all keys and their values in
>> this hashtable. But now, you remove all keys and just one value(conn_list)
>> instead. Does it means other values will be leaked ?
> Thanks Zhijian. Re-think about it. Yes, I think you are right.
> It looks need a lock to avoid into connection_get() when triggered the clear 
> hashtable operation.
> What do you think?
>
> Thanks
> Chen
>
>
>>
>> 218 /*
>> 219              * clear the conn_list
>> 220 */
>> 221             while (!g_queue_is_empty(conn_list)) {
>> 222 connection_destroy(g_queue_pop_head(conn_list));
>> 223 }
>> 224 }
>> 225
>> 226         g_hash_table_insert(connection_track_table, new_key, conn);
>> 227 }
>> 228
>> 229     return conn;
>> 230 }
>>
>>
>> Thanks
>> Zhijian
>>
>>>        colo_compare_iothread(s);
>>>
>>> diff --git a/net/filter-rewriter.c b/net/filter-rewriter.c index
>>> bf05023dc3..c18c4c2019 100644
>>> --- a/net/filter-rewriter.c
>>> +++ b/net/filter-rewriter.c
>>> @@ -383,7 +383,7 @@ static void colo_rewriter_setup(NetFilterState *nf,
>> Error **errp)
>>>        s->connection_track_table =
>> g_hash_table_new_full(connection_key_hash,
>>>                                                          
>>> connection_key_equal,
>>>                                                          g_free,
>>> -                                                      connection_destroy);
>>> +                                                      NULL);
>>>        s->incoming_queue =
>> qemu_new_net_queue(qemu_netfilter_pass_to_next, nf);
>>>    }
>>>

Re: [PATCH 2/4] net/colo: Fix a "double free" crash to clear the conn_list

Reply via email to