On Mon, 2018-02-12 at 14:15 -0800, Eric Dumazet wrote:
> On Mon, 2018-02-12 at 13:58 -0800, Yonghong Song wrote:
> > There is a memory leak happening in lpm_trie map_free callback
> > function trie_free. The trie structure itself does not get freed.
> >
> > Also, trie_free function did not do synchronize_rcu before freeing
> > various data structures. This is incorrect as some rcu_read_lock
> > region(s) for lookup, update, delete or get_next_key may not complete yet.
> > The fix is to add synchronize_rcu in the beginning of trie_free.
> > The useless spin_lock is removed from this function as well.
> >
> > Fixes: b95a5c4db09b ("bpf: add a longest prefix match trie map
> > implementation")
> > Reported-by: Mathieu Malaterre <ma...@debian.org>
> > Reported-by: Alexei Starovoitov <a...@kernel.org>
> > Tested-by: Mathieu Malaterre <ma...@debian.org>
> > Signed-off-by: Yonghong Song <y...@fb.com>
> > ---
> >  kernel/bpf/lpm_trie.c | 9 +++++++--
> >  1 file changed, 7 insertions(+), 2 deletions(-)
> >
> > diff --git a/kernel/bpf/lpm_trie.c b/kernel/bpf/lpm_trie.c
> > index 7b469d1..9b41ea4 100644
> > --- a/kernel/bpf/lpm_trie.c
> > +++ b/kernel/bpf/lpm_trie.c
> > @@ -555,7 +555,12 @@ static void trie_free(struct bpf_map *map)
> >      struct lpm_trie_node __rcu **slot;
> >      struct lpm_trie_node *node;
> >
> > -    raw_spin_lock(&trie->lock);
> > +    /* at this point bpf_prog->aux->refcnt == 0 and this map->refcnt == 0,
> > +     * so the programs (can be more than one that used this map) were
> > +     * disconnected from events. Wait for outstanding programs to complete
> > +     * update/lookup/delete/get_next_key and free the trie.
> > +     */
> > +    synchronize_rcu();
> >
>
> Please do not do that.
>
> Use kfree_rcu() instead (adding one struct rcu_head in struct lpm_trie)
Oh well, I take this back. It looks like we already use synchronize_rcu() heavily all over the place for ->map_free().
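
For readers comparing the two approaches discussed above, here is a minimal userspace sketch of the deferred-free pattern that kfree_rcu() provides: an rcu_head is embedded in the object, the free is queued, and the memory is only released once a grace period has elapsed. This is not kernel code and not the patch under review. The struct rcu_head here is a local stand-in, and mock_call_rcu(), mock_grace_period_elapsed(), struct lpm_trie_mock, trie_free_cb() and trie_free_deferred() are all hypothetical names invented for this illustration.

```c
#include <stddef.h>
#include <stdlib.h>

/* Userspace stand-in for the kernel's struct rcu_head (mock, not the real one). */
struct rcu_head {
    struct rcu_head *next;
    void (*func)(struct rcu_head *head);
};

static struct rcu_head *pending; /* callbacks waiting for a grace period */
static int freed_count;          /* number of objects actually released */

/* Mock of a call_rcu()-style API: queue the callback, do not run it yet. */
static void mock_call_rcu(struct rcu_head *head,
                          void (*func)(struct rcu_head *head))
{
    head->func = func;
    head->next = pending;
    pending = head;
}

/* Mock "grace period has elapsed": no readers remain, run queued callbacks. */
static void mock_grace_period_elapsed(void)
{
    while (pending) {
        struct rcu_head *head = pending;

        pending = head->next;
        head->func(head);
    }
}

/* Hypothetical trie object carrying the extra rcu_head member. */
struct lpm_trie_mock {
    int n_entries;
    struct rcu_head rcu;
};

/* Callback: recover the enclosing object from its rcu_head and free it. */
static void trie_free_cb(struct rcu_head *head)
{
    struct lpm_trie_mock *trie = (struct lpm_trie_mock *)
        ((char *)head - offsetof(struct lpm_trie_mock, rcu));

    free(trie);
    freed_count++;
}

/* Non-blocking free: returns immediately, the memory is released later. */
static void trie_free_deferred(struct lpm_trie_mock *trie)
{
    mock_call_rcu(&trie->rcu, trie_free_cb);
}
```

The trade-off in the sketch mirrors the one in the thread: the deferred variant returns without waiting but costs one extra rcu_head per object, whereas a synchronize_rcu() call blocks the caller until the grace period ends and needs no extra member.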