| From foo@baz Fri Mar 9 14:15:30 PST 2018 |
| From: Daniel Borkmann <daniel@iogearbox.net> |
| Date: Thu, 8 Mar 2018 13:16:43 +0100 |
| Subject: bpf: fix memory leak in lpm_trie map_free callback function |
| To: gregkh@linuxfoundation.org |
| Cc: ast@kernel.org, daniel@iogearbox.net, stable@vger.kernel.org, Yonghong Song <yhs@fb.com> |
| Message-ID: <92a9bac0950f4c6def7378bc548eeaea6518b4da.1520507630.git.daniel@iogearbox.net> |
| |
| From: Yonghong Song <yhs@fb.com> |
| |
| [ upstream commit 9a3efb6b661f71d5675369ace9257833f0e78ef3 ] |
| |
| There is a memory leak in the lpm_trie map_free callback function, |
| trie_free(): the trie structure itself is never freed. |
| |
| Also, trie_free() did not call synchronize_rcu() before freeing |
| various data structures. This is incorrect, as some rcu_read_lock |
| region(s) for lookup, update, delete or get_next_key may not have |
| completed yet. The fix is to add synchronize_rcu() at the beginning |
| of trie_free(). |
| The useless spin_lock is removed from this function as well. |
| |
| Fixes: b95a5c4db09b ("bpf: add a longest prefix match trie map implementation") |
| Reported-by: Mathieu Malaterre <malat@debian.org> |
| Reported-by: Alexei Starovoitov <ast@kernel.org> |
| Tested-by: Mathieu Malaterre <malat@debian.org> |
| Signed-off-by: Yonghong Song <yhs@fb.com> |
| Signed-off-by: Alexei Starovoitov <ast@kernel.org> |
| Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> |
| Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
| --- |
| kernel/bpf/lpm_trie.c | 11 +++++++---- |
| 1 file changed, 7 insertions(+), 4 deletions(-) |
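
Reviewer note (not part of the commit): the walk-and-free pattern that trie_free() uses, plus the final free of the container that this patch adds, can be illustrated with a minimal userspace sketch. Everything in it is a hypothetical stand-in: `struct node`, `struct trie`, `node_new()`, `trie_free_sketch()` and the `nodes_freed` counter are not kernel names, and malloc/free replace the kernel allocators. The synchronize_rcu() call has no userspace equivalent here and appears only as a comment.

```c
#include <stdlib.h>

/* Hypothetical userspace stand-ins for the kernel structures. */
struct node {
	struct node *child[2];
};

struct trie {
	struct node *root;
};

/* Illustration only: counts how many nodes were freed. */
static int nodes_freed;

static struct node *node_new(void)
{
	return calloc(1, sizeof(struct node));
}

static void trie_free_sketch(struct trie *t)
{
	struct node **slot;
	struct node *n;

	/* In the kernel, synchronize_rcu() would be called here so that
	 * readers still inside an RCU read-side section can finish.
	 */

	for (;;) {
		/* Always restart at the root and walk down to a leaf. */
		slot = &t->root;

		for (;;) {
			n = *slot;
			if (!n)
				goto out;	/* trie is empty */

			if (n->child[0]) {
				slot = &n->child[0];
				continue;
			}
			if (n->child[1]) {
				slot = &n->child[1];
				continue;
			}

			/* Leaf: free it and clear the parent's reference. */
			free(n);
			*slot = NULL;
			nodes_freed++;
			break;
		}
	}

out:
	/* The step the original code was missing: free the container. */
	free(t);
}
```

Without the final free(t), every node would be released but the `struct trie` allocation itself would leak, which is the bug described in the commit message.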
| |
| --- a/kernel/bpf/lpm_trie.c |
| +++ b/kernel/bpf/lpm_trie.c |
| @@ -560,7 +560,10 @@ static void trie_free(struct bpf_map *ma |
| struct lpm_trie_node __rcu **slot; |
| struct lpm_trie_node *node; |
| |
| - raw_spin_lock(&trie->lock); |
| + /* Wait for outstanding programs to complete |
| + * update/lookup/delete/get_next_key and free the trie. |
| + */ |
| + synchronize_rcu(); |
| |
| /* Always start at the root and walk down to a node that has no |
| * children. Then free that node, nullify its reference in the parent |
| @@ -574,7 +577,7 @@ static void trie_free(struct bpf_map *ma |
| node = rcu_dereference_protected(*slot, |
| lockdep_is_held(&trie->lock)); |
| if (!node) |
| - goto unlock; |
| + goto out; |
| |
| if (rcu_access_pointer(node->child[0])) { |
| slot = &node->child[0]; |
| @@ -592,8 +595,8 @@ static void trie_free(struct bpf_map *ma |
| } |
| } |
| |
| -unlock: |
| - raw_spin_unlock(&trie->lock); |
| +out: |
| + kfree(trie); |
| } |
| |
| static int trie_get_next_key(struct bpf_map *map, void *key, void *next_key) |