| From: Oscar Salvador <osalvador@suse.de> |
| Subject: mm, hugetlb: avoid passing a null nodemask when there is mbind policy |
| Date: Tue, 15 Apr 2025 14:15:03 +0200 |
| |
| Before trying to allocate a page, gather_surplus_pages() sets up a |
| nodemask for the nodes we can allocate from, but instead of passing that |
| nodemask down to the page allocator, it iterates over the nodes in the |
| nodemask itself, so the page allocator receives a preferred_nid and a |
| NULL nodemask. |
| |
| This is a problem when a memory policy is in use, because the page |
| allocator may fall back to a node that is not part of the policy. |
| |
| Avoid that by passing the nodemask directly to the page allocator, so it |
| can filter out fallback nodes that are not part of the nodemask. |
| |
| Link: https://lkml.kernel.org/r/20250415121503.376811-1-osalvador@suse.de |
| Signed-off-by: Oscar Salvador <osalvador@suse.de> |
| Reviewed-by: Vlastimil Babka <vbabka@suse.cz> |
| Cc: David Hildenbrand <david@redhat.com> |
| Cc: Muchun Song <muchun.song@linux.dev> |
| Signed-off-by: Andrew Morton <akpm@linux-foundation.org> |
| --- |
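
Note (not part of the changelog): the difference between the old and new
call patterns can be sketched with a small userspace model. This is only
an illustration under stated assumptions. Every toy_* name, the four-node
layout and the fallback table are hypothetical and do not exist in the
kernel; the real allocator walks zonelists rather than a flat table.
toy_alloc() stands in for the page allocator: it tries nodes in the
fallback order of the preferred node and skips nodes outside the nodemask
only when a nodemask is supplied.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NR_NODES 4
#define TOY_NO_NODE (-1)

/* Hypothetical model of a NUMA system, not a kernel structure. */
struct toy_system {
	/* Non-zero if the node still has a free huge page. */
	int has_free[TOY_NR_NODES];
	/* fallback[n] is the order in which nodes are tried when n is preferred. */
	int fallback[TOY_NR_NODES][TOY_NR_NODES];
};

/*
 * Stand-in for the page allocator: walk the fallback order of the
 * preferred node and return the first node with a free page. Nodes
 * outside *nodemask are skipped only if nodemask is not NULL.
 */
static int toy_alloc(const struct toy_system *sys, int preferred_nid,
		     const unsigned int *nodemask)
{
	assert(preferred_nid >= 0 && preferred_nid < TOY_NR_NODES);

	for (int i = 0; i < TOY_NR_NODES; i++) {
		int nid = sys->fallback[preferred_nid][i];

		if (nodemask && !(*nodemask & (1u << nid)))
			continue;
		if (sys->has_free[nid])
			return nid;
	}
	return TOY_NO_NODE;
}

/* Old pattern: the caller iterates the allowed nodes and passes a NULL nodemask. */
static int toy_old_gather(const struct toy_system *sys, int local_nid,
			  unsigned int allowed)
{
	int got = TOY_NO_NODE;

	/* Prioritize the local node if the policy allows it. */
	if (allowed & (1u << local_nid))
		got = toy_alloc(sys, local_nid, NULL);

	for (int nid = 0; got == TOY_NO_NODE && nid < TOY_NR_NODES; nid++) {
		if (nid == local_nid || !(allowed & (1u << nid)))
			continue;
		got = toy_alloc(sys, nid, NULL);
	}
	return got;
}

/* New pattern: a single call that hands the nodemask to the allocator. */
static int toy_new_gather(const struct toy_system *sys, int local_nid,
			  unsigned int allowed)
{
	return toy_alloc(sys, local_nid, &allowed);
}

/* Example system: nodes 0 and 1 are exhausted, nodes 2 and 3 have free pages. */
static struct toy_system toy_example(void)
{
	struct toy_system sys = {
		.has_free = { 0, 0, 1, 1 },
		.fallback = {
			{ 0, 1, 2, 3 },
			{ 1, 0, 2, 3 },
			{ 2, 3, 0, 1 },
			{ 3, 2, 0, 1 },
		},
	};
	return sys;
}
```

In this model, with the local node being 0 and a policy that allows only
nodes 0 and 1 (mask 0x3), toy_old_gather() returns node 2, which is
outside the policy, while toy_new_gather() returns TOY_NO_NODE and lets
the caller handle the failure.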
| |
| mm/hugetlb.c | 22 ++++++---------------- |
| 1 file changed, 6 insertions(+), 16 deletions(-) |
| |
| --- a/mm/hugetlb.c~mm-hugetlb-avoid-passing-a-null-nodemask-when-there-is-mbind-policy |
| +++ a/mm/hugetlb.c |
| @@ -2419,7 +2419,6 @@ static int gather_surplus_pages(struct h |
|  	long i; |
|  	long needed, allocated; |
|  	bool alloc_ok = true; |
| -	int node; |
|  	nodemask_t *mbind_nodemask, alloc_nodemask; |
| |
|  	mbind_nodemask = policy_mbind_nodemask(htlb_alloc_mask(h)); |
| @@ -2443,21 +2442,12 @@ retry: |
|  	for (i = 0; i < needed; i++) { |
|  		folio = NULL; |
| |
| -		/* Prioritize current node */ |
| -		if (node_isset(numa_mem_id(), alloc_nodemask)) |
| -			folio = alloc_surplus_hugetlb_folio(h, htlb_alloc_mask(h), |
| -					numa_mem_id(), NULL); |
| - |
| -		if (!folio) { |
| -			for_each_node_mask(node, alloc_nodemask) { |
| -				if (node == numa_mem_id()) |
| -					continue; |
| -				folio = alloc_surplus_hugetlb_folio(h, htlb_alloc_mask(h), |
| -						node, NULL); |
| -				if (folio) |
| -					break; |
| -			} |
| -		} |
| +		/* |
| +		 * It is okay to use NUMA_NO_NODE because we use numa_mem_id() |
| +		 * down the road to pick the current node if that is the case. |
| +		 */ |
| +		folio = alloc_surplus_hugetlb_folio(h, htlb_alloc_mask(h), |
| +						    NUMA_NO_NODE, &alloc_nodemask); |
|  		if (!folio) { |
|  			alloc_ok = false; |
|  			break; |
| _ |