| From c9160be4ea528ca36d7cbdc86664919e691fd87d Mon Sep 17 00:00:00 2001 |
| From: Rafael Aquini <aquini@linux.com> |
| Date: Wed, 15 Jun 2011 15:08:39 -0700 |
| Subject: [PATCH] mm: fix negative commitlimit when gigantic hugepages are |
| allocated |
| |
| commit b0320c7b7d1ac1bd5c2d9dff3258524ab39bad32 upstream. |
| |
| When 1GB hugepages are allocated on a system, free(1) reports less |
| available memory than is actually installed in the box. Also, if the |
| total size of hugepages allocated on a system is over half of the total |
| memory size, CommitLimit becomes a negative number. |
| |
| The problem is that gigantic hugepages (order >= MAX_ORDER) can only be |
| allocated at boot with bootmem, thus their frames are not accounted to |
| 'totalram_pages'. However, they are accounted to hugetlb_total_pages(). |
| |
| What happens to turn CommitLimit into a negative number is this |
| calculation, in fs/proc/meminfo.c: |
| |
| allowed = ((totalram_pages - hugetlb_total_pages()) |
| * sysctl_overcommit_ratio / 100) + total_swap_pages; |
| |
| A similar calculation occurs in __vm_enough_memory() in mm/mmap.c. |
| |
| Also, every vm statistic which depends on 'totalram_pages' will render |
| confusing values, as if the system were 'missing' some part of its memory. |
| |
| Impact of this bug: |
| |
| The bug bites when gigantic hugepages are allocated and |
| sysctl_overcommit_memory == OVERCOMMIT_NEVER. In such a situation, |
| __vm_enough_memory() goes through the mentioned 'allowed' calculation and |
| might end up mistakenly returning -ENOMEM, thus forcing the system to |
| start reclaiming pages earlier than usual, which could hurt overall system |
| performance, depending on the workload. |
| |
| Besides the aforementioned scenario, I can only think of this causing |
| annoyances with memory reports from /proc/meminfo and free(1). |
| |
| [akpm@linux-foundation.org: standardize comment layout] |
| Reported-by: Russ Anderson <rja@sgi.com> |
| Signed-off-by: Rafael Aquini <aquini@linux.com> |
| Acked-by: Russ Anderson <rja@sgi.com> |
| Cc: Andrea Arcangeli <aarcange@redhat.com> |
| Cc: Christoph Lameter <cl@linux.com> |
| Signed-off-by: Andrew Morton <akpm@linux-foundation.org> |
| Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
| |
| Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> |
| --- |
| mm/hugetlb.c | 8 ++++++++ |
| 1 file changed, 8 insertions(+) |
| |
| diff --git a/mm/hugetlb.c b/mm/hugetlb.c |
| index 2583bbe..ca9ce49 100644 |
| --- a/mm/hugetlb.c |
| +++ b/mm/hugetlb.c |
| @@ -1105,6 +1105,14 @@ static void __init gather_bootmem_prealloc(void) |
| WARN_ON(page_count(page) != 1); |
| prep_compound_huge_page(page, h->order); |
| prep_new_huge_page(h, page, page_to_nid(page)); |
| + /* |
| + * If we had gigantic hugepages allocated at boot time, we need |
| + * to restore the 'stolen' pages to totalram_pages in order to |
| + * fix confusing memory reports from free(1) and another |
| + * side-effects, like CommitLimit going negative. |
| + */ |
| + if (h->order > (MAX_ORDER - 1)) |
| + totalram_pages += 1 << h->order; |
| } |
| } |
| |
| -- |
| 1.7.9.6 |
| |