| From akpm@linux-foundation.org Thu Oct 1 15:24:58 2009 |
| From: Lee Schermerhorn <Lee.Schermerhorn@hp.com> |
| Date: Mon, 21 Sep 2009 17:01:04 -0700 |
| Subject: hugetlb: restore interleaving of bootmem huge pages (2.6.31) |
| To: torvalds@linux-foundation.org |
| Cc: Lee.Schermerhorn@hp.com, lee.schermerhorn@hp.com, ak@linux.intel.com, eric.whitney@hp.com, mel@csn.ul.ie, rientjes@google.com, agl@us.ibm.com, apw@canonical.com, akpm@linux-foundation.org, stable@kernel.org |
| Message-ID: <200909220001.n8M014vN026389@imap1.linux-foundation.org> |
| |
| |
| From: Lee Schermerhorn <Lee.Schermerhorn@hp.com> |
| |
| Not upstream as it is fixed differently in .32 |
| |
| I noticed that alloc_bootmem_huge_page() will only advance to the next |
| node on failure to allocate a huge page. I asked about this on linux-mm |
| and linux-numa, cc'ing the usual huge page suspects. Mel Gorman |
| responded: |
| |
| I strongly suspect that the same node being used until allocation |
| failure instead of round-robin is an oversight and not deliberate |
| at all. It appears to be a side-effect of a fix made way back in |
| commit 63b4613c3f0d4b724ba259dc6c201bb68b884e1a ["hugetlb: fix |
| hugepage allocation with memoryless nodes"]. Prior to that patch |
| it looked like allocations would always round-robin even when |
| allocation was successful. |
| |
| Andy Whitcroft countered that the existing behavior looked like Andi |
| Kleen's original implementation and suggested that we ask him. We did and |
| Andy replied that his intention was to interleave the allocations. So, |
| ... |
| |
| This patch moves the advance of the hstate next node from which to |
| allocate up before the test for success of the attempted allocation. This |
| will unconditionally advance the next node from which to alloc, |
| interleaving successful allocations over the nodes with sufficient |
| contiguous memory, and skipping over nodes that fail the huge page |
| allocation attempt. |
| |
| Note that alloc_bootmem_huge_page() will only be called for huge pages of |
| order > MAX_ORDER. |
| |
| Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com> |
| Reviewed-by: Andi Kleen <ak@linux.intel.com> |
| Cc: Mel Gorman <mel@csn.ul.ie> |
| Cc: David Rientjes <rientjes@google.com> |
| Cc: Adam Litke <agl@us.ibm.com> |
| Cc: Andy Whitcroft <apw@canonical.com> |
| Cc: Eric Whitney <eric.whitney@hp.com> |
| Signed-off-by: Andrew Morton <akpm@linux-foundation.org> |
| Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> |
| |
| --- |
| mm/hugetlb.c | 2 +- |
| 1 file changed, 1 insertion(+), 1 deletion(-) |
| |
| --- a/mm/hugetlb.c |
| +++ b/mm/hugetlb.c |
| @@ -983,6 +983,7 @@ __attribute__((weak)) int alloc_bootmem_ |
| NODE_DATA(h->hugetlb_next_nid), |
| huge_page_size(h), huge_page_size(h), 0); |
| |
| + hstate_next_node(h); |
| if (addr) { |
| /* |
| * Use the beginning of the huge page to store the |
| @@ -993,7 +994,6 @@ __attribute__((weak)) int alloc_bootmem_ |
| if (m) |
| goto found; |
| } |
| - hstate_next_node(h); |
| nr_nodes--; |
| } |
| return 0; |