| .\" Copyright 2003,2004 Andi Kleen, SuSE Labs. |
| .\" and Copyright 2007 Lee Schermerhorn, Hewlett Packard |
| .\" |
| .\" %%%LICENSE_START(VERBATIM_PROF) |
| .\" Permission is granted to make and distribute verbatim copies of this |
| .\" manual provided the copyright notice and this permission notice are |
| .\" preserved on all copies. |
| .\" |
| .\" Permission is granted to copy and distribute modified versions of this |
| .\" manual under the conditions for verbatim copying, provided that the |
| .\" entire resulting derived work is distributed under the terms of a |
| .\" permission notice identical to this one. |
| .\" |
| .\" Since the Linux kernel and libraries are constantly changing, this |
| .\" manual page may be incorrect or out-of-date. The author(s) assume no |
| .\" responsibility for errors or omissions, or for damages resulting from |
| .\" the use of the information contained herein. |
| .\" |
| .\" Formatted or processed versions of this manual, if unaccompanied by |
| .\" the source, must acknowledge the copyright and authors of this work. |
| .\" %%%LICENSE_END |
| .\" |
| .\" 2006-02-03, mtk, substantial wording changes and other improvements |
| .\" 2007-08-27, Lee Schermerhorn <Lee.Schermerhorn@hp.com> |
| .\" more precise specification of behavior. |
| .\" |
| .TH SET_MEMPOLICY 2 2015-05-07 Linux "Linux Programmer's Manual" |
| .SH NAME |
| set_mempolicy \- set default NUMA memory policy for a thread and its children |
| .SH SYNOPSIS |
| .nf |
| .B "#include <numaif.h>" |
| .sp |
| .BI "long set_mempolicy(int " mode ", const unsigned long *" nodemask , |
| .BI " unsigned long " maxnode ); |
| .sp |
| Link with \fI\-lnuma\fP. |
| .fi |
| .SH DESCRIPTION |
| .BR set_mempolicy () |
| sets the NUMA memory policy of the calling thread, |
| which consists of a policy mode and zero or more nodes, |
| to the values specified by the |
| .IR mode , |
| .I nodemask |
| and |
| .I maxnode |
| arguments. |
| |
| A NUMA machine has different |
| memory controllers with different distances to specific CPUs. |
| The memory policy defines from which node memory is allocated for |
| the thread. |
| |
| This system call defines the default policy for the thread. |
| The thread policy governs allocation of pages in the process's |
| address space outside of memory ranges |
| controlled by a more specific policy set by |
| .BR mbind (2). |
| The thread default policy also controls allocation of any pages for |
| memory-mapped files mapped using the |
| .BR mmap (2) |
| call with the |
| .B MAP_PRIVATE |
| flag and that are only read [loaded] from by the thread |
| and of memory-mapped files mapped using the |
| .BR mmap (2) |
| call with the |
| .B MAP_SHARED |
| flag, regardless of the access type. |
| The policy is applied only when a new page is allocated |
| for the thread. |
| For anonymous memory this is when the page is first |
| touched by the thread. |
| |
| The |
| .I mode |
| argument must specify one of |
| .BR MPOL_DEFAULT , |
| .BR MPOL_BIND , |
| .BR MPOL_INTERLEAVE , |
| or |
| .BR MPOL_PREFERRED . |
| All modes except |
| .B MPOL_DEFAULT |
| require the caller to specify via the |
| .I nodemask |
| argument one or more nodes. |
| |
| The |
| .I mode |
| argument may also include an optional |
| .IR "mode flag" . |
| The supported |
| .I "mode flags" |
| are: |
| .TP |
| .BR MPOL_F_STATIC_NODES " (since Linux 2.6.26)" |
| A nonempty |
| .I nodemask |
| specifies physical node ids. |
| Linux will not remap the |
| .I nodemask |
| when the process moves to a different cpuset context, |
| nor when the set of nodes allowed by the process's |
| current cpuset context changes. |
| .TP |
| .BR MPOL_F_RELATIVE_NODES " (since Linux 2.6.26)" |
| A nonempty |
| .I nodemask |
| specifies node ids that are relative to the set of |
| node ids allowed by the process's current cpuset. |
| .PP |
| .I nodemask |
| points to a bit mask of node IDs that contains up to |
| .I maxnode |
| bits. |
| The bit mask size is rounded to the next multiple of |
| .IR "sizeof(unsigned long)" , |
| but the kernel will use bits only up to |
| .IR maxnode . |
| A NULL value of |
| .I nodemask |
| or a |
| .I maxnode |
| value of zero specifies the empty set of nodes. |
| If the value of |
| .I maxnode |
| is zero, |
| the |
| .I nodemask |
| argument is ignored. |
| |
| Where a |
| .I nodemask |
| is required, it must contain at least one node that is on-line, |
| allowed by the process's current cpuset context, |
| [unless the |
| .B MPOL_F_STATIC_NODES |
| mode flag is specified], |
| and contains memory. |
| If the |
| .B MPOL_F_STATIC_NODES |
| is set in |
| .I mode |
| and a required |
| .I nodemask |
| contains no nodes that are allowed by the process's current cpuset context, |
| the memory policy reverts to |
| .IR "local allocation" . |
| This effectively overrides the specified policy until the process's |
| cpuset context includes one or more of the nodes specified by |
| .IR nodemask . |
| |
| The |
| .B MPOL_DEFAULT |
| mode specifies that any nondefault thread memory policy be removed, |
| so that the memory policy "falls back" to the system default policy. |
| The system default policy is "local allocation"\(emthat is, |
| allocate memory on the node of the CPU that triggered the allocation. |
| .I nodemask |
| must be specified as NULL. |
| If the "local node" contains no free memory, the system will |
| attempt to allocate memory from a "near by" node. |
| |
| The |
| .B MPOL_BIND |
| mode defines a strict policy that restricts memory allocation to the |
| nodes specified in |
| .IR nodemask . |
| If |
| .I nodemask |
| specifies more than one node, page allocations will come from |
| the node with the lowest numeric node ID first, until that node |
| contains no free memory. |
| Allocations will then come from the node with the next highest |
| node ID specified in |
| .I nodemask |
| and so forth, until none of the specified nodes contain free memory. |
| Pages will not be allocated from any node not specified in the |
| .IR nodemask . |
| |
| .B MPOL_INTERLEAVE |
| interleaves page allocations across the nodes specified in |
| .I nodemask |
| in numeric node ID order. |
| This optimizes for bandwidth instead of latency |
| by spreading out pages and memory accesses to those pages across |
| multiple nodes. |
| However, accesses to a single page will still be limited to |
| the memory bandwidth of a single node. |
| .\" NOTE: the following sentence doesn't make sense in the context |
| .\" of set_mempolicy() -- no memory area specified. |
| .\" To be effective the memory area should be fairly large, |
| .\" at least 1MB or bigger. |
| |
| .B MPOL_PREFERRED |
| sets the preferred node for allocation. |
| The kernel will try to allocate pages from this node first |
| and fall back to "near by" nodes if the preferred node is low on free |
| memory. |
| If |
| .I nodemask |
| specifies more than one node ID, the first node in the |
| mask will be selected as the preferred node. |
| If the |
| .I nodemask |
| and |
| .I maxnode |
| arguments specify the empty set, then the policy |
| specifies "local allocation" |
| (like the system default policy discussed above). |
| |
| The thread memory policy is preserved across an |
| .BR execve (2), |
| and is inherited by child threads created using |
| .BR fork (2) |
| or |
| .BR clone (2). |
| .SH RETURN VALUE |
| On success, |
| .BR set_mempolicy () |
| returns 0; |
| on error, \-1 is returned and |
| .I errno |
| is set to indicate the error. |
| .SH ERRORS |
| .TP |
| .B EFAULT |
| Part of all of the memory range specified by |
| .I nodemask |
| and |
| .I maxnode |
| points outside your accessible address space. |
| .TP |
| .B EINVAL |
| .I mode |
| is invalid. |
| Or, |
| .I mode |
| is |
| .B MPOL_DEFAULT |
| and |
| .I nodemask |
| is nonempty, |
| or |
| .I mode |
| is |
| .B MPOL_BIND |
| or |
| .B MPOL_INTERLEAVE |
| and |
| .I nodemask |
| is empty. |
| Or, |
| .I maxnode |
| specifies more than a page worth of bits. |
| Or, |
| .I nodemask |
| specifies one or more node IDs that are |
| greater than the maximum supported node ID. |
| Or, none of the node IDs specified by |
| .I nodemask |
| are on-line and allowed by the process's current cpuset context, |
| or none of the specified nodes contain memory. |
| Or, the |
| .I mode |
| argument specified both |
| .B MPOL_F_STATIC_NODES |
| and |
| .BR MPOL_F_RELATIVE_NODES . |
| .TP |
| .B ENOMEM |
| Insufficient kernel memory was available. |
| .SH VERSIONS |
| The |
| .BR set_mempolicy (), |
| system call was added to the Linux kernel in version 2.6.7. |
| .SH CONFORMING TO |
| This system call is Linux-specific. |
| .SH NOTES |
| Memory policy is not remembered if the page is swapped out. |
| When such a page is paged back in, it will use the policy of |
| the thread or memory range that is in effect at the time the |
| page is allocated. |
| |
| For information on library support, see |
| .BR numa (7). |
| .SH SEE ALSO |
| .BR get_mempolicy (2), |
| .BR getcpu (2), |
| .BR mbind (2), |
| .BR mmap (2), |
| .BR numa (3), |
| .BR cpuset (7), |
| .BR numa (7), |
| .BR numactl (8) |