Handle huge pages in ibv_fork_init() and madvise tracking

When fork support is enabled in libibverbs, madvise() is called for
every memory page that is registered as a memory region.  Memory
ranges that are passed to madvise() must be page aligned and the size
must be a multiple of the page size.

libibverbs uses sysconf(_SC_PAGESIZE) to find out the system page size
and rounds all ranges passed to reg_mr() according to this page size.
When memory from libhugetlbfs is passed to reg_mr(), this does not
work as the page size for this memory range might be different
(e.g. 16MB).  So libibverbs would have to use the huge page size to
calculate a page aligned range for madvise.
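
As an illustration of the rounding described above, here is a minimal
sketch and not the actual libibverbs code.  The names align_range and
struct aligned_range are hypothetical, and page_size is assumed to be
a power of two.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper type, not part of libibverbs. */
struct aligned_range {
    uintptr_t start;   /* rounded down to a page boundary */
    size_t    length;  /* multiple of page_size covering the input */
};

/*
 * Expand [addr, addr + length) to page boundaries.
 * Assumes page_size is a power of two (e.g. 4096 or 16 MB).
 */
struct aligned_range align_range(uintptr_t addr, size_t length,
                                 size_t page_size)
{
    struct aligned_range r;
    uintptr_t mask = (uintptr_t)page_size - 1;
    /* Round the end of the range up to the next page boundary. */
    uintptr_t end = (addr + length + mask) & ~mask;

    /* Round the start of the range down to a page boundary. */
    r.start = addr & ~mask;
    r.length = end - r.start;
    return r;
}
```

With a 4 KB page size, a 4 KB buffer at 0x1001000 is already aligned.
With a 16 MB page size the same buffer expands to the whole huge page
starting at 0x1000000, which is the range madvise() would need.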

As huge pages are provided to the application "under the hood" when
preloading libhugetlbfs, the application cannot tell whether it is
registering a huge page or a regular page.

To work around this issue, detect the use of huge pages in libibverbs
and align memory ranges passed to madvise according to the huge page
size.  Determining the page size of a given memory range by watching
madvise() fail has proven to be unreliable.  So we introduce the
RDMAV_HUGEPAGES_SAFE environment variable to let the user decide
whether the page size should be checked on every reg_mr() call.  This
requires the user to know whether the running application uses huge
pages.
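
This message does not spell out how the page size of a range is
detected.  One possible approach on Linux, shown here purely as an
assumption, is to look up the KernelPageSize field of the mapping in
/proc/self/smaps.  The function name kernel_page_size_for is
hypothetical.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <stdio.h>

/*
 * Hypothetical helper: scan smaps-formatted text from 'maps' and
 * return the kernel page size in bytes of the mapping that contains
 * 'addr', or 0 if no such mapping (or no KernelPageSize line) is found.
 */
unsigned long kernel_page_size_for(FILE *maps, unsigned long addr)
{
    char line[1024];
    int in_range = 0;

    while (fgets(line, sizeof(line), maps)) {
        unsigned long start, end, kb;

        /* A mapping header looks like "start-end perms offset ...". */
        if (sscanf(line, "%lx-%lx", &start, &end) == 2) {
            in_range = (addr >= start && addr < end);
            continue;
        }
        /* Attribute lines follow the header of the mapping. */
        if (in_range &&
            sscanf(line, "KernelPageSize: %lu kB", &kb) == 1)
            return kb * 1024UL;
    }
    return 0;
}
```

A caller would open /proc/self/smaps with fopen() and pass the buffer
address cast to unsigned long.  Scanning the file on every reg_mr()
call has a cost, which is one reason to keep the check optional.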

I did not add an additional API call to enable this, as applications
can use setenv() + ibv_fork_init() to enable checking for huge pages
in the code.
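
A minimal sketch of that setenv() + ibv_fork_init() sequence follows.
The wrapper name enable_fork_with_hugepages is hypothetical, and the
value "1" is an assumption, since this message does not say which
value the variable must have.  The function pointer keeps the sketch
free of a link-time dependency on libibverbs; a real caller would pass
ibv_fork_init from <infiniband/verbs.h>.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif
#include <stdlib.h>

/*
 * Hypothetical wrapper, not part of libibverbs.
 * 'fork_init' is expected to be ibv_fork_init.
 * Returns -1 if the environment could not be updated, otherwise
 * whatever fork_init() returns.
 */
int enable_fork_with_hugepages(int (*fork_init)(void))
{
    /* Set the variable first so that fork_init() can see it.
     * The value "1" is an assumption. */
    if (setenv("RDMAV_HUGEPAGES_SAFE", "1", 1) != 0)
        return -1;

    return fork_init();
}
```

An application would call enable_fork_with_hugepages(ibv_fork_init)
early, before it registers any memory regions.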

Signed-off-by: Alexander Schmidt <alexs@linux.vnet.ibm.com>

[ Updated ibv_fork_init() manpage for RDMAV_HUGEPAGES_SAFE.  - Roland ]

Signed-off-by: Roland Dreier <roland@purestorage.com>