fs, xfs: introduce MAP_DIRECT for creating block-map-atomic file ranges

MAP_DIRECT is an mmap(2) flag with the following semantics:

  MAP_DIRECT
  When specified with MAP_SHARED a successful fault in this range
  indicates that the kernel is maintaining the block map (user linear
  address to file offset to physical address relationship) in a manner
  that no external agent can observe any inconsistent changes. In other
  words, the block map of the mapping is effectively pinned, or the kernel
  is otherwise able to exchange a new physical extent atomically with
  respect to any hardware / software agent. As implied by this definition
  a successful fault in a MAP_DIRECT range bypasses kernel indirections
  like the page-cache, and all updates are carried directly through to the
  underlying file physical blocks (modulo cpu cache effects).

  ETXTBSY may be returned to any third party operation on the file that
  attempts to update the block map (allocate blocks / convert unwritten
  extents / break shared extents). However, whether a filesystem returns
  EXTBSY for a certain state of the block relative to a MAP_DIRECT mapping
  is filesystem and kernel version dependent.

  Some filesystems may extend these operation restrictions outside the
  mapped range and return ETXTBSY to any file operations that might mutate
  the block map. MAP_DIRECT faults may fail with a SIGBUS if the
  filesystem needs to write the block map to satisfy the fault. For
  example, if the mapping was established over a hole in a sparse file.

  ERRORS
  EACCES A MAP_DIRECT mapping was requested and PROT_WRITE was not set,
  or the requesting process is missing CAP_LINUX_IMMUTABLE.

  EINVAL MAP_ANONYMOUS or MAP_PRIVATE was specified with MAP_DIRECT.

  EOPNOTSUPP The filesystem explicitly does not support the flag

  SIGBUS Attempted to write a MAP_DIRECT mapping at a file offset that
         might require block-map updates.

Cc: Jan Kara <jack@suse.cz>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
6 files changed