fs, xfs: introduce MAP_DIRECT for creating block-map-atomic file ranges
MAP_DIRECT is an mmap(2) flag with the following semantics:
MAP_DIRECT
When specified with MAP_SHARED, a successful fault in this range
indicates that the kernel is maintaining the block map (user linear
address to file offset to physical address relationship) in such a way
that no external agent can observe an inconsistent change. In other
words, the block map of the mapping is effectively pinned, or the kernel
is otherwise able to exchange a new physical extent atomically with
respect to any hardware / software agent. As implied by this definition,
a successful fault in a MAP_DIRECT range bypasses kernel indirections
like the page-cache, and all updates are carried directly through to the
underlying file physical blocks (modulo CPU cache effects).
ETXTBSY may be returned to any third party operation on the file that
attempts to update the block map (allocate blocks / convert unwritten
extents / break shared extents). However, whether a filesystem returns
ETXTBSY for a certain state of the block relative to a MAP_DIRECT mapping
is filesystem and kernel version dependent.
Some filesystems may extend these operation restrictions outside the
mapped range and return ETXTBSY to any file operations that might mutate
the block map. MAP_DIRECT faults may fail with SIGBUS if the filesystem
needs to write the block map to satisfy the fault, for example when the
mapping was established over a hole in a sparse file.
ERRORS
EACCES A MAP_DIRECT mapping was requested and PROT_WRITE was not set,
or the requesting process is missing CAP_LINUX_IMMUTABLE.
EINVAL MAP_ANONYMOUS or MAP_PRIVATE was specified with MAP_DIRECT.
EOPNOTSUPP The filesystem explicitly does not support the flag.
SIGBUS Attempted to write to a MAP_DIRECT mapping at a file offset that
might require block-map updates.
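
For diagnostics, the error codes listed above could be folded into a
small lookup helper. The sketch below is illustrative only:
map_direct_errstr() is a hypothetical name, the strings merely
paraphrase the descriptions above, and SIGBUS is left out because it is
delivered as a signal rather than through errno.

```c
#include <errno.h>

/*
 * Map an errno value to a short description of what it means in the
 * context of MAP_DIRECT, as documented above (hypothetical helper).
 */
const char *map_direct_errstr(int err)
{
	switch (err) {
	case EACCES:
		return "PROT_WRITE not set or CAP_LINUX_IMMUTABLE missing";
	case EINVAL:
		return "MAP_ANONYMOUS or MAP_PRIVATE combined with MAP_DIRECT";
	case EOPNOTSUPP:
		return "filesystem does not support MAP_DIRECT";
	case ETXTBSY:
		/* Seen by third-party operations, not by mmap() itself. */
		return "block-map update refused while a MAP_DIRECT mapping exists";
	default:
		return "not a documented MAP_DIRECT error";
	}
}
```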
Cc: Jan Kara <jack@suse.cz>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
6 files changed