| From a8a60e1eee8daa4747040e6dd5dd96726d958f6a Mon Sep 17 00:00:00 2001 |
| From: Sasha Levin <sashal@kernel.org> |
| Date: Thu, 29 Oct 2020 14:30:48 -0700 |
| Subject: xfs: flush new eof page on truncate to avoid post-eof corruption |
| |
| From: Brian Foster <bfoster@redhat.com> |
| |
| [ Upstream commit 869ae85dae64b5540e4362d7fe4cd520e10ec05c ] |
| |
| It is possible to expose non-zeroed post-EOF data in XFS if the new |
| EOF page is dirty, backed by an unwritten block and the truncate |
| happens to race with writeback. iomap_truncate_page() will not zero |
| the post-EOF portion of the page if the underlying block is |
| unwritten. The subsequent call to truncate_setsize() will zero it, but |
| doesn't dirty the page. Therefore, if writeback happens to complete |
| after iomap_truncate_page() (so it still sees the unwritten block) |
| but before truncate_setsize(), the cached page becomes inconsistent |
| with the on-disk block. A mapped read after the associated page is |
| reclaimed or invalidated exposes non-zero post-EOF data. |
| |
| For example, consider the following sequence when run on a kernel |
| modified to explicitly flush the new EOF page within the race |
| window: |
| |
| $ xfs_io -fc "falloc 0 4k" -c fsync /mnt/file |
| $ xfs_io -c "pwrite 0 4k" -c "truncate 1k" /mnt/file |
| ... |
| $ xfs_io -c "mmap 0 4k" -c "mread -v 1k 8" /mnt/file |
| 00000400: 00 00 00 00 00 00 00 00 ........ |
| $ umount /mnt/; mount <dev> /mnt/ |
| $ xfs_io -c "mmap 0 4k" -c "mread -v 1k 8" /mnt/file |
| 00000400: cd cd cd cd cd cd cd cd ........ |
| |
| Update xfs_setattr_size() to explicitly flush the new EOF page prior |
| to the page truncate to ensure iomap has the latest state of the |
| underlying block. |
| |
| Fixes: 68a9f5e7007c ("xfs: implement iomap based buffered write path") |
| Signed-off-by: Brian Foster <bfoster@redhat.com> |
| Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> |
| Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> |
| Signed-off-by: Sasha Levin <sashal@kernel.org> |
| --- |
| fs/xfs/xfs_iops.c | 10 ++++++++++ |
| 1 file changed, 10 insertions(+) |
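
Note (not part of the upstream commit): the xfs_io sequence in the commit
message can be approximated from userspace as a rough sketch. The function
name count_nonzero_post_eof(), the path argument, and the 4096/1024 byte
constants are assumptions made for illustration. The sketch only inspects
the cached view of the file through a shared mapping. It does not force the
writeback race and does not drop the page cache, so a zero result here does
not prove the on-disk block is clean; a remount, as in the example above, is
still needed to observe the on-disk state.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

#define PAGE_BYTES 4096 /* assumed block/page size, as in "falloc 0 4k" */
#define NEW_EOF    1024 /* assumed new file size, as in "truncate 1k" */

/*
 * Hypothetical helper: preallocate one block, overwrite it with 0xcd
 * (the default xfs_io pwrite pattern), truncate down to NEW_EOF, then
 * count the nonzero bytes between NEW_EOF and the end of the page as
 * seen through a shared mapping.
 * Returns the count, or -1 on any error.
 */
long count_nonzero_post_eof(const char *path)
{
    unsigned char buf[PAGE_BYTES];
    unsigned char *map;
    long nonzero = 0;
    size_t i;
    int fd;

    fd = open(path, O_CREAT | O_TRUNC | O_RDWR, 0600);
    if (fd < 0)
        return -1;

    /* falloc 0 4k + fsync: leave an unwritten (preallocated) block on disk */
    if (posix_fallocate(fd, 0, PAGE_BYTES) != 0 || fsync(fd) != 0)
        goto fail;

    /* pwrite 0 4k: dirty the page sitting over the unwritten block */
    memset(buf, 0xcd, sizeof(buf));
    if (pwrite(fd, buf, sizeof(buf), 0) != (ssize_t)sizeof(buf))
        goto fail;

    /* truncate 1k: the page now straddles the new EOF */
    if (ftruncate(fd, NEW_EOF) != 0)
        goto fail;

    /* mmap 0 4k + mread: everything past the new EOF should read as zero */
    map = mmap(NULL, PAGE_BYTES, PROT_READ, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED)
        goto fail;
    for (i = NEW_EOF; i < PAGE_BYTES; i++)
        if (map[i] != 0)
            nonzero++;

    munmap(map, PAGE_BYTES);
    close(fd);
    return nonzero;

fail:
    close(fd);
    return -1;
}
```

Calling count_nonzero_post_eof("/mnt/file") from a small main() on the
filesystem under test should return 0 as long as post-EOF zeroing behaves
as expected for the cached page.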
| |
| diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c |
| index 7bfddcd32d73e..0d587657056d8 100644 |
| --- a/fs/xfs/xfs_iops.c |
| +++ b/fs/xfs/xfs_iops.c |
| @@ -864,6 +864,16 @@ xfs_setattr_size( |
| if (newsize > oldsize) { |
| error = xfs_zero_eof(ip, newsize, oldsize, &did_zeroing); |
| } else { |
| + /* |
| + * iomap won't detect a dirty page over an unwritten block (or a |
| + * cow block over a hole) and subsequently skips zeroing the |
| + * newly post-EOF portion of the page. Flush the new EOF to |
| + * convert the block before the pagecache truncate. |
| + */ |
| + error = filemap_write_and_wait_range(inode->i_mapping, newsize, |
| + newsize); |
| + if (error) |
| + return error; |
| error = iomap_truncate_page(inode, newsize, &did_zeroing, |
| &xfs_iomap_ops); |
| } |
| -- |
| 2.27.0 |
| |