xfs: recheck appropriateness of map_shared lock
While fuzzing the data fork extent count on a btree-format directory
with xfs/375, I observed the following (excerpted) splat:
XFS: Assertion failed: xfs_isilocked(ip, XFS_ILOCK_EXCL), file: fs/xfs/libxfs/xfs_bmap.c, line: 1208
------------[ cut here ]------------
WARNING: CPU: 0 PID: 43192 at fs/xfs/xfs_message.c:104 assfail+0x46/0x4a [xfs]
Call Trace:
<TASK>
xfs_iread_extents+0x1af/0x210 [xfs 09f66509ece4938760fac7de64732a0cbd3e39cd]
xchk_dir_walk+0xb8/0x190 [xfs 09f66509ece4938760fac7de64732a0cbd3e39cd]
xchk_parent_count_parent_dentries+0x41/0x80 [xfs 09f66509ece4938760fac7de64732a0cbd3e39cd]
xchk_parent_validate+0x199/0x2e0 [xfs 09f66509ece4938760fac7de64732a0cbd3e39cd]
xchk_parent+0xdf/0x130 [xfs 09f66509ece4938760fac7de64732a0cbd3e39cd]
xfs_scrub_metadata+0x2b8/0x730 [xfs 09f66509ece4938760fac7de64732a0cbd3e39cd]
xfs_scrubv_metadata+0x38b/0x4d0 [xfs 09f66509ece4938760fac7de64732a0cbd3e39cd]
xfs_ioc_scrubv_metadata+0x111/0x160 [xfs 09f66509ece4938760fac7de64732a0cbd3e39cd]
xfs_file_ioctl+0x367/0xf50 [xfs 09f66509ece4938760fac7de64732a0cbd3e39cd]
__x64_sys_ioctl+0x82/0xa0
do_syscall_64+0x2b/0x80
entry_SYSCALL_64_after_hwframe+0x46/0xb0
The cause of this is a race condition in xfs_ilock_data_map_shared,
which performs an unlocked access to the data fork to guess which lock
mode it needs:
Thread 0 Thread 1
xfs_need_iread_extents
<observe no iext tree>
xfs_ilock(..., ILOCK_EXCL)
xfs_iread_extents
<observe no iext tree>
<check ILOCK_EXCL>
<load bmbt extents into iext>
<notice iext size doesn't
match nextents>
xfs_need_iread_extents
<observe iext tree>
xfs_ilock(..., ILOCK_SHARED)
<tear down iext tree>
xfs_iunlock(..., ILOCK_EXCL)
xfs_iread_extents
<observe no iext tree>
<check ILOCK_EXCL>
*BOOM*
mitigate this race by having thread 1 to recheck xfs_need_iread_extents
after taking the shared ILOCK. If the iext tree isn't present, then we
need to upgrade to the exclusive ILOCK to try to load the bmbt.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
index d354ea2..6ce1e0e 100644
--- a/fs/xfs/xfs_inode.c
+++ b/fs/xfs/xfs_inode.c
@@ -117,6 +117,20 @@ xfs_ilock_data_map_shared(
if (xfs_need_iread_extents(&ip->i_df))
lock_mode = XFS_ILOCK_EXCL;
xfs_ilock(ip, lock_mode);
+
+ /*
+ * It's possible that the unlocked access of the data fork to determine
+ * the lock mode could have raced with another thread that was failing
+ * to load the bmbt but hadn't yet torn down the iext tree. Recheck
+ * the lock mode and upgrade to an exclusive lock if we need to.
+ */
+ if (lock_mode == XFS_ILOCK_SHARED &&
+ xfs_need_iread_extents(&ip->i_df)) {
+ xfs_iunlock(ip, lock_mode);
+ lock_mode = XFS_ILOCK_EXCL;
+ xfs_ilock(ip, lock_mode);
+ }
+
return lock_mode;
}
@@ -129,6 +143,21 @@ xfs_ilock_attr_map_shared(
if (xfs_inode_has_attr_fork(ip) && xfs_need_iread_extents(&ip->i_af))
lock_mode = XFS_ILOCK_EXCL;
xfs_ilock(ip, lock_mode);
+
+ /*
+ * It's possible that the unlocked access of the attr fork to determine
+ * the lock mode could have raced with another thread that was failing
+ * to load the bmbt but hadn't yet torn down the iext tree. Recheck
+ * the lock mode and upgrade to an exclusive lock if we need to.
+ */
+ if (lock_mode == XFS_ILOCK_SHARED &&
+ xfs_inode_has_attr_fork(ip) &&
+ xfs_need_iread_extents(&ip->i_af)) {
+ xfs_iunlock(ip, lock_mode);
+ lock_mode = XFS_ILOCK_EXCL;
+ xfs_ilock(ip, lock_mode);
+ }
+
return lock_mode;
}