refs/tags/xfs-5.8-merge-8 - pub/scm/fs/xfs/xfs-linux

tag	69d2f928bd9bb13f2e133834de0614aa39859227
tagger	Darrick J. Wong <darrick.wong@oracle.com>	Wed May 27 08:50:40 2020 -0700
object	6dcde60efd946e38fac8d276a6ca47492103e856

New code for 5.8:
    - Various cleanups to remove dead code, unnecessary conditionals,
      asserts, etc.
    - Fix a linker warning caused by xfs stuffing '-g' into CFLAGS
      redundantly.
    - Tighten up our dmesg logging to ensure that everything is prefixed
      with 'XFS' for easier grepping.
    - Kill a bunch of typedefs.
    - Refactor the deferred ops code to reduce indirect function calls.
    - Increase type-safety with the deferred ops code.
    - Make the DAX mount options a tri-state.
    - Fix some error handling problems in the inode flush code and clean up
      other inode flush warts.
    - Refactor log recovery so that each log item's recovery functions now
      live with the other log item processing code.
    - Fix some SPDX forms.
    - Fix quota counter corruption if the fs crashes after running
      quotacheck but before any dquots get logged.
    - Don't fail metadata verification on zero-entry attr leaf blocks, since
      they're just part of the disk format now due to a historic lack of log
      atomicity.
    - Don't allow SWAPEXT between files with different [ugp]id when quotas
      are enabled.
    - Refactor inode fork reading and verification to run directly from the
      inode-from-disk function.  This means that we now actually guarantee
      that _iget'ted inodes are totally verified and ready to go.
    - Move the incore inode fork format and extent counts to the ifork
      structure.
    - Scalability improvements by reducing cacheline pingponging in
      struct xfs_mount.
    - More scalability improvements by removing m_active_trans from the
      hot path.
    - Fix inode counter update sanity checking to run /only/ on debug
      kernels.
    - Fix longstanding inconsistency in what error code we return when a
      program hits project quota limits (ENOSPC).
    - Fix group quota returning the wrong error code when a program hits
      group quota limits.
    - Fix per-type quota limits and grace periods for group and project
      quotas so that they actually work.
    - Allow extension of individual grace periods.
    - Refactor the non-reclaim inode radix tree walking code to remove a
      bunch of stupid little functions and straighten out the
      inconsistent naming schemes.
    - Fix a bug in speculative preallocation where we measured a new
      allocation based on the last extent mapping in the file instead of
      looking farther for the last contiguous space allocation.
    - Force delalloc writes to unwritten extents.  This closes a
      stale disk contents exposure vector if the system goes down before
      the write completes.
    - More lockdep whackamole.

commit	6dcde60efd946e38fac8d276a6ca47492103e856
author	Darrick J. Wong <darrick.wong@oracle.com>	Tue May 26 09:33:11 2020 -0700
committer	Darrick J. Wong <darrick.wong@oracle.com>	Wed May 27 08:49:28 2020 -0700
tree	4a4057909a7b4046d948ec7a6d26ae4691d66b87
parent	a5949d3faedf492fa7863b914da408047ab46eb0

xfs: more lockdep whackamole with kmem_alloc*

Dave Airlie reported the following lockdep complaint:

>  ======================================================
>  WARNING: possible circular locking dependency detected
>  5.7.0-0.rc5.20200515git1ae7efb38854.1.fc33.x86_64 #1 Not tainted
>  ------------------------------------------------------
>  kswapd0/159 is trying to acquire lock:
>  ffff9b38d01a4470 (&xfs_nondir_ilock_class){++++}-{3:3}, at: xfs_ilock+0xde/0x2c0 [xfs]
>
>  but task is already holding lock:
>  ffffffffbbb8bd00 (fs_reclaim){+.+.}-{0:0}, at: __fs_reclaim_acquire+0x5/0x30
>
>  which lock already depends on the new lock.
>
>
>  the existing dependency chain (in reverse order) is:
>
>  -> #1 (fs_reclaim){+.+.}-{0:0}:
>         fs_reclaim_acquire+0x34/0x40
>         __kmalloc+0x4f/0x270
>         kmem_alloc+0x93/0x1d0 [xfs]
>         kmem_alloc_large+0x4c/0x130 [xfs]
>         xfs_attr_copy_value+0x74/0xa0 [xfs]
>         xfs_attr_get+0x9d/0xc0 [xfs]
>         xfs_get_acl+0xb6/0x200 [xfs]
>         get_acl+0x81/0x160
>         posix_acl_xattr_get+0x3f/0xd0
>         vfs_getxattr+0x148/0x170
>         getxattr+0xa7/0x240
>         path_getxattr+0x52/0x80
>         do_syscall_64+0x5c/0xa0
>         entry_SYSCALL_64_after_hwframe+0x49/0xb3
>
>  -> #0 (&xfs_nondir_ilock_class){++++}-{3:3}:
>         __lock_acquire+0x1257/0x20d0
>         lock_acquire+0xb0/0x310
>         down_write_nested+0x49/0x120
>         xfs_ilock+0xde/0x2c0 [xfs]
>         xfs_reclaim_inode+0x3f/0x400 [xfs]
>         xfs_reclaim_inodes_ag+0x20b/0x410 [xfs]
>         xfs_reclaim_inodes_nr+0x31/0x40 [xfs]
>         super_cache_scan+0x190/0x1e0
>         do_shrink_slab+0x184/0x420
>         shrink_slab+0x182/0x290
>         shrink_node+0x174/0x680
>         balance_pgdat+0x2d0/0x5f0
>         kswapd+0x21f/0x510
>         kthread+0x131/0x150
>         ret_from_fork+0x3a/0x50
>
>  other info that might help us debug this:
>
>   Possible unsafe locking scenario:
>
>         CPU0                    CPU1
>         ----                    ----
>    lock(fs_reclaim);
>                                 lock(&xfs_nondir_ilock_class);
>                                 lock(fs_reclaim);
>    lock(&xfs_nondir_ilock_class);
>
>   *** DEADLOCK ***
>
>  4 locks held by kswapd0/159:
>   #0: ffffffffbbb8bd00 (fs_reclaim){+.+.}-{0:0}, at: __fs_reclaim_acquire+0x5/0x30
>   #1: ffffffffbbb7cef8 (shrinker_rwsem){++++}-{3:3}, at: shrink_slab+0x115/0x290
>   #2: ffff9b39f07a50e8 (&type->s_umount_key#56){++++}-{3:3}, at: super_cache_scan+0x38/0x1e0
>   #3: ffff9b39f077f258 (&pag->pag_ici_reclaim_lock){+.+.}-{3:3}, at: xfs_reclaim_inodes_ag+0x82/0x410 [xfs]
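
The report above boils down to two lock classes acquired in opposite
orders on two code paths.  As a rough userspace illustration only (this
is not how lockdep is implemented; every demo_* name is hypothetical),
a checker that records "held -> wanted" edges between lock classes and
refuses an edge that closes a cycle could look like this:

```c
#include <stdbool.h>

/* Hypothetical lock classes standing in for the two in the report. */
enum { DEMO_FS_RECLAIM, DEMO_ILOCK, DEMO_NR_CLASSES };

/* dep[a][b] is true once class b has been acquired while a was held. */
static bool dep[DEMO_NR_CLASSES][DEMO_NR_CLASSES];

/* Depth-first search: is 'to' reachable from 'from' via recorded edges? */
static bool demo_reachable(int from, int to, bool *seen)
{
	if (from == to)
		return true;
	seen[from] = true;
	for (int next = 0; next < DEMO_NR_CLASSES; next++) {
		if (dep[from][next] && !seen[next] &&
		    demo_reachable(next, to, seen))
			return true;
	}
	return false;
}

/*
 * Record that 'wanted' is being acquired while 'held' is held.
 * Returns false, without recording the edge, if 'held' is already
 * reachable from 'wanted', i.e. the new edge would close a cycle.
 */
bool demo_acquire(int held, int wanted)
{
	bool seen[DEMO_NR_CLASSES] = { false };

	if (demo_reachable(wanted, held, seen))
		return false;
	dep[held][wanted] = true;
	return true;
}
```

In this model the getxattr path records ILOCK -> fs_reclaim (an
allocation made with the ILOCK held), so the later kswapd path, which
wants the ILOCK while inside fs_reclaim, is rejected as a cycle.  The
checker only sees lock classes, so it cannot tell that the two paths
never operate on the same inode.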

This is a known false positive because inodes cannot simultaneously be
getting reclaimed and the target of a getxattr operation, but lockdep
doesn't know that.  By applying a stupid GFP_NOLOCKDEP bandaid, we can
(selectively) shut up lockdep until either it gets smarter or we change
inode reclaim not to require the ILOCK.
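
For readers unfamiliar with the pattern, here is a minimal userspace
sketch of what a selective bandaid of this kind might look like: a
per-call request flag that the allocation wrapper translates into an
extra allocator flag.  The DEMO_* names and values are hypothetical
stand-ins, not the kernel's real constants, and this is not the actual
patch:

```c
/* Hypothetical stand-in allocator flags; real kernel values differ. */
#define DEMO_GFP_KERNEL		0x01u
#define DEMO_GFP_NOWARN		0x02u
#define DEMO_GFP_NOLOCKDEP	0x04u

/* Hypothetical per-call request flag, set only by the caller that
 * triggers the known false positive. */
#define DEMO_KM_NOLOCKDEP	0x10u

/* Translate per-call request flags into allocator flags. */
unsigned int demo_flags_convert(unsigned int km_flags)
{
	unsigned int gfp = DEMO_GFP_KERNEL | DEMO_GFP_NOWARN;

	/* Opt just this allocation out of lock dependency tracking. */
	if (km_flags & DEMO_KM_NOLOCKDEP)
		gfp |= DEMO_GFP_NOLOCKDEP;

	return gfp;
}
```

Only a call site that passes the opt-out flag is affected; every other
allocation keeps full lockdep coverage, which is what makes the
suppression selective.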

Reported-by: Dave Airlie <airlied@gmail.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Tested-by: Dave Airlie <airlied@gmail.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>

2 files changed