- Split dm-bufio's rw_semaphore and rbtree. Offers improvements to
  dm-bufio's locking to allow increased concurrent IO -- particularly
  for read access for buffers already in dm-bufio's cache.

- Also split dm-bio-prison-v1's spinlock and rbtree with the comparable
  aim of improving concurrent IO (for the DM thinp target).

- For both dm-bufio and dm-bio-prison-v1, the number of locks and
  rbtrees used is scaled by dm_num_hash_locks(), and the hash function
  used by both is dm_hash_locks_index().

- Allow DM targets to require DISCARD, WRITE_ZEROES and SECURE_ERASE
  to be split at the target-specified boundary (in terms of
  max_discard_sectors, max_write_zeroes_sectors and
  max_secure_erase_sectors respectively).

- DM verity error handling fix for check_at_most_once on FEC.

- Update DM verity target to emit audit events on verification failure
  and more.

- DM core ->io_hints improvements needed to support the new discard
  support added to the DM "zero" and "error" targets.

- Fix missing kmem_cache_destroy() call in initialization error path
  of both the DM integrity and DM clone targets.

- A couple of fixes for DM flakey; also add the "error_reads" feature.

- Fix DM core's resume to not lock the FS when the DM map is NULL;
  otherwise the initial table load can race with an FS mount that
  takes the superblock's ->s_umount rw_semaphore.

- Various small improvements to both DM core and DM targets.
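
The lock splitting described in the first three items can be illustrated
with a minimal user-space sketch. This is not the kernel code:
stripe_count(), stripe_index(), the STRIPE_MAX cap and the multiplier
are hypothetical names and values chosen for illustration; the real
helpers are dm_num_hash_locks() and dm_hash_locks_index().

```c
#include <stdint.h>

/* Hypothetical upper bound on the number of lock/rbtree pairs. */
#define STRIPE_MAX 64u

/* Round v (>= 1) up to the next power of two. */
static unsigned int roundup_pow2(unsigned int v)
{
	unsigned int p = 1;

	while (p < v)
		p <<= 1;
	return p;
}

/* Scale the stripe count with the CPU count, capped at STRIPE_MAX. */
static unsigned int stripe_count(unsigned int ncpus)
{
	unsigned int n = roundup_pow2(ncpus ? ncpus : 1) << 1;

	return n < STRIPE_MAX ? n : STRIPE_MAX;
}

/* Map a block number to a stripe; nr_stripes must be a power of two. */
static unsigned int stripe_index(uint64_t block, unsigned int nr_stripes)
{
	uint64_t h = block * 0x9E3779B97F4A7C15ull; /* illustrative multiplier */

	return (unsigned int)(h >> 32) & (nr_stripes - 1);
}
```

Each stripe would own one lock and one tree, so lookups of blocks that
hash to different stripes do not contend on the same lock.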

dm: don't lock fs when the map is NULL in process of resume

Commit fa247089de99 ("dm: requeue IO if mapping table not yet available")
added a check in the IO submission path for whether the mapping table
is available. If the mapping table is unavailable, it returns
BLK_STS_RESOURCE and requeues the IO.
This can lead to the following deadlock:

dm create                                      mount
ioctl(DM_DEV_CREATE_CMD)
ioctl(DM_TABLE_LOAD_CMD)
                               do_mount
                                vfs_get_tree
                                 ext4_get_tree
                                  get_tree_bdev
                                   sget_fc
                                    alloc_super
                                     // got &s->s_umount
                                     down_write_nested(&s->s_umount, ...);
                                   ext4_fill_super
                                    ext4_load_super
                                     ext4_read_bh
                                      submit_bio
                                      // submit and wait io end
ioctl(DM_DEV_SUSPEND_CMD)
dev_suspend
 do_resume
  dm_suspend
   __dm_suspend
    lock_fs
     freeze_bdev
      get_active_super
       grab_super
        // wait for &s->s_umount
        down_write(&s->s_umount);
  dm_swap_table
   __bind
    // set md->map(can't get here)

The mount path holds s_umount while its IO is continuously requeued
because the mapping table is NULL. At the same time, the mapping table
cannot be set because resume is blocked waiting for s_umount.
Bio-based DM has the same problem as request-based DM.

It's not proper to just abort IO if the mapping table is not available.
So skip locking the FS, as with DM_SKIP_LOCKFS_FLAG, when the mapping
table is NULL; this allows the DM table to be loaded and the IO to be
submitted upon resume.
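
As a rough model of that decision, the following user-space sketch
computes the flags passed to the suspend step. It assumes the intent is
that the FS freeze is skipped whenever no table is live. All names and
flag values here (SUSPEND_LOCKFS, PARAM_SKIP_LOCKFS, struct table,
resume_suspend_flags()) are hypothetical stand-ins, not the kernel's
definitions.

```c
#include <stddef.h>

/* Hypothetical flag values; the kernel defines its own. */
#define SUSPEND_LOCKFS    (1u << 0) /* freeze the FS while suspending */
#define PARAM_SKIP_LOCKFS (1u << 0) /* caller asked to skip the freeze */

/* Stand-in for the live mapping table; only its presence matters. */
struct table {
	int placeholder;
};

/*
 * Choose suspend flags for a resume that swaps in a new table.
 * With no live table, freezing the FS could block on s_umount held
 * by a mount that is itself waiting for requeued IO, so skip it.
 */
static unsigned int resume_suspend_flags(unsigned int param_flags,
					 const struct table *live_map)
{
	unsigned int flags = SUSPEND_LOCKFS;

	if ((param_flags & PARAM_SKIP_LOCKFS) || live_map == NULL)
		flags &= ~SUSPEND_LOCKFS;
	return flags;
}
```

In this model an initial table load (live_map == NULL) never requests
the freeze, while a later table swap still does unless the caller opts
out.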

Fixes: fa247089de99 ("dm: requeue IO if mapping table not yet available")
Cc: stable@vger.kernel.org
Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
1 file changed