- Small correctness fix in del_gendisk() if GENHD_FL_HIDDEN is used.

- Clean up blk_unregister_queue() to more precisely protect against
  concurrent sysfs changes.  blk_mq_unregister_dev() now requires the
  caller to hold q->sysfs_lock (blk_unregister_queue() is the only
  caller).

- Introduce an add_disk() variant, add_disk_no_queue_reg(), that
  registers the gendisk but leaves blk_register_queue() for the
  associated disk->queue to the driver, to be called once its
  request_queue is fully initialized.  Fixes long-standing DM
  request_queue initialization issues.

- Ming's blk-mq improvements to blk_insert_cloned_request(), which is
  used exclusively by request-based DM's blk-mq mode, that enable
  substantial dm-mpath sequential IO performance improvements.

blk-mq: issue request directly for blk_insert_cloned_request

blk_insert_cloned_request() is called in the fast path of the dm-rq
driver, where it appends the request directly to the
hctx->dispatch_list of the underlying queue.

1) This isn't efficient enough because the hctx lock is always required.

2) With blk_insert_cloned_request(), we bypass the underlying queue's IO
scheduler entirely and depend on the DM rq driver to do all of the IO
scheduling.  But the DM rq driver gets no dispatch feedback from the
underlying queue, and this information is extremely useful for IO
merging.  Without it, blk-mq basically cannot merge IO, which causes
very bad sequential IO performance.

Fix this by having blk_insert_cloned_request() make use of
blk_mq_try_issue_directly() via blk_mq_request_direct_issue().
blk_mq_request_direct_issue() allows a request to be issued directly
to the underlying queue and provides the dispatch result to dm-rq and
blk-mq.
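
The idea can be illustrated with a small userspace model.  This is only
a sketch under stated assumptions, not the kernel implementation: every
name in it (model_queue, model_issue_directly, model_submit_batch,
issue_status) is hypothetical and exists only for this illustration.
It shows the one property the change relies on: a direct issue either
succeeds or reports "busy", and the caller keeps whatever was not
accepted instead of parking it on a list it cannot see.

```c
#include <assert.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
enum issue_status { ISSUE_OK, ISSUE_BUSY };

struct model_queue {
	int depth;     /* how many requests the device can hold at once */
	int in_flight; /* requests currently owned by the device */
};

/*
 * Try to hand one request straight to the device.  No intermediate
 * list is touched; the result tells the caller what happened.
 */
static enum issue_status model_issue_directly(struct model_queue *q)
{
	assert(q->depth > 0);

	if (q->in_flight >= q->depth)
		return ISSUE_BUSY; /* feedback: caller keeps the request */

	q->in_flight++;
	return ISSUE_OK;
}

/*
 * Caller side: issue until the device reports busy.  Returns how many
 * requests were accepted; the remainder stay with the caller, where
 * they are still available for merging.
 */
static int model_submit_batch(struct model_queue *q, int nr_requests)
{
	int issued = 0;

	while (issued < nr_requests && model_issue_directly(q) == ISSUE_OK)
		issued++;

	return issued;
}
```

In this model a caller that submits five requests to a queue of depth
two learns that only two were accepted and can hold the other three
back, which is the kind of feedback the old append-to-dispatch-list
path never gave dm-rq.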

With this, request-based DM's blk-mq sequential IO performance is
vastly improved (as much as 3X in mpath/virtio-scsi testing).

Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
4 files changed