81380ca10778b99dce98940cfc993214712df335 - pub/scm/linux/kernel/git/minchan/linux

commit	81380ca10778b99dce98940cfc993214712df335	[log] [tgz]
author	Omar Sandoval <osandov@fb.com>	Fri Apr 07 08:56:26 2017 -0600
committer	Jens Axboe <axboe@fb.com>	Fri Apr 07 08:56:26 2017 -0600
tree	ce167a1c5f914d91949af26cfb4f38d964d3daa5
parent	8945927cb7cc5953cc1554053452f50248159c88 [diff]

blk-mq: use the right hctx when getting a driver tag fails

While dispatching requests, if we fail to get a driver tag, we mark the
hardware queue as waiting for a tag and put the requests on a
hctx->dispatch list to be run later when a driver tag is freed. However,
blk_mq_dispatch_rq_list() may dispatch requests from multiple hardware
queues if using a single-queue scheduler with a multiqueue device. If
blk_mq_get_driver_tag() fails, it doesn't update the hardware queue we
are processing. This means we end up using the hardware queue of the
previous request, which may or may not be the same as that of the
current request. If it isn't, the wrong hardware queue will end up
waiting for a tag, and the requests will be on the wrong dispatch list,
leading to a hang.

The fix is twofold:

1. Make sure we save which hardware queue we were trying to get a
   request for in blk_mq_get_driver_tag() regardless of whether it
   succeeds or not.
2. Make blk_mq_dispatch_rq_list() take a request_queue instead of a
   blk_mq_hw_queue to make it clear that it must handle multiple
   hardware queues, since I've already messed this up on a couple of
   occasions.

This didn't appear in testing with nvme and mq-deadline because nvme has
more driver tags than the default number of scheduler tags. However,
with the blk_mq_update_nr_hw_queues() fix, it showed up with nbd.

Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

3 files changed

tree: ce167a1c5f914d91949af26cfb4f38d964d3daa5