| From 95a7b64f82d0f8f7967cbbeabefbea8fe3f04ad9 Mon Sep 17 00:00:00 2001 |
| From: Josef Bacik <josef@toxicpanda.com> |
| Date: Mon, 21 Oct 2019 15:56:28 -0400 |
| Subject: [PATCH] nbd: handle racing with error'ed out commands |
| |
| commit 7ce23e8e0a9cd38338fc8316ac5772666b565ca9 upstream. |
| |
| We hit the following warning in production: |
| |
| print_req_error: I/O error, dev nbd0, sector 7213934408 flags 80700 |
| ------------[ cut here ]------------ |
| refcount_t: underflow; use-after-free. |
| WARNING: CPU: 25 PID: 32407 at lib/refcount.c:190 refcount_sub_and_test_checked+0x53/0x60 |
| Workqueue: knbd-recv recv_work [nbd] |
| RIP: 0010:refcount_sub_and_test_checked+0x53/0x60 |
| Call Trace: |
| blk_mq_free_request+0xb7/0xf0 |
| blk_mq_complete_request+0x62/0xf0 |
| recv_work+0x29/0xa1 [nbd] |
| process_one_work+0x1f5/0x3f0 |
| worker_thread+0x2d/0x3d0 |
| ? rescuer_thread+0x340/0x340 |
| kthread+0x111/0x130 |
| ? kthread_create_on_node+0x60/0x60 |
| ret_from_fork+0x1f/0x30 |
| ---[ end trace b079c3c67f98bb7c ]--- |
| |
| This was preceded by us timing out everything and shutting down the |
| sockets for the device. The problem is we had a request in the queue at |
| the same time, so we completed the request twice. This can actually |
| happen in a lot of cases: we fail to get a ref on our config, we only |
| have one connection and just error out the command, etc. |
| |
| Fix this by checking cmd->status in nbd_read_stat. We only change this |
| under the cmd->lock, so we are safe to check this here and see if we've |
| already error'ed this command out, which would indicate that we've |
| completed it as well. |
| |
| Reviewed-by: Mike Christie <mchristi@redhat.com> |
| Signed-off-by: Josef Bacik <josef@toxicpanda.com> |
| |
| Signed-off-by: Jens Axboe <axboe@kernel.dk> |
| Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> |
| |
| diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c |
| index 76c04edb1b67..80fe62139a9c 100644 |
| --- a/drivers/block/nbd.c |
| +++ b/drivers/block/nbd.c |
| @@ -671,6 +671,12 @@ static struct nbd_cmd *nbd_read_stat(struct nbd_device *nbd, int index) |
| ret = -ENOENT; |
| goto out; |
| } |
| + if (cmd->status != BLK_STS_OK) { |
| + dev_err(disk_to_dev(nbd->disk), "Command already handled %p\n", |
| + req); |
| + ret = -ENOENT; |
| + goto out; |
| + } |
| if (test_bit(NBD_CMD_REQUEUED, &cmd->flags)) { |
| dev_err(disk_to_dev(nbd->disk), "Raced with timeout on req %p\n", |
| req); |
| -- |
| 2.7.4 |
| |
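The guard described in the commit message (check the command's status under its lock before completing it a second time) can be illustrated with a small userspace model. This is only a sketch under stated assumptions: `struct sketch_cmd`, `sketch_error_out()` and `sketch_handle_reply()` are hypothetical names invented here, a pthread mutex stands in for `cmd->lock`, and a counter stands in for the request completion. None of it is the driver's actual code.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the driver's per-command state. */
enum sketch_status { SKETCH_STS_OK = 0, SKETCH_STS_IOERR = 1 };

struct sketch_cmd {
	pthread_mutex_t lock;		/* models cmd->lock */
	enum sketch_status status;	/* models cmd->status */
	int completions;		/* times the request was completed */
};

static void sketch_cmd_init(struct sketch_cmd *cmd)
{
	pthread_mutex_init(&cmd->lock, NULL);
	cmd->status = SKETCH_STS_OK;
	cmd->completions = 0;
}

/*
 * Error path (for example a timeout): record the error under the lock
 * and complete the request.
 */
static void sketch_error_out(struct sketch_cmd *cmd)
{
	pthread_mutex_lock(&cmd->lock);
	cmd->status = SKETCH_STS_IOERR;
	cmd->completions++;
	pthread_mutex_unlock(&cmd->lock);
}

/*
 * Receive path: a reply arrived for this command. If the status is no
 * longer OK, the error path already completed the request, so back off
 * instead of completing it again. Returns true if this call completed it.
 */
static bool sketch_handle_reply(struct sketch_cmd *cmd)
{
	bool completed = false;

	pthread_mutex_lock(&cmd->lock);
	if (cmd->status == SKETCH_STS_OK) {
		cmd->completions++;
		completed = true;
	}
	pthread_mutex_unlock(&cmd->lock);
	return completed;
}
```

Calling `sketch_error_out()` and then `sketch_handle_reply()` on the same command leaves `completions` at 1. Without the status check it would reach 2, which corresponds to the double completion behind the refcount underflow warning.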