| From foo@baz Sat Mar 5 02:46:44 PM CET 2022 |
| From: Ye Bin <yebin10@huawei.com> |
| Date: Mon, 29 Nov 2021 09:26:59 +0800 |
| Subject: block: Fix fsync always failed if once failed |
| |
| From: Ye Bin <yebin10@huawei.com> |
| |
| commit 8a7518931baa8ea023700987f3db31cb0a80610b upstream. |
| |
| We ran fault-injection tests based on v4.19 and, after some time, found that |
| 'sync /dev/sda' always failed: |
| [root@localhost] sync /dev/sda |
| sync: error syncing '/dev/sda': Input/output error |
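
For readers who want to check the same condition from a program rather than from the shell, the following is a minimal sketch and not part of the original report. It assumes that an fsync() on the opened device node is what triggers the flush; the helper name sync_path() is hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper: open 'path' read-only and fsync() it.
 * Returns 0 on success, or the errno value reported by open() or fsync().
 * On an affected kernel, a block device such as /dev/sda would be expected
 * to keep returning EIO here once a single flush has failed.
 */
static int sync_path(const char *path)
{
	int fd, err = 0;

	assert(path != NULL);

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return errno;

	if (fsync(fd) != 0)
		err = errno;

	close(fd);
	return err;
}
```

Calling sync_path("/dev/sda") in a loop and logging the return value shows whether the error clears after the first report or persists across calls.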
| |
| The SCSI log is as follows: |
| [19069.812296] sd 0:0:0:0: [sda] tag#64 Send: scmd 0x00000000d03a0b6b |
| [19069.812302] sd 0:0:0:0: [sda] tag#64 CDB: Synchronize Cache(10) 35 00 00 00 00 00 00 00 00 00 |
| [19069.812533] sd 0:0:0:0: [sda] tag#64 Done: SUCCESS Result: hostbyte=DID_OK driverbyte=DRIVER_OK |
| [19069.812536] sd 0:0:0:0: [sda] tag#64 CDB: Synchronize Cache(10) 35 00 00 00 00 00 00 00 00 00 |
| [19069.812539] sd 0:0:0:0: [sda] tag#64 scsi host busy 1 failed 0 |
| [19069.812542] sd 0:0:0:0: Notifying upper driver of completion (result 0) |
| [19069.812546] sd 0:0:0:0: [sda] tag#64 sd_done: completed 0 of 0 bytes |
| [19069.812549] sd 0:0:0:0: [sda] tag#64 0 sectors total, 0 bytes done. |
| [19069.812564] print_req_error: I/O error, dev sda, sector 0 |
| |
| The ftrace log is as follows: |
| rep-306069 [007] .... 19654.923315: block_bio_queue: 8,0 FWS 0 + 0 [rep] |
| rep-306069 [007] .... 19654.923333: block_getrq: 8,0 FWS 0 + 0 [rep] |
| kworker/7:1H-250 [007] .... 19654.923352: block_rq_issue: 8,0 FF 0 () 0 + 0 [kworker/7:1H] |
| <idle>-0 [007] ..s. 19654.923562: block_rq_complete: 8,0 FF () 18446744073709551615 + 0 [0] |
| <idle>-0 [007] d.s. 19654.923576: block_rq_complete: 8,0 WS () 0 + 0 [-5] |
| |
| Commit 8d6996630c03 introduced 'fq->rq_status', which is only updated when the |
| 'flush_rq' reference count isn't zero. If a flush request fails once, its error |
| code is recorded in 'fq->rq_status'. If nothing updates 'fq->rq_status' again, |
| every later fsync keeps failing with that stale error. |
| To address this issue, reset 'fq->rq_status' after returning the error code to |
| the upper layer. |
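
The sticky-status behaviour can be illustrated with a small user-space model. This is a sketch under stated assumptions, not kernel code: struct flush_queue_model, flush_end_io_model(), and the numeric status values are hypothetical stand-ins, and only the status hand-off at the top of flush_end_io() is modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-ins for the kernel's blk_status_t values (assumed for illustration). */
typedef int blk_status_t;
#define BLK_STS_OK    0
#define BLK_STS_IOERR 10

/* Hypothetical reduction of the flush queue to the one field that matters. */
struct flush_queue_model {
	blk_status_t rq_status;	/* error recorded by an earlier failed flush */
};

/*
 * Models the status hand-off in flush_end_io().
 * 'error' is the status of the flush that just completed.
 * 'reset_after_report' selects the patched behaviour: clear the recorded
 * status once it has been passed to the upper layer.
 */
static blk_status_t flush_end_io_model(struct flush_queue_model *fq,
					blk_status_t error,
					bool reset_after_report)
{
	assert(fq != NULL);

	if (fq->rq_status != BLK_STS_OK) {
		error = fq->rq_status;
		if (reset_after_report)
			fq->rq_status = BLK_STS_OK;
	}
	return error;
}
```

With reset_after_report set to false, a queue whose rq_status holds BLK_STS_IOERR reports that error for every subsequent successful flush. With it set to true, the error is reported once and the next successful flush returns BLK_STS_OK.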
| |
| Fixes: 8d6996630c03("block: fix null pointer dereference in blk_mq_rq_timed_out()") |
| Signed-off-by: Ye Bin <yebin10@huawei.com> |
| Reviewed-by: Ming Lei <ming.lei@redhat.com> |
| Link: https://lore.kernel.org/r/20211129012659.1553733-1-yebin10@huawei.com |
| Signed-off-by: Jens Axboe <axboe@kernel.dk> |
| [sudip: adjust context] |
| Signed-off-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com> |
| Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
| --- |
| block/blk-flush.c | 4 +++- |
| 1 file changed, 3 insertions(+), 1 deletion(-) |
| |
| --- a/block/blk-flush.c |
| +++ b/block/blk-flush.c |
| @@ -222,8 +222,10 @@ static void flush_end_io(struct request |
| return; |
| } |
| |
| - if (fq->rq_status != BLK_STS_OK) |
| + if (fq->rq_status != BLK_STS_OK) { |
| error = fq->rq_status; |
| + fq->rq_status = BLK_STS_OK; |
| + } |
| |
| hctx = flush_rq->mq_hctx; |
| if (!q->elevator) { |