| From foo@baz Sat Jul 28 10:25:26 CEST 2018 |
| From: Uma Krishnan <ukrishn@linux.vnet.ibm.com> |
| Date: Mon, 26 Mar 2018 11:35:27 -0500 |
| Subject: scsi: cxlflash: Synchronize reset and remove ops |
| |
| From: Uma Krishnan <ukrishn@linux.vnet.ibm.com> |
| |
| [ Upstream commit a3feb6ef50def7c91244d7bd15a3625b7b49b81f ] |
| |
| The following Oops can be encountered if a device removal or system shutdown |
| is initiated while an EEH recovery is in process: |
| |
| [c000000ff2f479c0] c008000015256f18 cxlflash_pci_slot_reset+0xa0/0x100 |
| [cxlflash] |
| [c000000ff2f47a30] c00800000dae22e0 cxl_pci_slot_reset+0x168/0x290 [cxl] |
| [c000000ff2f47ae0] c00000000003ef1c eeh_report_reset+0xec/0x170 |
| [c000000ff2f47b20] c00000000003d0b8 eeh_pe_dev_traverse+0x98/0x170 |
| [c000000ff2f47bb0] c00000000003f80c eeh_handle_normal_event+0x56c/0x580 |
| [c000000ff2f47c60] c00000000003fba4 eeh_handle_event+0x2a4/0x338 |
| [c000000ff2f47d10] c0000000000400b8 eeh_event_handler+0x1f8/0x200 |
| [c000000ff2f47dc0] c00000000013da48 kthread+0x1a8/0x1b0 |
| [c000000ff2f47e30] c00000000000b528 ret_from_kernel_thread+0x5c/0xb4 |
| |
| The remove handler frees AFU memory while the EEH recovery is in progress, |
| leading to a race condition. This can result in a crash if the recovery thread |
| tries to access this memory. |
| |
| To resolve this issue, the cxlflash remove handler will evaluate the device |
| state and yield to any active reset or probing threads. |
| |
| Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com> |
| Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> |
| Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> |
| Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> |
| Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
| --- |
| drivers/scsi/cxlflash/main.c | 6 +++--- |
| 1 file changed, 3 insertions(+), 3 deletions(-) |
| |
| --- a/drivers/scsi/cxlflash/main.c |
| +++ b/drivers/scsi/cxlflash/main.c |
| @@ -946,9 +946,9 @@ static void cxlflash_remove(struct pci_d |
| return; |
| } |
| |
| - /* If a Task Management Function is active, wait for it to complete |
| - * before continuing with remove. |
| - */ |
| + /* Yield to running recovery threads before continuing with remove */ |
| + wait_event(cfg->reset_waitq, cfg->state != STATE_RESET && |
| + cfg->state != STATE_PROBING); |
| spin_lock_irqsave(&cfg->tmf_slock, lock_flags); |
| if (cfg->tmf_active) |
| wait_event_interruptible_lock_irq(cfg->tmf_waitq, |