| From 860fde32e5ff083a98b0f871310648d97f5019fb Mon Sep 17 00:00:00 2001 |
| From: Sreekanth Reddy <sreekanth.reddy@broadcom.com> |
| Date: Fri, 27 Mar 2020 05:52:43 -0400 |
| Subject: [PATCH] scsi: mpt3sas: Fix kernel panic observed on soft HBA unplug |
| |
| commit cc41f11a21a51d6869d71e525a7264c748d7c0d7 upstream. |
| |
| A general protection fault kernel panic is observed when the user performs |
| a soft (ordered) HBA unplug operation while IOs are running on drives |
| connected to the HBA. |
| |
| When the user performs an ordered HBA removal, the kernel calls the PCI |
| device's .remove() callback, where the driver flushes out all the |
| outstanding SCSI IO commands with the DID_NO_CONNECT host byte and also |
| unmaps the sg buffers allocated for these IO commands. |
| |
| However, in the ordered HBA removal case (unlike a real HBA hot removal), |
| the HBA device is still alive, so the HBA hardware keeps performing DMA |
| operations on buffers in system memory that were already unmapped while |
| flushing out the outstanding SCSI IO commands. This leads to the kernel |
| panic. |
| |
| Don't flush out the outstanding IOs from the .remove() path in case of an |
| ordered removal, since the HBA is still alive in this case and can |
| complete the outstanding IOs. Flush out the outstanding IOs only in case |
| of a 'physical HBA hot unplug', where there won't be any communication |
| with the HBA. |
| |
| During shutdown it is also possible for the HBA hardware to perform DMA |
| operations on outstanding IO buffers that the driver has completed with |
| DID_NO_CONNECT from .shutdown(). So the same fix is applied in the |
| shutdown path as well. |
| |
| It is safe to drop the outstanding commands when the HBA is inaccessible, |
| such as when a permanent PCI failure happens, when the HBA is in a |
| non-operational state, or when someone does a real HBA hot unplug |
| operation. Since the driver knows that the HBA is inaccessible in these |
| cases, it can drop the outstanding commands instead of waiting for SCSI |
| error recovery to kick in and clear them. |
| |
| Link: https://lore.kernel.org/r/1585302763-23007-1-git-send-email-sreekanth.reddy@broadcom.com |
| Fixes: c666d3be99c0 ("scsi: mpt3sas: wait for and flush running commands on shutdown/unload") |
| Cc: stable@vger.kernel.org #v4.14.174+ |
| Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> |
| Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> |
| Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> |
| |
| diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c |
| index 1ccfbc7eebe0..4c5a8e460b50 100644 |
| --- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c |
| +++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c |
| @@ -9673,8 +9673,8 @@ static void scsih_remove(struct pci_dev *pdev) |
| |
| ioc->remove_host = 1; |
| |
| - mpt3sas_wait_for_commands_to_complete(ioc); |
| - _scsih_flush_running_cmds(ioc); |
| + if (!pci_device_is_present(pdev)) |
| + _scsih_flush_running_cmds(ioc); |
| |
| _scsih_fw_event_cleanup_queue(ioc); |
| |
| @@ -9750,8 +9750,8 @@ scsih_shutdown(struct pci_dev *pdev) |
| |
| ioc->remove_host = 1; |
| |
| - mpt3sas_wait_for_commands_to_complete(ioc); |
| - _scsih_flush_running_cmds(ioc); |
| + if (!pci_device_is_present(pdev)) |
| + _scsih_flush_running_cmds(ioc); |
| |
| _scsih_fw_event_cleanup_queue(ioc); |
| |
| -- |
| 2.7.4 |
| |