| From 84c6ebc6e4ef7d4033a816d8368786c884d253d1 Mon Sep 17 00:00:00 2001 |
| From: Sasha Levin <sashal@kernel.org> |
| Date: Fri, 14 Aug 2020 18:17:08 +0300 |
| Subject: ath10k: start recovery process when payload length exceeds max htc |
| length for sdio |
| |
| From: Wen Gong <wgong@codeaurora.org> |
| |
| [ Upstream commit 2fd3c8f34d08af0a6236085f9961866ad92ef9ec ] |
| |
| When random transfer failures are simulated for SDIO writes and reads, |
| the driver sometimes logs "payload length exceeds max htc length" and |
| only starts recovery much later. |
| |
| Test steps: |
| 1. Enable these config options and rebuild the kernel: |
| CONFIG_FAIL_MMC_REQUEST=y |
| CONFIG_FAULT_INJECTION=y |
| CONFIG_FAULT_INJECTION_DEBUG_FS=y |
| |
| 2. Enable the simulated failures: |
| cd /sys/kernel/debug/mmc1/fail_mmc_request |
| echo 10 > probability |
| echo 10 > times # repeat until hitting issues |
| |
| 3. The "payload length exceeds max htc length" error appears: |
| [ 199.935506] ath10k_sdio mmc1:0001:1: payload length 57005 exceeds max htc length: 4088 |
| .... |
| [ 264.990191] ath10k_sdio mmc1:0001:1: payload length 57005 exceeds max htc length: 4088 |
| |
| 4. After some time, for example 60 seconds, recovery starts, triggered |
| by a WMI command timeout for the periodic scan: |
| [ 269.229232] ieee80211 phy0: Hardware restart was requested |
| [ 269.734693] ath10k_sdio mmc1:0001:1: device successfully recovered |
| |
| The simulated SDIO failure is not a real SDIO transfer failure: it only |
| sets an error status in mmc_should_fail_request() after the transfer has |
| ended, so the transfer itself succeeds. sdio_io_rw_ext_helper() then |
| returns the error status and stops transferring the remaining data. For |
| example, if the real RX length is 286 bytes, sdio_io_rw_ext_helper() |
| splits it into two parts, one of 256 bytes and one of 30 bytes. If the |
| first 256 bytes get an error status from mmc_should_fail_request(), the |
| remaining 30 bytes are not read in this RX operation. When the next RX |
| arrives, those 30 bytes are treated as the header of the new read, and |
| their first 4 bytes are treated as the lookahead. These 4 bytes are not |
| a real lookahead, so the length derived from them is wrong and sometimes |
| exceeds the max HTC length of 4088. When that happens, the buffer chain |
| no longer matches between firmware and ath10k, and recovery needs to |
| start as soon as possible. Currently recovery is only started by a WMI |
| command timeout, which comes much later, for example 60+ seconds later |
| when it is triggered by the periodic scan, and even later if there is |
| no periodic scan. |
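
The arithmetic in the paragraph above can be sketched in a few lines of standalone C. This is an illustration only, not driver code. The helper names are hypothetical, and the lookahead layout (endpoint id, flags, then a little-endian 16-bit length) is an assumption about the first 4 bytes of the HTC header. The stale bytes used in the usage note are chosen so that the decoded length matches the 57005 (0xDEAD) seen in the log; the actual bytes on the device are not known from this message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BLOCK_SIZE      256u   /* block size from the 286-byte example */
#define MAX_HTC_PAYLOAD 4088u  /* limit printed in the log message */

/*
 * Bytes left unread when the transfer is aborted after `blocks_done`
 * full blocks have completed (hypothetical helper).
 */
static size_t leftover_after_failure(size_t rx_len, size_t blocks_done)
{
	size_t consumed = blocks_done * BLOCK_SIZE;

	assert(consumed <= rx_len);
	return rx_len - consumed;
}

/*
 * Decode the payload length from a 4-byte lookahead.
 * Assumed layout: la[0] = eid, la[1] = flags, la[2..3] = LE16 length.
 */
static uint16_t lookahead_payload_len(const uint8_t la[4])
{
	return (uint16_t)(la[2] | (la[3] << 8));
}

/* Mirrors the check that prints "exceeds max htc length". */
static int needs_recovery(uint16_t payload_len)
{
	return payload_len > MAX_HTC_PAYLOAD;
}
```

For instance, leftover_after_failure(286, 1) gives 30, and a stale lookahead of {0x00, 0x00, 0xad, 0xde} decodes to 57005, which needs_recovery() flags because it is above 4088.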
| |
| It is reasonable to start recovery as soon as "payload length exceeds |
| max htc length" occurs. |
| |
| This patch only affects SDIO chips. |
| |
| Tested with QCA6174 SDIO with firmware WLAN.RMH.4.4.1-00029. |
| |
| Signed-off-by: Wen Gong <wgong@codeaurora.org> |
| Signed-off-by: Kalle Valo <kvalo@codeaurora.org> |
| Link: https://lore.kernel.org/r/20200108031957.22308-3-wgong@codeaurora.org |
| Signed-off-by: Sasha Levin <sashal@kernel.org> |
| --- |
| drivers/net/wireless/ath/ath10k/sdio.c | 4 ++++ |
| 1 file changed, 4 insertions(+) |
| |
| diff --git a/drivers/net/wireless/ath/ath10k/sdio.c b/drivers/net/wireless/ath/ath10k/sdio.c |
| index 63f882c690bff..0841e69b10b1a 100644 |
| --- a/drivers/net/wireless/ath/ath10k/sdio.c |
| +++ b/drivers/net/wireless/ath/ath10k/sdio.c |
| @@ -557,6 +557,10 @@ static int ath10k_sdio_mbox_rx_alloc(struct ath10k *ar, |
| le16_to_cpu(htc_hdr->len), |
| ATH10K_HTC_MBOX_MAX_PAYLOAD_LENGTH); |
| ret = -ENOMEM; |
| + |
| + queue_work(ar->workqueue, &ar->restart_work); |
| + ath10k_warn(ar, "exceeds length, start recovery\n"); |
| + |
| goto err; |
| } |
| |
| -- |
| 2.27.0 |
| |