| From foo@baz Sun Nov 22 12:00:04 PM CET 2020 |
| From: Ido Schimmel <idosch@nvidia.com> |
| Date: Tue, 17 Nov 2020 19:33:52 +0200 |
| Subject: mlxsw: core: Use variable timeout for EMAD retries |
| |
| From: Ido Schimmel <idosch@nvidia.com> |
| |
| [ Upstream commit 1f492eab67bced119a0ac7db75ef2047e29a30c6 ] |
| |
| The driver sends Ethernet Management Datagram (EMAD) packets to the |
| device for configuration purposes and waits for up to 200ms for a reply. |
| A request is retried up to 5 times. |
| |
| When the system is under heavy load, replies are not always processed in |
| time and EMAD transactions fail. |
| |
| Make the process more robust to such delays by using exponential |
| backoff. First wait for up to 200ms, then retransmit and wait for up to |
| 400ms and so on. |
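
As a rough illustration of the schedule described above, here is a minimal
userspace sketch in C. It is not the driver code: the helper names and the
two constants are hypothetical, and it assumes one wait per transmission
(the initial request plus up to 5 retries), with the wait doubling on each
retry in the same way as the "timeout << trans->retries" expression in the
patch. The driver itself works in jiffies; milliseconds are used here only
for readability.

```c
#include <stdint.h>

/* Hypothetical constants mirroring the figures in the commit message. */
#define EMAD_BASE_TIMEOUT_MS 200u
#define EMAD_MAX_RETRIES 5u

/* Wait budget after `retries` retransmissions: the base doubled each time. */
static uint32_t emad_timeout_ms(uint32_t base_ms, unsigned int retries)
{
	return base_ms << retries;
}

/*
 * Worst-case total wait, assuming one wait per transmission
 * (the initial request plus `max_retries` retransmissions).
 */
static uint32_t emad_total_wait_ms(uint32_t base_ms, unsigned int max_retries)
{
	uint32_t total = 0;

	for (unsigned int r = 0; r <= max_retries; r++)
		total += emad_timeout_ms(base_ms, r);
	return total;
}
```

Under these assumptions the successive waits are 200, 400, 800, 1600, 3200
and 6400 ms, so a request that never gets a reply is given up after roughly
12.6 seconds in total instead of 1.2 seconds with a fixed 200ms timeout.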
| |
| Fixes: caf7297e7ab5 ("mlxsw: core: Introduce support for asynchronous EMAD register access") |
| Reported-by: Denis Yulevich <denisyu@nvidia.com> |
| Tested-by: Denis Yulevich <denisyu@nvidia.com> |
| Signed-off-by: Ido Schimmel <idosch@nvidia.com> |
| Reviewed-by: Jiri Pirko <jiri@nvidia.com> |
| Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
| Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
| --- |
| drivers/net/ethernet/mellanox/mlxsw/core.c | 3 ++- |
| 1 file changed, 2 insertions(+), 1 deletion(-) |
| |
| --- a/drivers/net/ethernet/mellanox/mlxsw/core.c |
| +++ b/drivers/net/ethernet/mellanox/mlxsw/core.c |
| @@ -439,7 +439,8 @@ static void mlxsw_emad_trans_timeout_sch |
| if (trans->core->fw_flash_in_progress) |
| timeout = msecs_to_jiffies(MLXSW_EMAD_TIMEOUT_DURING_FW_FLASH_MS); |
| |
| - queue_delayed_work(trans->core->emad_wq, &trans->timeout_dw, timeout); |
| + queue_delayed_work(trans->core->emad_wq, &trans->timeout_dw, |
| + timeout << trans->retries); |
| } |
| |
| static int mlxsw_emad_transmit(struct mlxsw_core *mlxsw_core, |