The vulnerability is in the mlx5e driver, specifically in the `mlx5e_tx_reporter_dump_sq` function. This function casts its `void *` argument to `struct mlx5e_txqsq *`, but in the TX timeout recovery flow the argument is actually of type `struct mlx5e_tx_timeout_ctx *`. This mismatch can lead to a kernel stack overflow and eventually cause a fatal exception.
Because the dump callback receives an opaque `void *`, the compiler cannot detect the mismatch. When `mlx5e_tx_reporter_dump_sq` runs in the TX timeout recovery flow, it treats the `struct mlx5e_tx_timeout_ctx` it was given as if it were a `struct mlx5e_txqsq`, so every field it reads comes from unrelated memory. Acting on those bogus values is what leads to the kernel stack overflow.
The fix adds a wrapper function that extracts the `sq` from the `struct mlx5e_tx_timeout_ctx` and passes it to `mlx5e_tx_reporter_dump_sq`. The wrapper, rather than `mlx5e_tx_reporter_dump_sq` itself, is now set as the dump callback in the TX timeout recovery flow, so `mlx5e_tx_reporter_dump_sq` always receives a pointer of the type it expects.
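The wrapper pattern described above can be sketched in user-space C. This is a simplified illustration, not the driver code: the struct layouts, the field names `sqn` and `status`, the callback type `dump_cb`, and the functions `dump_sq`, `timeout_dump`, `run_dump`, and `dump_via_timeout_flow` are all hypothetical stand-ins chosen for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-ins for the driver structures. */
struct txqsq {
    int sqn;                    /* send queue number (assumed field) */
};

struct tx_timeout_ctx {
    struct txqsq *sq;           /* the queue that timed out */
    int status;                 /* assumed extra field */
};

/* Generic dump callback: the context is passed as an opaque pointer,
 * so the compiler cannot check which struct it really points to. */
typedef int (*dump_cb)(void *ctx);

/* Dump routine that assumes ctx points to a struct txqsq. */
static int dump_sq(void *ctx)
{
    struct txqsq *sq = ctx;
    return sq->sqn;
}

/* Wrapper for the timeout flow: ctx is a struct tx_timeout_ctx, so the
 * queue is extracted first and only then handed to dump_sq(). */
static int timeout_dump(void *ctx)
{
    struct tx_timeout_ctx *to_ctx = ctx;

    assert(to_ctx != NULL && to_ctx->sq != NULL);
    return dump_sq(to_ctx->sq);
}

/* Stand-in for the framework that invokes the registered callback. */
static int run_dump(dump_cb cb, void *ctx)
{
    return cb(ctx);
}

/* Timeout recovery flow: registers the wrapper, not dump_sq() itself.
 * Passing dump_sq here instead would reinterpret to_ctx as a txqsq. */
static int dump_via_timeout_flow(int sqn)
{
    struct txqsq sq = { .sqn = sqn };
    struct tx_timeout_ctx to_ctx = { .sq = &sq, .status = 0 };

    return run_dump(timeout_dump, &to_ctx);
}
```

In this sketch, registering `dump_sq` directly for the timeout flow would compile without warnings, which is why this kind of bug is easy to miss with `void *` callbacks; giving each flow a callback matched to its own context type keeps the cast and the real argument type in agreement.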
The vulnerability was introduced in kernel version 5.7 with commit 5f29458b77d5 and has been fixed in versions 5.10.90, 5.15.13, and 5.16 with commits 73665165b64a, 07f13d58a8ec, and 918fc3855a65 respectively.