| From bippy-5f407fcff5a0 Mon Sep 17 00:00:00 2001 |
| From: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
| To: <linux-cve-announce@vger.kernel.org> |
| Reply-to: <cve@kernel.org>, <linux-kernel@vger.kernel.org> |
| Subject: CVE-2024-43880: mlxsw: spectrum_acl_erp: Fix object nesting warning |
| |
| Description |
| =========== |
| |
| In the Linux kernel, the following vulnerability has been resolved: |
| |
| mlxsw: spectrum_acl_erp: Fix object nesting warning |
| |
| ACLs in Spectrum-2 and newer ASICs can reside in the algorithmic TCAM |
| (A-TCAM) or in the ordinary circuit TCAM (C-TCAM). The former can |
| contain more ACLs (i.e., tc filters), but the number of masks in each |
| region (i.e., tc chain) is limited. |
| |
| In order to mitigate the effects of the above limitation, the device |
| allows filters to share a single mask if their masks only differ in up |
| to 8 consecutive bits. For example, dst_ip/25 can be represented using |
| dst_ip/24 with a delta of 1 bit. The C-TCAM does not have a limit on the |
| number of masks being used (and therefore does not support mask |
| aggregation), but can contain a limited number of filters. |
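| |
| The mask-sharing rule above can be sketched in a few lines of C. This is |
| a simplified illustration on 32-bit IPv4 masks only; the function name |
| mask_delta_fits() and its parameters are hypothetical and are not taken |
| from the mlxsw driver, which works on wider keys. |

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a driver function): returns true when `child`
 * equals `parent` plus extra mask bits that all fall inside a window of
 * at most `max_bits` consecutive bit positions.
 */
bool mask_delta_fits(uint32_t parent, uint32_t child, unsigned int max_bits)
{
	uint32_t delta;
	unsigned int lo = 0, hi = 31;

	assert(max_bits >= 1 && max_bits <= 32);

	/* Every bit set in the parent mask must also be set in the child. */
	if (parent & ~child)
		return false;

	delta = parent ^ child;
	if (!delta)
		return false;	/* identical masks are not a delta relation */

	/* Find the lowest and highest bit positions set in the delta. */
	while (!(delta & (1u << lo)))
		lo++;
	while (!(delta & (1u << hi)))
		hi--;

	return hi - lo + 1 <= max_bits;
}
```

| With max_bits set to 8, a /25 mask (0xFFFFFF80) fits on top of a /24 |
| mask (0xFFFFFF00) because the delta is the single bit 0x80, whereas a |
| /25 mask does not fit on top of a /16 mask because the delta spans 9 |
| bit positions. |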
| |
| The driver uses the "objagg" library to perform the mask aggregation by |
| passing it objects that consist of the filter's mask and whether the |
| filter is to be inserted into the A-TCAM or the C-TCAM since filters in |
| different TCAMs cannot share a mask. |
| |
| The set of created objects is dependent on the insertion order of the |
| filters and is not necessarily optimal. Therefore, the driver will |
| periodically ask the library to compute a more optimal set ("hints") by |
| looking at all the existing objects. |
| |
| When the library asks the driver whether two objects can be aggregated, |
| the driver only compares the provided masks and ignores the A-TCAM / |
| C-TCAM indication. This is the right thing to do since the goal is to |
| move as many filters as possible to the A-TCAM. The driver also forbids |
| two identical masks from being aggregated since this can only happen if |
| one was intentionally put in the C-TCAM to avoid a conflict in the |
| A-TCAM. |
| |
| The above can result in the following set of hints: |
| |
| H1: {mask X, A-TCAM} -> H2: {mask Y, A-TCAM} // X is Y + delta |
| H3: {mask Y, C-TCAM} -> H4: {mask Z, A-TCAM} // Y is Z + delta |
| |
| After getting the hints from the library, the driver will start migrating |
| filters from one region to another while consulting the computed hints |
| and instructing the device to perform a lookup in both regions during |
| the transition. |
| |
| Assuming a filter with mask X is being migrated into the A-TCAM in the |
| new region, the hints lookup will return H1. Since H2 is the parent of |
| H1, the library will try to find the object associated with it and |
| create it if necessary, in which case another (recursive) hints lookup |
| will be performed. This hints lookup for {mask Y, A-TCAM} will return |
| either H2 or H3 since the driver passes the library an object comparison |
| function that ignores the A-TCAM / C-TCAM indication. |
| |
| This can eventually lead to nested objects which are not supported by |
| the library [1]. |
| |
| Fix by removing the object comparison function from both the driver and |
| the library as the driver was the only user. That way the lookup will |
| only return exact matches. |
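| |
| The difference between the two lookup behaviours can be modelled with a |
| small, self-contained C sketch. Everything here is an assumption made |
| for illustration: the struct layout, the linear scan, the mask values |
| and all names are hypothetical and are not taken from lib/objagg.c or |
| the driver. |

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum tcam { A_TCAM, C_TCAM };

/* Illustrative object key: a mask plus the TCAM indication. */
struct obj_key {
	uint32_t mask;
	enum tcam tcam;
};

struct hint {
	struct obj_key key;
	int parent;		/* index of the parent hint, or -1 for a root */
};

/* Example masks chosen so that X is Y + delta and Y is Z + delta. */
#define MASK_X 0xFFFFFF80u
#define MASK_Y 0xFFFFFF00u
#define MASK_Z 0xFFFF0000u

/* The hint set from the description; array order is arbitrary. */
const struct hint example_hints[4] = {
	{ { MASK_X, A_TCAM },  2 },	/* H1 -> H2 */
	{ { MASK_Y, C_TCAM },  3 },	/* H3 -> H4 */
	{ { MASK_Y, A_TCAM }, -1 },	/* H2 (root) */
	{ { MASK_Z, A_TCAM }, -1 },	/* H4 (root) */
};

/* Pre-fix style comparison: the TCAM indication is ignored. */
bool key_match_loose(const struct obj_key *a, const struct obj_key *b)
{
	return a->mask == b->mask;
}

/* Post-fix style comparison: only exact matches are accepted. */
bool key_match_exact(const struct obj_key *a, const struct obj_key *b)
{
	return a->mask == b->mask && a->tcam == b->tcam;
}

typedef bool (*key_match_fn)(const struct obj_key *, const struct obj_key *);

/* Return the index of the first hint matching `key`, or -1 if none does. */
int hint_lookup(const struct hint *hints, size_t n,
		const struct obj_key *key, key_match_fn match)
{
	assert(hints && key && match);

	for (size_t i = 0; i < n; i++)
		if (match(&hints[i].key, key))
			return (int)i;
	return -1;
}
```

| In this model, looking up {MASK_Y, A_TCAM} with key_match_loose() returns |
| index 1 (H3), which itself has a parent, so the object created for mask |
| X would end up two levels deep. The same lookup with key_match_exact() |
| returns index 2 (H2), which is a root, so no nesting occurs. |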
| |
| I do not have a reliable reproducer that can reproduce the issue in a |
| timely manner, but before the fix the issue would reproduce in several |
| minutes and with the fix it does not reproduce in over an hour. |
| |
| Note that the current usefulness of the hints is limited because they |
| include the C-TCAM indication and represent aggregation that cannot |
| actually happen. This will be addressed in net-next. |
| |
| [1] |
| WARNING: CPU: 0 PID: 153 at lib/objagg.c:170 objagg_obj_parent_assign+0xb5/0xd0 |
| Modules linked in: |
| CPU: 0 PID: 153 Comm: kworker/0:18 Not tainted 6.9.0-rc6-custom-g70fbc2c1c38b #42 |
| Hardware name: Mellanox Technologies Ltd. MSN3700C/VMOD0008, BIOS 5.11 10/10/2018 |
| Workqueue: mlxsw_core mlxsw_sp_acl_tcam_vregion_rehash_work |
| RIP: 0010:objagg_obj_parent_assign+0xb5/0xd0 |
| [...] |
| Call Trace: |
| <TASK> |
| __objagg_obj_get+0x2bb/0x580 |
| objagg_obj_get+0xe/0x80 |
| mlxsw_sp_acl_erp_mask_get+0xb5/0xf0 |
| mlxsw_sp_acl_atcam_entry_add+0xe8/0x3c0 |
| mlxsw_sp_acl_tcam_entry_create+0x5e/0xa0 |
| mlxsw_sp_acl_tcam_vchunk_migrate_one+0x16b/0x270 |
| mlxsw_sp_acl_tcam_vregion_rehash_work+0xbe/0x510 |
| process_one_work+0x151/0x370 |
| |
| The Linux kernel CVE team has assigned CVE-2024-43880 to this issue. |
| |
| |
| Affected and fixed versions |
| =========================== |
| |
| Issue introduced in 5.1 with commit 9069a3817d82b01b3a55da382c774e3575946130 and fixed in 5.4.282 with commit 4dc09f6f260db3c4565a4ec52ba369393598f2fb |
| Issue introduced in 5.1 with commit 9069a3817d82b01b3a55da382c774e3575946130 and fixed in 5.10.224 with commit 36a9996e020dd5aa325e0ecc55eb2328288ea6bb |
| Issue introduced in 5.1 with commit 9069a3817d82b01b3a55da382c774e3575946130 and fixed in 5.15.165 with commit 9a5261a984bba4f583d966c550fa72c33ff3714e |
| Issue introduced in 5.1 with commit 9069a3817d82b01b3a55da382c774e3575946130 and fixed in 6.1.103 with commit 25c6fd9648ad05da493a5d30881896a78a08b624 |
| Issue introduced in 5.1 with commit 9069a3817d82b01b3a55da382c774e3575946130 and fixed in 6.6.44 with commit 0e59c2d22853266704e127915653598f7f104037 |
| Issue introduced in 5.1 with commit 9069a3817d82b01b3a55da382c774e3575946130 and fixed in 6.10.3 with commit fb5d4fc578e655d113f09565f6f047e15f7ab578 |
| Issue introduced in 5.1 with commit 9069a3817d82b01b3a55da382c774e3575946130 and fixed in 6.11 with commit 97d833ceb27dc19f8777d63f90be4a27b5daeedf |
| |
| Please see https://www.kernel.org for a full list of kernel versions |
| currently supported by the kernel community. |
| |
| Unaffected versions might change over time as fixes are backported to |
| older supported kernel versions. The official CVE entry at |
| https://cve.org/CVERecord/?id=CVE-2024-43880 |
| will be updated if fixes are backported; please check it for the most |
| up-to-date information about this issue. |
| |
| |
| Affected files |
| ============== |
| |
| The file(s) affected by this issue are: |
| drivers/net/ethernet/mellanox/mlxsw/spectrum_acl_erp.c |
| include/linux/objagg.h |
| lib/objagg.c |
| |
| |
| Mitigation |
| ========== |
| |
| The Linux kernel CVE team recommends that you update to the latest |
| stable kernel version for this and many other bug fixes. Individual |
| changes are never tested alone, but rather are part of a larger kernel |
| release. Cherry-picking individual commits is not recommended or |
| supported by the Linux kernel community at all. If, however, updating to |
| the latest release is impossible, the individual changes to resolve this |
| issue can be found at these commits: |
| https://git.kernel.org/stable/c/4dc09f6f260db3c4565a4ec52ba369393598f2fb |
| https://git.kernel.org/stable/c/36a9996e020dd5aa325e0ecc55eb2328288ea6bb |
| https://git.kernel.org/stable/c/9a5261a984bba4f583d966c550fa72c33ff3714e |
| https://git.kernel.org/stable/c/25c6fd9648ad05da493a5d30881896a78a08b624 |
| https://git.kernel.org/stable/c/0e59c2d22853266704e127915653598f7f104037 |
| https://git.kernel.org/stable/c/fb5d4fc578e655d113f09565f6f047e15f7ab578 |
| https://git.kernel.org/stable/c/97d833ceb27dc19f8777d63f90be4a27b5daeedf |