| From bippy-1.2.0 Mon Sep 17 00:00:00 2001 |
| From: Greg Kroah-Hartman <gregkh@kernel.org> |
| To: <linux-cve-announce@vger.kernel.org> |
| Reply-to: <cve@kernel.org>, <linux-kernel@vger.kernel.org> |
| Subject: CVE-2025-37988: fix a couple of races in MNT_TREE_BENEATH handling by do_move_mount() |
| |
| Description |
| =========== |
| |
| In the Linux kernel, the following vulnerability has been resolved: |
| |
| fix a couple of races in MNT_TREE_BENEATH handling by do_move_mount() |
| |
| Normally do_lock_mount(path, _) is locking a mountpoint pinned by |
| *path and at the time when matching unlock_mount() unlocks that |
| location it is still pinned by the same thing. |
| |
| Unfortunately, for 'beneath' case it's no longer that simple - |
| the object being locked is not the one *path points to. It's the |
| mountpoint of path->mnt. The thing is, without sufficient locking |
| ->mnt_parent may change under us and none of the locks are held |
| at that point. The rules are |
| * mount_lock stabilizes m->mnt_parent for any mount m. |
| * namespace_sem stabilizes m->mnt_parent, provided that |
| m is mounted. |
| * if either of the above holds and refcount of m is positive, |
| we are guaranteed the same for refcount of m->mnt_parent. |
| |
| namespace_sem nests inside inode_lock(), so do_lock_mount() has |
| to take inode_lock() before grabbing namespace_sem. It does |
| recheck that path->mnt is still mounted in the same place after |
| getting namespace_sem, and it does take care to pin the dentry. |
| This is needed, since otherwise we might end up with a racing mount --move |
| (or umount) happening while we were getting locks; in that case |
| dentry would no longer be a mountpoint and could've been evicted |
| on memory pressure along with its inode - not something you want |
| when grabbing lock on that inode. |
| |
| However, pinning a dentry is not enough - the matching mount is |
| also pinned only by the fact that path->mnt is mounted on top of it |
| and at that point we are not holding any locks whatsoever, so |
| the same kind of races could end up with all references to |
| that mount gone just as we are about to enter inode_lock(). |
| If that happens, we are left with filesystem being shut down while |
| we are holding a dentry reference on it; results are not pretty. |
| |
| What we need to do is grab both dentry and mount at the same time; |
| that makes inode_lock() safe *and* avoids the problem with fs getting |
| shut down under us. After taking namespace_sem we verify that |
| path->mnt is still mounted (which stabilizes its ->mnt_parent) and |
| check that it's still mounted at the same place. From that point |
| on to the matching namespace_unlock() we are guaranteed that |
| mount/dentry pair we'd grabbed are also pinned by being the mountpoint |
| of path->mnt, so we can quietly drop both the dentry reference (as |
| the current code does) and mnt one - it's OK to do under namespace_sem, |
| since we are not dropping the final refs. |
| |
| That solves the problem on do_lock_mount() side; unlock_mount() |
| also has one, since dentry is guaranteed to stay pinned only until |
| the namespace_unlock(). That's easy to fix - just have inode_unlock() |
| done earlier, while it's still pinned by mp->m_dentry. |
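
The ordering described above can be illustrated with a small userspace model. This is only a sketch under stated assumptions: every `toy_*` name is hypothetical, the structs are simplified stand-ins rather than the kernel's `struct mount` or `struct dentry`, and the "locks" are plain flags, so it shows the order of operations but not real concurrency. The `race` callback stands in for a `mount --move` that runs in the window where no locks are held.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy userspace stand-ins; these are NOT the kernel's types or functions. */
struct toy_dentry {
    int refs;            /* models the dentry reference count */
    bool inode_locked;   /* models inode_lock() on the dentry's inode */
};

struct toy_mount {
    int refs;                       /* models the mount reference count */
    struct toy_mount *parent;       /* models ->mnt_parent */
    struct toy_dentry *mountpoint;  /* models the mountpoint dentry */
};

static bool toy_namespace_sem;      /* models namespace_sem being held */

/* A second location, used only by the simulated race below. */
static struct toy_dentry toy_other_mp = { .refs = 1 };
static struct toy_mount toy_other_parent = { .refs = 1 };

/* Simulated racing "mount --move": the old location loses the pins m gave it. */
static void toy_racing_move(struct toy_mount *m)
{
    m->parent->refs--;
    m->mountpoint->refs--;
    m->parent = &toy_other_parent;
    m->mountpoint = &toy_other_mp;
    toy_other_parent.refs++;
    toy_other_mp.refs++;
}

/*
 * Lock the place m is mounted on. Returns true with both toy locks held,
 * or false (nothing held) if m moved meanwhile and the caller must retry.
 */
static bool toy_lock_beneath(struct toy_mount *m,
                             void (*race)(struct toy_mount *))
{
    /* Pin BOTH the parent mount and the mountpoint dentry up front. */
    struct toy_mount *parent = m->parent;
    struct toy_dentry *mp = m->mountpoint;
    bool same;

    parent->refs++;
    mp->refs++;

    if (race)
        race(m);                 /* window in which no locks are held */

    mp->inode_locked = true;     /* inode lock first ... */
    toy_namespace_sem = true;    /* ... namespace_sem nests inside it */

    /* Recheck under the locks: still mounted, and at the same place? */
    same = (m->parent == parent && m->mountpoint == mp);
    if (!same) {
        toy_namespace_sem = false;
        mp->inode_locked = false;
    }

    /*
     * On success m itself pins both objects, so these are not final refs.
     * On failure the locks were released above before the refs are dropped.
     */
    mp->refs--;
    parent->refs--;
    return same;
}

/* Unlock side: release the inode lock while the dentry is still pinned. */
static void toy_unlock(struct toy_dentry *mp)
{
    mp->inode_locked = false;
    toy_namespace_sem = false;
}
```

In this model, a call with `toy_racing_move` as the callback returns false and the old objects' counts reach zero only after both toy locks have been released. Without the two extra pins the counts would already have hit zero before the inode lock was taken, which is the situation the fix is meant to rule out.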
| |
| The Linux kernel CVE team has assigned CVE-2025-37988 to this issue. |
| |
| |
| Affected and fixed versions |
| =========================== |
| |
| Issue introduced in 6.5 with commit 6ac392815628f317fcfdca1a39df00b9cc4ebc8b and fixed in 6.6.89 with commit 4f435c1f4c48ff84968e2d9159f6fa41f46cf998 |
| Issue introduced in 6.5 with commit 6ac392815628f317fcfdca1a39df00b9cc4ebc8b and fixed in 6.12.26 with commit a61afd54826ac24c2c93845c4f441dbc344875b1 |
| Issue introduced in 6.5 with commit 6ac392815628f317fcfdca1a39df00b9cc4ebc8b and fixed in 6.14.5 with commit d4b21e8cd3d7efa2deb9cff534f0133e84f35086 |
| Issue introduced in 6.5 with commit 6ac392815628f317fcfdca1a39df00b9cc4ebc8b and fixed in 6.15 with commit 0d039eac6e5950f9d1ecc9e410c2fd1feaeab3b6 |
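
As a rough aid for reading the list above, the version ranges can be encoded in a small check. This is a sketch under stated assumptions: the function name is hypothetical, it only understands plain upstream `major.minor.patch` numbers, and it treats every release from 6.5 up to (but not including) 6.15 as affected unless it is in one of the three stable series listed and at or past that series' fixed release. It knows nothing about vendor kernels that carry backported fixes, or about fixes backported to other series after this announcement.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: returns true if an upstream kernel version falls in
 * the affected ranges listed in this announcement for CVE-2025-37988.
 */
static bool cve_2025_37988_affected(int major, int minor, int patch)
{
    /* Fixed stable releases named in the announcement: {major, minor, patch}. */
    static const int stable_fixes[][3] = {
        { 6, 6, 89 },
        { 6, 12, 26 },
        { 6, 14, 5 },
    };
    size_t i;

    /* Introduced in 6.5: anything older is not affected. */
    if (major < 6 || (major == 6 && minor < 5))
        return false;

    /* Fixed in mainline 6.15: that release and anything newer is fixed. */
    if (major > 6 || (major == 6 && minor >= 15))
        return false;

    /* Within a listed stable series, check against its fixed release. */
    for (i = 0; i < sizeof(stable_fixes) / sizeof(stable_fixes[0]); i++) {
        if (major == stable_fixes[i][0] && minor == stable_fixes[i][1])
            return patch < stable_fixes[i][2];
    }

    /* 6.5 <= version < 6.15 with no fixed release listed for the series. */
    return true;
}
```

For example, `cve_2025_37988_affected(6, 6, 88)` is true while `cve_2025_37988_affected(6, 6, 89)` is false. A result from a check like this is only a first pass; the CVE record remains the authoritative source.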
| |
| Please see https://www.kernel.org for a full list of kernel versions |
| currently supported by the kernel community. |
| |
| Unaffected versions might change over time as fixes are backported to |
| older supported kernel versions. The official CVE entry at |
| https://cve.org/CVERecord/?id=CVE-2025-37988 |
| will be updated if fixes are backported; please check it for the most |
| up-to-date information about this issue. |
| |
| |
| Affected files |
| ============== |
| |
| The file(s) affected by this issue are: |
| fs/namespace.c |
| |
| |
| Mitigation |
| ========== |
| |
| The Linux kernel CVE team recommends that you update to the latest |
| stable kernel version for this, and many other bugfixes. Individual |
| changes are never tested alone, but rather are part of a larger kernel |
| release. Cherry-picking individual commits is not recommended or |
| supported by the Linux kernel community at all. If, however, updating to |
| the latest release is impossible, the individual changes to resolve this |
| issue can be found at these commits: |
| https://git.kernel.org/stable/c/4f435c1f4c48ff84968e2d9159f6fa41f46cf998 |
| https://git.kernel.org/stable/c/a61afd54826ac24c2c93845c4f441dbc344875b1 |
| https://git.kernel.org/stable/c/d4b21e8cd3d7efa2deb9cff534f0133e84f35086 |
| https://git.kernel.org/stable/c/0d039eac6e5950f9d1ecc9e410c2fd1feaeab3b6 |