| From bippy-5f407fcff5a0 Mon Sep 17 00:00:00 2001 |
| From: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
| To: <linux-cve-announce@vger.kernel.org> |
| Reply-to: <cve@kernel.org>, <linux-kernel@vger.kernel.org> |
| Subject: CVE-2024-35970: af_unix: Clear stale u->oob_skb. |
| |
| Description |
| =========== |
| |
| In the Linux kernel, the following vulnerability has been resolved: |
| |
| af_unix: Clear stale u->oob_skb. |
| |
| syzkaller started to report a deadlock on unix_gc_lock after commit |
| 4090fa373f0e ("af_unix: Replace garbage collection algorithm."), but |
| that commit merely exposed a bug that has been there since commit |
| 314001f0bf92 ("af_unix: Add OOB support"). |
| |
| The repro basically does the following. |
| |
| from socket import * |
| from array import array |
| |
| c1, c2 = socketpair(AF_UNIX, SOCK_STREAM) |
| c1.sendmsg([b'a'], [(SOL_SOCKET, SCM_RIGHTS, array("i", [c2.fileno()]))], MSG_OOB) |
| c2.recv(1) # blocked as no normal data in recv queue |
| |
| c2.close() # done async and unblock recv() |
| c1.close() # done async and trigger GC |
| |
| A socket sends its file descriptor to itself as OOB data and tries to |
| receive normal data, but recv() eventually fails due to the async close(). |
| |
| The problem here is incorrect handling of the OOB skb in manage_oob(). |
| When recvmsg() is called without MSG_OOB, manage_oob() is called to |
| check whether the peeked skb is the OOB skb. If it is, manage_oob() pops |
| it out of the receive queue but does not clear unix_sk(sk)->oob_skb. |
| This is wrong in terms of uAPI. |
| |
| Let's say we send "hello" with MSG_OOB, and "world" without MSG_OOB. |
| The 'o' is handled as OOB data. When recv() is called twice without |
| MSG_OOB, the OOB data should be lost. |
| |
| >>> from socket import * |
| >>> c1, c2 = socketpair(AF_UNIX, SOCK_STREAM, 0) |
| >>> c1.send(b'hello', MSG_OOB) # 'o' is OOB data |
| 5 |
| >>> c1.send(b'world') |
| 5 |
| >>> c2.recv(5) # OOB data is not received |
| b'hell' |
| >>> c2.recv(5) # OOB data is skipped |
| b'world' |
| >>> c2.recv(5, MSG_OOB) # This should return an error |
| b'o' |
| |
| In the same situation, TCP actually returns -EINVAL for the last |
| recv(). |
| |
| Also, if we do not clear unix_sk(sk)->oob_skb, unix_poll() always sets |
| EPOLLPRI even though the data has been passed over by a previous recv(). |
| |
| To avoid these issues, we must clear unix_sk(sk)->oob_skb when dequeuing |
| it from the receive queue. |
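The two user-visible symptoms described above (a stale byte returned by
recv(MSG_OOB) and a lingering EPOLLPRI) can be probed from user space. The
following is a minimal sketch, not part of the original commit: the helper
name probe_stale_oob() is hypothetical, it assumes a kernel built with
AF_UNIX OOB support, and it only reports what it observes rather than
asserting a verdict. Per the description, a kernel without the fix would be
expected to hand back the 'o' byte and keep reporting priority data.

```python
import select
import socket


def probe_stale_oob():
    """Replay the "hello"/"world" sequence and report what the kernel does.

    Hypothetical helper: returns a dict whose values are bytes, a bool,
    or an error string. Nothing here blocks, so it is safe on any kernel.
    """
    c1, c2 = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    result = {}
    try:
        # Non-blocking mode: recv() returns or raises immediately.
        c2.setblocking(False)
        try:
            c1.send(b"hello", socket.MSG_OOB)  # 'o' becomes the OOB byte
        except OSError as exc:
            # e.g. the kernel was built without AF_UNIX OOB support
            result["error"] = f"MSG_OOB send failed: {exc}"
            return result
        c1.send(b"world")

        # Two plain reads; the OOB byte should be skipped between them.
        result["first"] = c2.recv(5)
        result["second"] = c2.recv(5)

        # Is priority data still advertised after it was skipped?
        poller = select.poll()
        poller.register(c2, select.POLLPRI)
        events = poller.poll(0)
        result["pollpri"] = any(ev & select.POLLPRI for _, ev in events)

        # Does recv(MSG_OOB) still return the skipped byte?
        try:
            result["oob"] = c2.recv(5, socket.MSG_OOB)
        except OSError as exc:
            result["oob"] = f"error: {exc}"
    except OSError as exc:
        result["error"] = str(exc)
    finally:
        c1.close()
        c2.close()
    return result
```

Calling print(probe_stale_oob()) on a test machine shows the "oob" and
"pollpri" entries side by side; an error string under "oob" together with
"pollpri" being False matches the behaviour the fix is meant to produce.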
| |
| The old GC did not trigger the deadlock because it relied on the |
| receive queue to detect the loop. |
| |
| When it is triggered, the socket with OOB data is marked as a GC candidate |
| because file refcount == inflight count (1). However, after traversing |
| all inflight sockets, the socket still has a positive inflight count (1), |
| thus the socket is excluded from candidates. Then, the old GC loses the |
| chance to garbage-collect the socket. |
| |
| With the old GC, the repro continues to create true garbage that will |
| never be freed or detected by kmemleak, as it's linked to the global |
| inflight list. That's why we couldn't even notice the issue. |
| |
| The Linux kernel CVE team has assigned CVE-2024-35970 to this issue. |
| |
| |
| Affected and fixed versions |
| =========================== |
| |
| Issue introduced in 5.15 with commit 314001f0bf927015e459c9d387d62a231fe93af3 and fixed in 5.15.156 with commit b4bc99d04c689b5652665394ae8d3e02fb754153 |
| Issue introduced in 5.15 with commit 314001f0bf927015e459c9d387d62a231fe93af3 and fixed in 6.1.87 with commit 84a352b7eba1142a95441380058985ff19f25ec9 |
| Issue introduced in 5.15 with commit 314001f0bf927015e459c9d387d62a231fe93af3 and fixed in 6.6.28 with commit 601a89ea24d05089debfa2dc896ea9f5937ac7a6 |
| Issue introduced in 5.15 with commit 314001f0bf927015e459c9d387d62a231fe93af3 and fixed in 6.8.7 with commit 698a95ade1a00e6494482046902b986dfffd1caf |
| Issue introduced in 5.15 with commit 314001f0bf927015e459c9d387d62a231fe93af3 and fixed in 6.9 with commit b46f4eaa4f0ec38909fb0072eea3aeddb32f954e |
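The version list above can be turned into a quick triage check. This is a
minimal sketch under stated assumptions: the function and table names are
hypothetical, the input is an upstream-style release string (as printed by
uname -r), and vendor kernels that backport fixes without changing the
version number will be misclassified. It also treats every 6.9 string as
fixed, which may not hold for early 6.9 release candidates.

```python
import re

# Fixed releases taken from the list above, keyed by stable branch.
FIXED_IN = {
    (5, 15): (5, 15, 156),
    (6, 1): (6, 1, 87),
    (6, 6): (6, 6, 28),
    (6, 8): (6, 8, 7),
}
INTRODUCED = (5, 15)   # the issue was introduced in 5.15
MAINLINE_FIX = (6, 9)  # 6.9 and later contain the fix


def parse_release(release):
    """Extract (major, minor, patch) from a string such as '6.1.85-generic'."""
    m = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if m is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    major, minor, patch = m.groups()
    return int(major), int(minor), int(patch or 0)


def cve_2024_35970_status(release):
    """Return 'unaffected', 'fixed', or 'affected' for an upstream release."""
    version = parse_release(release)
    branch = version[:2]
    if branch < INTRODUCED:
        return "unaffected"  # predates the commit that introduced the issue
    if branch >= MAINLINE_FIX:
        return "fixed"
    fixed = FIXED_IN.get(branch)
    if fixed is not None and version >= fixed:
        return "fixed"
    # Either an old release on a listed branch, or a branch with no listed fix.
    return "affected"
```

For the running system, pass platform.release() from the standard library,
for example cve_2024_35970_status(platform.release()).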
| |
| Please see https://www.kernel.org for a full list of kernel versions |
| currently supported by the kernel community. |
| |
| Unaffected versions might change over time as fixes are backported to |
| older supported kernel versions. The official CVE entry at |
| https://cve.org/CVERecord/?id=CVE-2024-35970 |
| will be updated if fixes are backported; please check it for the most |
| up-to-date information about this issue. |
| |
| |
| Affected files |
| ============== |
| |
| The file(s) affected by this issue are: |
| net/unix/af_unix.c |
| |
| |
| Mitigation |
| ========== |
| |
| The Linux kernel CVE team recommends that you update to the latest |
| stable kernel version for this and many other bugfixes. Individual |
| changes are never tested alone, but rather are part of a larger kernel |
| release. Cherry-picking individual commits is not recommended or |
| supported by the Linux kernel community at all. If, however, updating to |
| the latest release is impossible, the individual changes to resolve this |
| issue can be found at these commits: |
| https://git.kernel.org/stable/c/b4bc99d04c689b5652665394ae8d3e02fb754153 |
| https://git.kernel.org/stable/c/84a352b7eba1142a95441380058985ff19f25ec9 |
| https://git.kernel.org/stable/c/601a89ea24d05089debfa2dc896ea9f5937ac7a6 |
| https://git.kernel.org/stable/c/698a95ade1a00e6494482046902b986dfffd1caf |
| https://git.kernel.org/stable/c/b46f4eaa4f0ec38909fb0072eea3aeddb32f954e |