ghsa-p8xf-2w27-6wqx
Vulnerability from github
In the Linux kernel, the following vulnerability has been resolved:
af_unix: Clear stale u->oob_skb.
syzkaller started to report deadlock of unix_gc_lock after commit 4090fa373f0e ("af_unix: Replace garbage collection algorithm."), but it just uncovers the bug that has been there since commit 314001f0bf92 ("af_unix: Add OOB support").
The repro basically does the following.
from socket import * from array import array
c1, c2 = socketpair(AF_UNIX, SOCK_STREAM) c1.sendmsg([b'a'], [(SOL_SOCKET, SCM_RIGHTS, array("i", [c2.fileno()]))], MSG_OOB) c2.recv(1) # blocked as no normal data in recv queue
c2.close() # done async and unblock recv() c1.close() # done async and trigger GC
A socket sends its file descriptor to itself as OOB data and tries to receive normal data, but finally recv() fails due to async close().
The problem here is wrong handling of OOB skb in manage_oob(). When recvmsg() is called without MSG_OOB, manage_oob() is called to check if the peeked skb is OOB skb. In such a case, manage_oob() pops it out of the receive queue but does not clear unix_sock(sk)->oob_skb. This is wrong in terms of uAPI.
Let's say we send "hello" with MSG_OOB, and "world" without MSG_OOB. The 'o' is handled as OOB data. When recv() is called twice without MSG_OOB, the OOB data should be lost.
from socket import * c1, c2 = socketpair(AF_UNIX, SOCK_STREAM, 0) c1.send(b'hello', MSG_OOB) # 'o' is OOB data 5 c1.send(b'world') 5 c2.recv(5) # OOB data is not received b'hell' c2.recv(5) # OOB date is skipped b'world' c2.recv(5, MSG_OOB) # This should return an error b'o'
In the same situation, TCP actually returns -EINVAL for the last recv().
Also, if we do not clear unix_sk(sk)->oob_skb, unix_poll() always set EPOLLPRI even though the data has passed through by previous recv().
To avoid these issues, we must clear unix_sk(sk)->oob_skb when dequeuing it from recv queue.
The reason why the old GC did not trigger the deadlock is because the old GC relied on the receive queue to detect the loop.
When it is triggered, the socket with OOB data is marked as GC candidate because file refcount == inflight count (1). However, after traversing all inflight sockets, the socket still has a positive inflight count (1), thus the socket is excluded from candidates. Then, the old GC lose the chance to garbage-collect the socket.
With the old GC, the repro continues to create true garbage that will never be freed nor detected by kmemleak as it's linked to the global inflight list. That's why we couldn't even notice the issue.
{ "affected": [], "aliases": [ "CVE-2024-35970" ], "database_specific": { "cwe_ids": [], "github_reviewed": false, "github_reviewed_at": null, "nvd_published_at": "2024-05-20T10:15:11Z", "severity": "MODERATE" }, "details": "In the Linux kernel, the following vulnerability has been resolved:\n\naf_unix: Clear stale u-\u003eoob_skb.\n\nsyzkaller started to report deadlock of unix_gc_lock after commit\n4090fa373f0e (\"af_unix: Replace garbage collection algorithm.\"), but\nit just uncovers the bug that has been there since commit 314001f0bf92\n(\"af_unix: Add OOB support\").\n\nThe repro basically does the following.\n\n from socket import *\n from array import array\n\n c1, c2 = socketpair(AF_UNIX, SOCK_STREAM)\n c1.sendmsg([b\u0027a\u0027], [(SOL_SOCKET, SCM_RIGHTS, array(\"i\", [c2.fileno()]))], MSG_OOB)\n c2.recv(1) # blocked as no normal data in recv queue\n\n c2.close() # done async and unblock recv()\n c1.close() # done async and trigger GC\n\nA socket sends its file descriptor to itself as OOB data and tries to\nreceive normal data, but finally recv() fails due to async close().\n\nThe problem here is wrong handling of OOB skb in manage_oob(). When\nrecvmsg() is called without MSG_OOB, manage_oob() is called to check\nif the peeked skb is OOB skb. In such a case, manage_oob() pops it\nout of the receive queue but does not clear unix_sock(sk)-\u003eoob_skb.\nThis is wrong in terms of uAPI.\n\nLet\u0027s say we send \"hello\" with MSG_OOB, and \"world\" without MSG_OOB.\nThe \u0027o\u0027 is handled as OOB data. When recv() is called twice without\nMSG_OOB, the OOB data should be lost.\n\n \u003e\u003e\u003e from socket import *\n \u003e\u003e\u003e c1, c2 = socketpair(AF_UNIX, SOCK_STREAM, 0)\n \u003e\u003e\u003e c1.send(b\u0027hello\u0027, MSG_OOB) # \u0027o\u0027 is OOB data\n 5\n \u003e\u003e\u003e c1.send(b\u0027world\u0027)\n 5\n \u003e\u003e\u003e c2.recv(5) # OOB data is not received\n b\u0027hell\u0027\n \u003e\u003e\u003e c2.recv(5) # OOB date is skipped\n b\u0027world\u0027\n \u003e\u003e\u003e c2.recv(5, MSG_OOB) # This should return an error\n b\u0027o\u0027\n\nIn the same situation, TCP actually returns -EINVAL for the last\nrecv().\n\nAlso, if we do not clear unix_sk(sk)-\u003eoob_skb, unix_poll() always set\nEPOLLPRI even though the data has passed through by previous recv().\n\nTo avoid these issues, we must clear unix_sk(sk)-\u003eoob_skb when dequeuing\nit from recv queue.\n\nThe reason why the old GC did not trigger the deadlock is because the\nold GC relied on the receive queue to detect the loop.\n\nWhen it is triggered, the socket with OOB data is marked as GC candidate\nbecause file refcount == inflight count (1). However, after traversing\nall inflight sockets, the socket still has a positive inflight count (1),\nthus the socket is excluded from candidates. Then, the old GC lose the\nchance to garbage-collect the socket.\n\nWith the old GC, the repro continues to create true garbage that will\nnever be freed nor detected by kmemleak as it\u0027s linked to the global\ninflight list. That\u0027s why we couldn\u0027t even notice the issue.", "id": "GHSA-p8xf-2w27-6wqx", "modified": "2024-11-01T21:31:46Z", "published": "2024-05-20T12:30:28Z", "references": [ { "type": "ADVISORY", "url": "https://nvd.nist.gov/vuln/detail/CVE-2024-35970" }, { "type": "WEB", "url": "https://git.kernel.org/stable/c/601a89ea24d05089debfa2dc896ea9f5937ac7a6" }, { "type": "WEB", "url": "https://git.kernel.org/stable/c/698a95ade1a00e6494482046902b986dfffd1caf" }, { "type": "WEB", "url": "https://git.kernel.org/stable/c/84a352b7eba1142a95441380058985ff19f25ec9" }, { "type": "WEB", "url": "https://git.kernel.org/stable/c/b46f4eaa4f0ec38909fb0072eea3aeddb32f954e" }, { "type": "WEB", "url": "https://git.kernel.org/stable/c/b4bc99d04c689b5652665394ae8d3e02fb754153" } ], "schema_version": "1.4.0", "severity": [ { "score": "CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:L/I:L/A:L", "type": "CVSS_V3" } ] }
- Seen: The vulnerability was mentioned, discussed, or seen somewhere by the user.
- Confirmed: The vulnerability is confirmed from an analyst perspective.
- Exploited: This vulnerability was exploited and seen by the user reporting the sighting.
- Patched: This vulnerability was successfully patched by the user reporting the sighting.
- Not exploited: This vulnerability was not exploited or seen by the user reporting the sighting.
- Not confirmed: The user expresses doubt about the veracity of the vulnerability.
- Not patched: This vulnerability was not successfully patched by the user reporting the sighting.