ghsa-5xmm-chg9-ppmr
Vulnerability from github
In the Linux kernel, the following vulnerability has been resolved:
nfs: fix UAF in direct writes
In production we have been hitting the following warning consistently
------------[ cut here ]------------
refcount_t: underflow; use-after-free.
WARNING: CPU: 17 PID: 1800359 at lib/refcount.c:28 refcount_warn_saturate+0x9c/0xe0
Workqueue: nfsiod nfs_direct_write_schedule_work [nfs]
RIP: 0010:refcount_warn_saturate+0x9c/0xe0
PKRU: 55555554
Call Trace:
This is because we're completing the nfs_direct_request twice in a row.
The source of this is when we have our commit requests to submit, we process them and send them off, and then in the completion path for the commit requests we have
if (nfs_commit_end(cinfo.mds)) nfs_direct_write_complete(dreq);
However since we're submitting asynchronous requests we sometimes have one that completes before we submit the next one, so we end up calling complete on the nfs_direct_request twice.
The only other place we use nfs_generic_commit_list() is in __nfs_commit_inode, which wraps this call in a
nfs_commit_begin(); nfs_commit_end();
Which is a common pattern for this style of completion handling, one that is also repeated in the direct code with get_dreq()/put_dreq() calls around where we process events as well as in the completion paths.
Fix this by using the same pattern for the commit requests.
Before with my 200 node rocksdb stress running this warning would pop every 10ish minutes. With my patch the stress test has been running for several hours without popping.
{ "affected": [], "aliases": [ "CVE-2024-26958" ], "database_specific": { "cwe_ids": [], "github_reviewed": false, "github_reviewed_at": null, "nvd_published_at": "2024-05-01T06:15:12Z", "severity": null }, "details": "In the Linux kernel, the following vulnerability has been resolved:\n\nnfs: fix UAF in direct writes\n\nIn production we have been hitting the following warning consistently\n\n------------[ cut here ]------------\nrefcount_t: underflow; use-after-free.\nWARNING: CPU: 17 PID: 1800359 at lib/refcount.c:28 refcount_warn_saturate+0x9c/0xe0\nWorkqueue: nfsiod nfs_direct_write_schedule_work [nfs]\nRIP: 0010:refcount_warn_saturate+0x9c/0xe0\nPKRU: 55555554\nCall Trace:\n \u003cTASK\u003e\n ? __warn+0x9f/0x130\n ? refcount_warn_saturate+0x9c/0xe0\n ? report_bug+0xcc/0x150\n ? handle_bug+0x3d/0x70\n ? exc_invalid_op+0x16/0x40\n ? asm_exc_invalid_op+0x16/0x20\n ? refcount_warn_saturate+0x9c/0xe0\n nfs_direct_write_schedule_work+0x237/0x250 [nfs]\n process_one_work+0x12f/0x4a0\n worker_thread+0x14e/0x3b0\n ? ZSTD_getCParams_internal+0x220/0x220\n kthread+0xdc/0x120\n ? __btf_name_valid+0xa0/0xa0\n ret_from_fork+0x1f/0x30\n\nThis is because we\u0027re completing the nfs_direct_request twice in a row.\n\nThe source of this is when we have our commit requests to submit, we\nprocess them and send them off, and then in the completion path for the\ncommit requests we have\n\nif (nfs_commit_end(cinfo.mds))\n\tnfs_direct_write_complete(dreq);\n\nHowever since we\u0027re submitting asynchronous requests we sometimes have\none that completes before we submit the next one, so we end up calling\ncomplete on the nfs_direct_request twice.\n\nThe only other place we use nfs_generic_commit_list() is in\n__nfs_commit_inode, which wraps this call in a\n\nnfs_commit_begin();\nnfs_commit_end();\n\nWhich is a common pattern for this style of completion handling, one\nthat is also repeated in the direct code with get_dreq()/put_dreq()\ncalls around where we process events as well as in the completion paths.\n\nFix this by using the same pattern for the commit requests.\n\nBefore with my 200 node rocksdb stress running this warning would pop\nevery 10ish minutes. With my patch the stress test has been running for\nseveral hours without popping.", "id": "GHSA-5xmm-chg9-ppmr", "modified": "2024-06-26T00:31:38Z", "published": "2024-05-01T06:31:42Z", "references": [ { "type": "ADVISORY", "url": "https://nvd.nist.gov/vuln/detail/CVE-2024-26958" }, { "type": "WEB", "url": "https://git.kernel.org/stable/c/17f46b803d4f23c66cacce81db35fef3adb8f2af" }, { "type": "WEB", "url": "https://git.kernel.org/stable/c/1daf52b5ffb24870fbeda20b4967526d8f9e12ab" }, { "type": "WEB", "url": "https://git.kernel.org/stable/c/3abc2d160ed8213948b147295d77d44a22c88fa3" }, { "type": "WEB", "url": "https://git.kernel.org/stable/c/4595d90b5d2ea5fa4d318d13f59055aa4bf3e7f5" }, { "type": "WEB", "url": "https://git.kernel.org/stable/c/80d24b308b7ee7037fc90d8ac99f6f78df0a256f" }, { "type": "WEB", "url": "https://git.kernel.org/stable/c/cf54f66e1dd78990ec6b32177bca7e6ea2144a95" }, { "type": "WEB", "url": "https://git.kernel.org/stable/c/e25447c35f8745337ea8bc0c9697fcac14df8605" }, { "type": "WEB", "url": "https://lists.debian.org/debian-lts-announce/2024/06/msg00017.html" } ], "schema_version": "1.4.0", "severity": [] }