ghsa-2fxm-8374-c95m
Vulnerability from github
In the Linux kernel, the following vulnerability has been resolved:
mm: swap: fix race between free_swap_and_cache() and swapoff()
There was previously a theoretical window where swapoff() could run and teardown a swap_info_struct while a call to free_swap_and_cache() was running in another thread. This could cause, amongst other bad possibilities, swap_page_trans_huge_swapped() (called by free_swap_and_cache()) to access the freed memory for swap_map.
This is a theoretical problem and I haven't been able to provoke it from a test case. But there has been agreement based on code review that this is possible (see link below).
Fix it by using get_swap_device()/put_swap_device(), which will stall swapoff(). There was an extra check in _swap_info_get() to confirm that the swap entry was not free. This isn't present in get_swap_device() because it doesn't make sense in general due to the race between getting the reference and swapoff. So I've added an equivalent check directly in free_swap_and_cache().
Details of how to provoke one possible issue (thanks to David Hildenbrand for deriving this):
--8<-----
__swap_entry_free() might be the last user and result in "count == SWAP_HAS_CACHE".
swapoff->try_to_unuse() will stop as soon as soon as si->inuse_pages==0.
So the question is: could someone reclaim the folio and turn si->inuse_pages==0, before we completed swap_page_trans_huge_swapped().
Imagine the following: 2 MiB folio in the swapcache. Only 2 subpages are still references by swap entries.
Process 1 still references subpage 0 via swap entry. Process 2 still references subpage 1 via swap entry.
Process 1 quits. Calls free_swap_and_cache(). -> count == SWAP_HAS_CACHE [then, preempted in the hypervisor etc.]
Process 2 quits. Calls free_swap_and_cache(). -> count == SWAP_HAS_CACHE
Process 2 goes ahead, passes swap_page_trans_huge_swapped(), and calls __try_to_reclaim_swap().
__try_to_reclaim_swap()->folio_free_swap()->delete_from_swap_cache()-> put_swap_folio()->free_swap_slot()->swapcache_free_entries()-> swap_entry_free()->swap_range_free()-> ... WRITE_ONCE(si->inuse_pages, si->inuse_pages - nr_entries);
What stops swapoff to succeed after process 2 reclaimed the swap cache but before process1 finished its call to swap_page_trans_huge_swapped()?
--8<-----
{ "affected": [], "aliases": [ "CVE-2024-26960" ], "database_specific": { "cwe_ids": [ "CWE-362" ], "github_reviewed": false, "github_reviewed_at": null, "nvd_published_at": "2024-05-01T06:15:12Z", "severity": "MODERATE" }, "details": "In the Linux kernel, the following vulnerability has been resolved:\n\nmm: swap: fix race between free_swap_and_cache() and swapoff()\n\nThere was previously a theoretical window where swapoff() could run and\nteardown a swap_info_struct while a call to free_swap_and_cache() was\nrunning in another thread. This could cause, amongst other bad\npossibilities, swap_page_trans_huge_swapped() (called by\nfree_swap_and_cache()) to access the freed memory for swap_map.\n\nThis is a theoretical problem and I haven\u0027t been able to provoke it from a\ntest case. But there has been agreement based on code review that this is\npossible (see link below).\n\nFix it by using get_swap_device()/put_swap_device(), which will stall\nswapoff(). There was an extra check in _swap_info_get() to confirm that\nthe swap entry was not free. This isn\u0027t present in get_swap_device()\nbecause it doesn\u0027t make sense in general due to the race between getting\nthe reference and swapoff. So I\u0027ve added an equivalent check directly in\nfree_swap_and_cache().\n\nDetails of how to provoke one possible issue (thanks to David Hildenbrand\nfor deriving this):\n\n--8\u003c-----\n\n__swap_entry_free() might be the last user and result in\n\"count == SWAP_HAS_CACHE\".\n\nswapoff-\u003etry_to_unuse() will stop as soon as soon as si-\u003einuse_pages==0.\n\nSo the question is: could someone reclaim the folio and turn\nsi-\u003einuse_pages==0, before we completed swap_page_trans_huge_swapped().\n\nImagine the following: 2 MiB folio in the swapcache. Only 2 subpages are\nstill references by swap entries.\n\nProcess 1 still references subpage 0 via swap entry.\nProcess 2 still references subpage 1 via swap entry.\n\nProcess 1 quits. Calls free_swap_and_cache().\n-\u003e count == SWAP_HAS_CACHE\n[then, preempted in the hypervisor etc.]\n\nProcess 2 quits. Calls free_swap_and_cache().\n-\u003e count == SWAP_HAS_CACHE\n\nProcess 2 goes ahead, passes swap_page_trans_huge_swapped(), and calls\n__try_to_reclaim_swap().\n\n__try_to_reclaim_swap()-\u003efolio_free_swap()-\u003edelete_from_swap_cache()-\u003e\nput_swap_folio()-\u003efree_swap_slot()-\u003eswapcache_free_entries()-\u003e\nswap_entry_free()-\u003eswap_range_free()-\u003e\n...\nWRITE_ONCE(si-\u003einuse_pages, si-\u003einuse_pages - nr_entries);\n\nWhat stops swapoff to succeed after process 2 reclaimed the swap cache\nbut before process1 finished its call to swap_page_trans_huge_swapped()?\n\n--8\u003c-----", "id": "GHSA-2fxm-8374-c95m", "modified": "2024-07-03T18:38:00Z", "published": "2024-05-01T06:31:42Z", "references": [ { "type": "ADVISORY", "url": "https://nvd.nist.gov/vuln/detail/CVE-2024-26960" }, { "type": "WEB", "url": "https://git.kernel.org/stable/c/0f98f6d2fb5fad00f8299b84b85b6bc1b6d7d19a" }, { "type": "WEB", "url": "https://git.kernel.org/stable/c/1ede7f1d7eed1738d1b9333fd1e152ccb450b86a" }, { "type": "WEB", "url": "https://git.kernel.org/stable/c/2da5568ee222ce0541bfe446a07998f92ed1643e" }, { "type": "WEB", "url": "https://git.kernel.org/stable/c/363d17e7f7907c8e27a9e86968af0eaa2301787b" }, { "type": "WEB", "url": "https://git.kernel.org/stable/c/3ce4c4c653e4e478ecb15d3c88e690f12cbf6b39" }, { "type": "WEB", "url": "https://git.kernel.org/stable/c/82b1c07a0af603e3c47b906c8e991dc96f01688e" }, { "type": "WEB", "url": "https://git.kernel.org/stable/c/d85c11c97ecf92d47a4b29e3faca714dc1f18d0d" }, { "type": "WEB", "url": "https://lists.debian.org/debian-lts-announce/2024/06/msg00017.html" } ], "schema_version": "1.4.0", "severity": [ { "score": "CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H", "type": "CVSS_V3" } ] }
- Seen: The vulnerability was mentioned, discussed, or seen somewhere by the user.
- Confirmed: The vulnerability is confirmed from an analyst perspective.
- Exploited: This vulnerability was exploited and seen by the user reporting the sighting.
- Patched: This vulnerability was successfully patched by the user reporting the sighting.
- Not exploited: This vulnerability was not exploited or seen by the user reporting the sighting.
- Not confirmed: The user expresses doubt about the veracity of the vulnerability.
- Not patched: This vulnerability was not successfully patched by the user reporting the sighting.