ghsa-rh82-rm8r-5cfp
Vulnerability from github
Published
2024-08-17 12:30
Modified
2024-10-31 00:30
Details

In the Linux kernel, the following vulnerability has been resolved:

xdp: fix invalid wait context of page_pool_destroy()

If the driver uses a page pool, it creates a page pool with page_pool_create(). The reference count of page pool is 1 as default. A page pool will be destroyed only when a reference count reaches 0. page_pool_destroy() is used to destroy page pool, it decreases a reference count. When a page pool is destroyed, ->disconnect() is called, which is mem_allocator_disconnect(). This function internally acquires mutex_lock().

If the driver uses XDP, it registers a memory model with xdp_rxq_info_reg_mem_model(). The xdp_rxq_info_reg_mem_model() internally increases a page pool reference count if a memory model is a page pool. Now the reference count is 2.

To destroy a page pool, the driver should call both page_pool_destroy() and xdp_unreg_mem_model(). The xdp_unreg_mem_model() internally calls page_pool_destroy(). Only page_pool_destroy() decreases a reference count.

If a driver calls page_pool_destroy() then xdp_unreg_mem_model(), we will face an invalid wait context warning. Because xdp_unreg_mem_model() calls page_pool_destroy() with rcu_read_lock(). The page_pool_destroy() internally acquires mutex_lock().

Splat looks like:

[ BUG: Invalid wait context ] 6.10.0-rc6+ #4 Tainted: G W


ethtool/1806 is trying to lock: ffffffff90387b90 (mem_id_lock){+.+.}-{4:4}, at: mem_allocator_disconnect+0x73/0x150 other info that might help us debug this: context-{5:5} 3 locks held by ethtool/1806: stack backtrace: CPU: 0 PID: 1806 Comm: ethtool Tainted: G W 6.10.0-rc6+ #4 f916f41f172891c800f2fed Hardware name: ASUS System Product Name/PRIME Z690-P D4, BIOS 0603 11/01/2021 Call Trace: dump_stack_lvl+0x7e/0xc0 __lock_acquire+0x1681/0x4de0 ? _printk+0x64/0xe0 ? __pfx_mark_lock.part.0+0x10/0x10 ? __pfxlockacquire+0x10/0x10 lock_acquire+0x1b3/0x580 ? mem_allocator_disconnect+0x73/0x150 ? wake_up_klogd.part.0+0x16/0xc0 ? __pfx_lock_acquire+0x10/0x10 ? dump_stack_lvl+0x91/0xc0 __mutex_lock+0x15c/0x1690 ? mem_allocator_disconnect+0x73/0x150 ? __pfx_prb_read_valid+0x10/0x10 ? mem_allocator_disconnect+0x73/0x150 ? __pfx_llist_add_batch+0x10/0x10 ? console_unlock+0x193/0x1b0 ? lockdep_hardirqs_on+0xbe/0x140 ? __pfxmutexlock+0x10/0x10 ? tick_nohz_tick_stopped+0x16/0x90 ? irq_work_queue_local+0x1e5/0x330 ? irq_work_queue+0x39/0x50 ? __wake_up_klogd.part.0+0x79/0xc0 ? mem_allocator_disconnect+0x73/0x150 mem_allocator_disconnect+0x73/0x150 ? __pfx_mem_allocator_disconnect+0x10/0x10 ? mark_held_locks+0xa5/0xf0 ? rcu_is_watching+0x11/0xb0 page_pool_release+0x36e/0x6d0 page_pool_destroy+0xd7/0x440 xdp_unreg_mem_model+0x1a7/0x2a0 ? __pfx_xdp_unreg_mem_model+0x10/0x10 ? kfree+0x125/0x370 ? bnxt_free_ring.isra.0+0x2eb/0x500 ? bnxt_free_mem+0x5ac/0x2500 xdp_rxq_info_unreg+0x4a/0xd0 bnxt_free_mem+0x1356/0x2500 bnxt_close_nic+0xf0/0x3b0 ? __pfx_bnxt_close_nic+0x10/0x10 ? ethnl_parse_bit+0x2c6/0x6d0 ? __pfxnlavalidate_parse+0x10/0x10 ? pfx_ethnl_parse_bit+0x10/0x10 bnxt_set_features+0x2a8/0x3e0 __netdev_update_features+0x4dc/0x1370 ? ethnl_parse_bitset+0x4ff/0x750 ? __pfx_ethnl_parse_bitset+0x10/0x10 ? __pfxnetdevupdate_features+0x10/0x10 ? mark_held_locks+0xa5/0xf0 ? _raw_spin_unlock_irqrestore+0x42/0x70 ? pm_runtime_resume+0x7d/0x110 ethnl_set_features+0x32d/0xa20

To fix this problem, it uses rhashtable_lookup_fast() instead of rhashtable_lookup() with rcu_read_lock(). Using xa without rcu_read_lock() here is safe. xa is freed by __xdp_mem_allocator_rcu_free() and this is called by call_rcu() of mem_xa_remove(). The mem_xa_remove() is called by page_pool_destroy() if a reference count reaches 0. The xa is already protected by the reference count mechanism well in the control plane. So removing rcu_read_lock() for page_pool_destroy() is safe.

Show details on source website


{
  "affected": [],
  "aliases": [
    "CVE-2024-43834"
  ],
  "database_specific": {
    "cwe_ids": [],
    "github_reviewed": false,
    "github_reviewed_at": null,
    "nvd_published_at": "2024-08-17T10:15:09Z",
    "severity": "MODERATE"
  },
  "details": "In the Linux kernel, the following vulnerability has been resolved:\n\nxdp: fix invalid wait context of page_pool_destroy()\n\nIf the driver uses a page pool, it creates a page pool with\npage_pool_create().\nThe reference count of page pool is 1 as default.\nA page pool will be destroyed only when a reference count reaches 0.\npage_pool_destroy() is used to destroy page pool, it decreases a\nreference count.\nWhen a page pool is destroyed, -\u003edisconnect() is called, which is\nmem_allocator_disconnect().\nThis function internally acquires mutex_lock().\n\nIf the driver uses XDP, it registers a memory model with\nxdp_rxq_info_reg_mem_model().\nThe xdp_rxq_info_reg_mem_model() internally increases a page pool\nreference count if a memory model is a page pool.\nNow the reference count is 2.\n\nTo destroy a page pool, the driver should call both page_pool_destroy()\nand xdp_unreg_mem_model().\nThe xdp_unreg_mem_model() internally calls page_pool_destroy().\nOnly page_pool_destroy() decreases a reference count.\n\nIf a driver calls page_pool_destroy() then xdp_unreg_mem_model(), we\nwill face an invalid wait context warning.\nBecause xdp_unreg_mem_model() calls page_pool_destroy() with\nrcu_read_lock().\nThe page_pool_destroy() internally acquires mutex_lock().\n\nSplat looks like:\n=============================\n[ BUG: Invalid wait context ]\n6.10.0-rc6+ #4 Tainted: G W\n-----------------------------\nethtool/1806 is trying to lock:\nffffffff90387b90 (mem_id_lock){+.+.}-{4:4}, at: mem_allocator_disconnect+0x73/0x150\nother info that might help us debug this:\ncontext-{5:5}\n3 locks held by ethtool/1806:\nstack backtrace:\nCPU: 0 PID: 1806 Comm: ethtool Tainted: G W 6.10.0-rc6+ #4 f916f41f172891c800f2fed\nHardware name: ASUS System Product Name/PRIME Z690-P D4, BIOS 0603 11/01/2021\nCall Trace:\n\u003cTASK\u003e\ndump_stack_lvl+0x7e/0xc0\n__lock_acquire+0x1681/0x4de0\n? _printk+0x64/0xe0\n? __pfx_mark_lock.part.0+0x10/0x10\n? __pfx___lock_acquire+0x10/0x10\nlock_acquire+0x1b3/0x580\n? mem_allocator_disconnect+0x73/0x150\n? __wake_up_klogd.part.0+0x16/0xc0\n? __pfx_lock_acquire+0x10/0x10\n? dump_stack_lvl+0x91/0xc0\n__mutex_lock+0x15c/0x1690\n? mem_allocator_disconnect+0x73/0x150\n? __pfx_prb_read_valid+0x10/0x10\n? mem_allocator_disconnect+0x73/0x150\n? __pfx_llist_add_batch+0x10/0x10\n? console_unlock+0x193/0x1b0\n? lockdep_hardirqs_on+0xbe/0x140\n? __pfx___mutex_lock+0x10/0x10\n? tick_nohz_tick_stopped+0x16/0x90\n? __irq_work_queue_local+0x1e5/0x330\n? irq_work_queue+0x39/0x50\n? __wake_up_klogd.part.0+0x79/0xc0\n? mem_allocator_disconnect+0x73/0x150\nmem_allocator_disconnect+0x73/0x150\n? __pfx_mem_allocator_disconnect+0x10/0x10\n? mark_held_locks+0xa5/0xf0\n? rcu_is_watching+0x11/0xb0\npage_pool_release+0x36e/0x6d0\npage_pool_destroy+0xd7/0x440\nxdp_unreg_mem_model+0x1a7/0x2a0\n? __pfx_xdp_unreg_mem_model+0x10/0x10\n? kfree+0x125/0x370\n? bnxt_free_ring.isra.0+0x2eb/0x500\n? bnxt_free_mem+0x5ac/0x2500\nxdp_rxq_info_unreg+0x4a/0xd0\nbnxt_free_mem+0x1356/0x2500\nbnxt_close_nic+0xf0/0x3b0\n? __pfx_bnxt_close_nic+0x10/0x10\n? ethnl_parse_bit+0x2c6/0x6d0\n? __pfx___nla_validate_parse+0x10/0x10\n? __pfx_ethnl_parse_bit+0x10/0x10\nbnxt_set_features+0x2a8/0x3e0\n__netdev_update_features+0x4dc/0x1370\n? ethnl_parse_bitset+0x4ff/0x750\n? __pfx_ethnl_parse_bitset+0x10/0x10\n? __pfx___netdev_update_features+0x10/0x10\n? mark_held_locks+0xa5/0xf0\n? _raw_spin_unlock_irqrestore+0x42/0x70\n? __pm_runtime_resume+0x7d/0x110\nethnl_set_features+0x32d/0xa20\n\nTo fix this problem, it uses rhashtable_lookup_fast() instead of\nrhashtable_lookup() with rcu_read_lock().\nUsing xa without rcu_read_lock() here is safe.\nxa is freed by __xdp_mem_allocator_rcu_free() and this is called by\ncall_rcu() of mem_xa_remove().\nThe mem_xa_remove() is called by page_pool_destroy() if a reference\ncount reaches 0.\nThe xa is already protected by the reference count mechanism well in the\ncontrol plane.\nSo removing rcu_read_lock() for page_pool_destroy() is safe.",
  "id": "GHSA-rh82-rm8r-5cfp",
  "modified": "2024-10-31T00:30:36Z",
  "published": "2024-08-17T12:30:32Z",
  "references": [
    {
      "type": "ADVISORY",
      "url": "https://nvd.nist.gov/vuln/detail/CVE-2024-43834"
    },
    {
      "type": "WEB",
      "url": "https://git.kernel.org/stable/c/12144069209eec7f2090ce9afa15acdcc2c2a537"
    },
    {
      "type": "WEB",
      "url": "https://git.kernel.org/stable/c/3fc1be360b99baeea15cdee3cf94252cd3a72d26"
    },
    {
      "type": "WEB",
      "url": "https://git.kernel.org/stable/c/59a931c5b732ca5fc2ca727f5a72aeabaafa85ec"
    },
    {
      "type": "WEB",
      "url": "https://git.kernel.org/stable/c/6c390ef198aa69795427a5cb5fd7cb4bc7e6cd7a"
    },
    {
      "type": "WEB",
      "url": "https://git.kernel.org/stable/c/be9d08ff102df3ac4f66e826ea935cf3af63a4bd"
    },
    {
      "type": "WEB",
      "url": "https://git.kernel.org/stable/c/bf0ce5aa5f2525ed1b921ba36de96e458e77f482"
    }
  ],
  "schema_version": "1.4.0",
  "severity": [
    {
      "score": "CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H",
      "type": "CVSS_V3"
    }
  ]
}


Log in or create an account to share your comment.




Tags
Taxonomy of the tags.


Loading...

Loading...

Loading...
  • Seen: The vulnerability was mentioned, discussed, or seen somewhere by the user.
  • Confirmed: The vulnerability is confirmed from an analyst perspective.
  • Exploited: This vulnerability was exploited and seen by the user reporting the sighting.
  • Patched: This vulnerability was successfully patched by the user reporting the sighting.
  • Not exploited: This vulnerability was not exploited or seen by the user reporting the sighting.
  • Not confirmed: The user expresses doubt about the veracity of the vulnerability.
  • Not patched: This vulnerability was not successfully patched by the user reporting the sighting.