ghsa-vr5f-5953-gfj5
Vulnerability from github
Published: 2024-04-28 15:30
Modified: 2024-04-28 15:30
Details

In the Linux kernel, the following vulnerability has been resolved:

net/sched: taprio: avoid disabling offload when it was never enabled

In an incredibly strange API design decision, qdisc->destroy() gets called even if qdisc->init() never succeeded, not exclusively since commit 87b60cfacf9f ("net_sched: fix error recovery at qdisc creation"), but apparently also earlier (in the case of qdisc_create_dflt()).

The taprio qdisc does not fully acknowledge this when it attempts full offload, because it starts off with q->flags = TAPRIO_FLAGS_INVALID in taprio_init(), then it replaces q->flags with TCA_TAPRIO_ATTR_FLAGS parsed from netlink (in taprio_change(), tail called from taprio_init()).

But in taprio_destroy(), we call taprio_disable_offload(), and this determines what to do based on FULL_OFFLOAD_IS_ENABLED(q->flags).

But looking at the implementation of FULL_OFFLOAD_IS_ENABLED() (a bitwise check of bit 1 in q->flags), it is invalid to call this macro on q->flags when it contains TAPRIO_FLAGS_INVALID, because that is set to U32_MAX, and therefore FULL_OFFLOAD_IS_ENABLED() will return true on an invalid set of flags.
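
The pitfall can be reproduced in user space. The sketch below mirrors the definitions as the text describes them (bit 1 for full offload, U32_MAX as the invalid sentinel); the macro bodies and the wrapper function name are assumptions made for illustration and are not copied from sch_taprio.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed stand-ins for the kernel definitions described in the text. */
#define TAPRIO_FLAG_FULL_OFFLOAD   (UINT32_C(1) << 1)   /* bit 1 */
#define TAPRIO_FLAGS_INVALID       UINT32_MAX           /* all bits set */
#define FULL_OFFLOAD_IS_ENABLED(flags) \
    (((flags) & TAPRIO_FLAG_FULL_OFFLOAD) != 0)

/* The sentinel has every bit set, so the bit-1 test passes on it. */
static_assert(FULL_OFFLOAD_IS_ENABLED(TAPRIO_FLAGS_INVALID),
              "invalid flags are indistinguishable from full offload");

/* Hypothetical runtime wrapper around the macro, for experimenting. */
static bool full_offload_is_enabled(uint32_t flags)
{
    return FULL_OFFLOAD_IS_ENABLED(flags);
}
```

A bit test cannot tell "user asked for full offload" apart from "flags were never parsed", which is why the check in the destroy path needs a separate piece of state.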

As a result, it is possible to crash the kernel if user space forces an error between setting q->flags = TAPRIO_FLAGS_INVALID, and the calling of taprio_enable_offload(). This is because drivers do not expect the offload to be disabled when it was never enabled.

The error that we force here is to attach taprio not as a root qdisc, but as a child of an mqprio root qdisc:

$ tc qdisc add dev swp0 root handle 1: \
    mqprio num_tc 8 map 0 1 2 3 4 5 6 7 \
    queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 hw 0
$ tc qdisc replace dev swp0 parent 1:1 \
    taprio num_tc 8 map 0 1 2 3 4 5 6 7 \
    queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 base-time 0 \
    sched-entry S 0x7f 990000 sched-entry S 0x80 100000 \
    flags 0x0 clockid CLOCK_TAI
Unable to handle kernel paging request at virtual address fffffffffffffff8
[fffffffffffffff8] pgd=0000000000000000, p4d=0000000000000000
Internal error: Oops: 96000004 [#1] PREEMPT SMP
Call trace:
 taprio_dump+0x27c/0x310
 vsc9959_port_setup_tc+0x1f4/0x460
 felix_port_setup_tc+0x24/0x3c
 dsa_slave_setup_tc+0x54/0x27c
 taprio_disable_offload.isra.0+0x58/0xe0
 taprio_destroy+0x80/0x104
 qdisc_create+0x240/0x470
 tc_modify_qdisc+0x1fc/0x6b0
 rtnetlink_rcv_msg+0x12c/0x390
 netlink_rcv_skb+0x5c/0x130
 rtnetlink_rcv+0x1c/0x2c

Fix this by keeping track of the operations we made, and undo the offload only if we actually did it.
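
The idea of undoing only what was actually done can be modelled in user space. This is a minimal sketch, not the kernel patch: only the "offloaded" field name comes from the text, while the struct, the function names, and the call counter standing in for the driver are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FLAGS_INVALID      UINT32_MAX           /* sentinel before parsing */
#define FLAG_FULL_OFFLOAD  (UINT32_C(1) << 1)   /* bit 1 */

/* Hypothetical model of the qdisc private data. */
struct sched_model {
    uint32_t flags;
    bool     offloaded;             /* set only after a successful enable */
    int      driver_disable_calls;  /* stands in for the driver callback */
};

/* Early part of init: flags hold the sentinel, nothing is offloaded yet. */
static void model_init_begin(struct sched_model *q)
{
    assert(q != NULL);
    q->flags = FLAGS_INVALID;
    q->offloaded = false;
    q->driver_disable_calls = 0;
}

/* Later part of init: store parsed flags, record a successful offload. */
static void model_enable_offload(struct sched_model *q, uint32_t flags)
{
    assert(q != NULL);
    q->flags = flags;
    if (flags & FLAG_FULL_OFFLOAD)
        q->offloaded = true;        /* assume the driver call succeeded */
}

/* Destroy path: consult the recorded state, not the flag bits. */
static void model_destroy(struct sched_model *q)
{
    assert(q != NULL);
    if (!q->offloaded)
        return;                     /* never enabled, nothing to undo */
    q->driver_disable_calls++;
    q->offloaded = false;
}
```

Calling model_destroy() right after model_init_begin() leaves the driver untouched even though the flags still hold the sentinel, which is the failing sequence described above.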

I've added "bool offloaded" inside a 4 byte hole between "int clockid" and "atomic64_t picos_per_byte". Now the first cache line looks like below:

$ pahole -C taprio_sched net/sched/sch_taprio.o
struct taprio_sched {
    struct Qdisc * *           qdiscs;               /*     0     8 */
    struct Qdisc *             root;                 /*     8     8 */
    u32                        flags;                /*    16     4 */
    enum tk_offsets            tk_offset;            /*    20     4 */
    int                        clockid;              /*    24     4 */
    bool                       offloaded;            /*    28     1 */

    /* XXX 3 bytes hole, try to pack */

    atomic64_t                 picos_per_byte;       /*    32     0 */

    /* XXX 8 bytes hole, try to pack */

    spinlock_t                 current_entry_lock;   /*    40     0 */

    /* XXX 8 bytes hole, try to pack */

    struct sched_entry *       current_entry;        /*    48     8 */
    struct sched_gate_list *   oper_sched;           /*    56     8 */
    /* --- cacheline 1 boundary (64 bytes) --- */
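
The claim that the new member fits into an existing hole can be checked with offsetof in user space. The sketch below is an approximation under stated assumptions: uint64_t stands in for the two pointers and for atomic64_t, int stands in for enum tk_offsets, and the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Leading members before the change (placeholder types, see lead-in). */
struct layout_before {
    uint64_t qdiscs;
    uint64_t root;
    uint32_t flags;
    int      tk_offset;
    int      clockid;
    _Alignas(8) uint64_t picos_per_byte;   /* 8-byte aligned member */
};

/* Same members with the bool placed right after clockid. */
struct layout_after {
    uint64_t qdiscs;
    uint64_t root;
    uint32_t flags;
    int      tk_offset;
    int      clockid;
    bool     offloaded;
    _Alignas(8) uint64_t picos_per_byte;
};

/* The bool lands in padding that already existed, so the size is equal. */
static_assert(sizeof(struct layout_after) == sizeof(struct layout_before),
              "bool offloaded should not grow the struct");

/* Bytes of padding left between offloaded and the next member. */
static size_t hole_after_offloaded(void)
{
    return offsetof(struct layout_after, picos_per_byte)
         - (offsetof(struct layout_after, offloaded) + sizeof(bool));
}
```

With 4-byte int and 1-byte bool, the remaining padding matches the 3-byte hole reported by pahole above.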


{
  "affected": [],
  "aliases": [
    "CVE-2022-48644"
  ],
  "database_specific": {
    "cwe_ids": [],
    "github_reviewed": false,
    "github_reviewed_at": null,
    "nvd_published_at": "2024-04-28T13:15:07Z",
    "severity": null
  },
  "details": "In the Linux kernel, the following vulnerability has been resolved:\n\nnet/sched: taprio: avoid disabling offload when it was never enabled\n\nIn an incredibly strange API design decision, qdisc-\u003edestroy() gets\ncalled even if qdisc-\u003einit() never succeeded, not exclusively since\ncommit 87b60cfacf9f (\"net_sched: fix error recovery at qdisc creation\"),\nbut apparently also earlier (in the case of qdisc_create_dflt()).\n\nThe taprio qdisc does not fully acknowledge this when it attempts full\noffload, because it starts off with q-\u003eflags = TAPRIO_FLAGS_INVALID in\ntaprio_init(), then it replaces q-\u003eflags with TCA_TAPRIO_ATTR_FLAGS\nparsed from netlink (in taprio_change(), tail called from taprio_init()).\n\nBut in taprio_destroy(), we call taprio_disable_offload(), and this\ndetermines what to do based on FULL_OFFLOAD_IS_ENABLED(q-\u003eflags).\n\nBut looking at the implementation of FULL_OFFLOAD_IS_ENABLED()\n(a bitwise check of bit 1 in q-\u003eflags), it is invalid to call this macro\non q-\u003eflags when it contains TAPRIO_FLAGS_INVALID, because that is set\nto U32_MAX, and therefore FULL_OFFLOAD_IS_ENABLED() will return true on\nan invalid set of flags.\n\nAs a result, it is possible to crash the kernel if user space forces an\nerror between setting q-\u003eflags = TAPRIO_FLAGS_INVALID, and the calling\nof taprio_enable_offload(). 
This is because drivers do not expect the\noffload to be disabled when it was never enabled.\n\nThe error that we force here is to attach taprio as a non-root qdisc,\nbut instead as child of an mqprio root qdisc:\n\n$ tc qdisc add dev swp0 root handle 1: \\\n\tmqprio num_tc 8 map 0 1 2 3 4 5 6 7 \\\n\tqueues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 hw 0\n$ tc qdisc replace dev swp0 parent 1:1 \\\n\ttaprio num_tc 8 map 0 1 2 3 4 5 6 7 \\\n\tqueues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 base-time 0 \\\n\tsched-entry S 0x7f 990000 sched-entry S 0x80 100000 \\\n\tflags 0x0 clockid CLOCK_TAI\nUnable to handle kernel paging request at virtual address fffffffffffffff8\n[fffffffffffffff8] pgd=0000000000000000, p4d=0000000000000000\nInternal error: Oops: 96000004 [#1] PREEMPT SMP\nCall trace:\n taprio_dump+0x27c/0x310\n vsc9959_port_setup_tc+0x1f4/0x460\n felix_port_setup_tc+0x24/0x3c\n dsa_slave_setup_tc+0x54/0x27c\n taprio_disable_offload.isra.0+0x58/0xe0\n taprio_destroy+0x80/0x104\n qdisc_create+0x240/0x470\n tc_modify_qdisc+0x1fc/0x6b0\n rtnetlink_rcv_msg+0x12c/0x390\n netlink_rcv_skb+0x5c/0x130\n rtnetlink_rcv+0x1c/0x2c\n\nFix this by keeping track of the operations we made, and undo the\noffload only if we actually did it.\n\nI\u0027ve added \"bool offloaded\" inside a 4 byte hole between \"int clockid\"\nand \"atomic64_t picos_per_byte\". 
Now the first cache line looks like\nbelow:\n\n$ pahole -C taprio_sched net/sched/sch_taprio.o\nstruct taprio_sched {\n        struct Qdisc * *           qdiscs;               /*     0     8 */\n        struct Qdisc *             root;                 /*     8     8 */\n        u32                        flags;                /*    16     4 */\n        enum tk_offsets            tk_offset;            /*    20     4 */\n        int                        clockid;              /*    24     4 */\n        bool                       offloaded;            /*    28     1 */\n\n        /* XXX 3 bytes hole, try to pack */\n\n        atomic64_t                 picos_per_byte;       /*    32     0 */\n\n        /* XXX 8 bytes hole, try to pack */\n\n        spinlock_t                 current_entry_lock;   /*    40     0 */\n\n        /* XXX 8 bytes hole, try to pack */\n\n        struct sched_entry *       current_entry;        /*    48     8 */\n        struct sched_gate_list *   oper_sched;           /*    56     8 */\n        /* --- cacheline 1 boundary (64 bytes) --- */",
  "id": "GHSA-vr5f-5953-gfj5",
  "modified": "2024-04-28T15:30:29Z",
  "published": "2024-04-28T15:30:29Z",
  "references": [
    {
      "type": "ADVISORY",
      "url": "https://nvd.nist.gov/vuln/detail/CVE-2022-48644"
    },
    {
      "type": "WEB",
      "url": "https://git.kernel.org/stable/c/586def6ebed195f3594a4884f7c5334d0e1ad1bb"
    },
    {
      "type": "WEB",
      "url": "https://git.kernel.org/stable/c/c7c9c7eb305ab8b4e93e4e4e1b78d8cfcbc26323"
    },
    {
      "type": "WEB",
      "url": "https://git.kernel.org/stable/c/d12a1eb07003e597077329767c6aa86a7e972c76"
    },
    {
      "type": "WEB",
      "url": "https://git.kernel.org/stable/c/db46e3a88a09c5cf7e505664d01da7238cd56c92"
    },
    {
      "type": "WEB",
      "url": "https://git.kernel.org/stable/c/f58e43184226e5e9662088ccf1389e424a3a4cbd"
    }
  ],
  "schema_version": "1.4.0",
  "severity": []
}

