ghsa-qvfw-rqcg-jh9g
Vulnerability from github
Published
2024-07-17 09:30
Modified
2024-07-29 09:36
Details

In the Linux kernel, the following vulnerability has been resolved:

bpf: Fix overrunning reservations in ringbuf

The BPF ring buffer internally is implemented as a power-of-2 sized circular buffer, with two logical and ever-increasing counters: consumer_pos is the consumer counter to show which logical position the consumer consumed the data, and producer_pos which is the producer counter denoting the amount of data reserved by all producers.

Each time a record is reserved, the producer that "owns" the record will successfully advance producer counter. In user space each time a record is read, the consumer of the data advanced the consumer counter once it finished processing. Both counters are stored in separate pages so that from user space, the producer counter is read-only and the consumer counter is read-write.

One aspect that simplifies and thus speeds up the implementation of both producers and consumers is how the data area is mapped twice contiguously back-to-back in the virtual memory, allowing to not take any special measures for samples that have to wrap around at the end of the circular buffer data area, because the next page after the last data page would be first data page again, and thus the sample will still appear completely contiguous in virtual memory.

Each record has a struct bpf_ringbuf_hdr { u32 len; u32 pg_off; } header for book-keeping the length and offset, and is inaccessible to the BPF program. Helpers like bpf_ringbuf_reserve() return (void *)hdr + BPF_RINGBUF_HDR_SZ for the BPF program to use. Bing-Jhong and Muhammad reported that it is however possible to make a second allocated memory chunk overlapping with the first chunk and as a result, the BPF program is now able to edit first chunk's header.

For example, consider the creation of a BPF_MAP_TYPE_RINGBUF map with size of 0x4000. Next, the consumer_pos is modified to 0x3000 /before/ a call to bpf_ringbuf_reserve() is made. This will allocate a chunk A, which is in [0x0,0x3008], and the BPF program is able to edit [0x8,0x3008]. Now, lets allocate a chunk B with size 0x3000. This will succeed because consumer_pos was edited ahead of time to pass the new_prod_pos - cons_pos > rb->mask check. Chunk B will be in range [0x3008,0x6010], and the BPF program is able to edit [0x3010,0x6010]. Due to the ring buffer memory layout mentioned earlier, the ranges [0x0,0x4000] and [0x4000,0x8000] point to the same data pages. This means that chunk B at [0x4000,0x4008] is chunk A's header. bpf_ringbuf_submit() / bpf_ringbuf_discard() use the header's pg_off to then locate the bpf_ringbuf itself via bpf_ringbuf_restore_from_rec(). Once chunk B modified chunk A's header, then bpf_ringbuf_commit() refers to the wrong page and could cause a crash.

Fix it by calculating the oldest pending_pos and check whether the range from the oldest outstanding record to the newest would span beyond the ring buffer size. If that is the case, then reject the request. We've tested with the ring buffer benchmark in BPF selftests (./benchs/run_bench_ringbufs.sh) before/after the fix and while it seems a bit slower on some benchmarks, it is still not significantly enough to matter.

Show details on source website


{
  "affected": [],
  "aliases": [
    "CVE-2024-41009"
  ],
  "database_specific": {
    "cwe_ids": [
      "CWE-770"
    ],
    "github_reviewed": false,
    "github_reviewed_at": null,
    "nvd_published_at": "2024-07-17T07:15:01Z",
    "severity": "MODERATE"
  },
  "details": "In the Linux kernel, the following vulnerability has been resolved:\n\nbpf: Fix overrunning reservations in ringbuf\n\nThe BPF ring buffer internally is implemented as a power-of-2 sized circular\nbuffer, with two logical and ever-increasing counters: consumer_pos is the\nconsumer counter to show which logical position the consumer consumed the\ndata, and producer_pos which is the producer counter denoting the amount of\ndata reserved by all producers.\n\nEach time a record is reserved, the producer that \"owns\" the record will\nsuccessfully advance producer counter. In user space each time a record is\nread, the consumer of the data advanced the consumer counter once it finished\nprocessing. Both counters are stored in separate pages so that from user\nspace, the producer counter is read-only and the consumer counter is read-write.\n\nOne aspect that simplifies and thus speeds up the implementation of both\nproducers and consumers is how the data area is mapped twice contiguously\nback-to-back in the virtual memory, allowing to not take any special measures\nfor samples that have to wrap around at the end of the circular buffer data\narea, because the next page after the last data page would be first data page\nagain, and thus the sample will still appear completely contiguous in virtual\nmemory.\n\nEach record has a struct bpf_ringbuf_hdr { u32 len; u32 pg_off; } header for\nbook-keeping the length and offset, and is inaccessible to the BPF program.\nHelpers like bpf_ringbuf_reserve() return `(void *)hdr + BPF_RINGBUF_HDR_SZ`\nfor the BPF program to use. Bing-Jhong and Muhammad reported that it is however\npossible to make a second allocated memory chunk overlapping with the first\nchunk and as a result, the BPF program is now able to edit first chunk\u0027s\nheader.\n\nFor example, consider the creation of a BPF_MAP_TYPE_RINGBUF map with size\nof 0x4000. Next, the consumer_pos is modified to 0x3000 /before/ a call to\nbpf_ringbuf_reserve() is made. This will allocate a chunk A, which is in\n[0x0,0x3008], and the BPF program is able to edit [0x8,0x3008]. Now, lets\nallocate a chunk B with size 0x3000. This will succeed because consumer_pos\nwas edited ahead of time to pass the `new_prod_pos - cons_pos \u003e rb-\u003emask`\ncheck. Chunk B will be in range [0x3008,0x6010], and the BPF program is able\nto edit [0x3010,0x6010]. Due to the ring buffer memory layout mentioned\nearlier, the ranges [0x0,0x4000] and [0x4000,0x8000] point to the same data\npages. This means that chunk B at [0x4000,0x4008] is chunk A\u0027s header.\nbpf_ringbuf_submit() / bpf_ringbuf_discard() use the header\u0027s pg_off to then\nlocate the bpf_ringbuf itself via bpf_ringbuf_restore_from_rec(). Once chunk\nB modified chunk A\u0027s header, then bpf_ringbuf_commit() refers to the wrong\npage and could cause a crash.\n\nFix it by calculating the oldest pending_pos and check whether the range\nfrom the oldest outstanding record to the newest would span beyond the ring\nbuffer size. If that is the case, then reject the request. We\u0027ve tested with\nthe ring buffer benchmark in BPF selftests (./benchs/run_bench_ringbufs.sh)\nbefore/after the fix and while it seems a bit slower on some benchmarks, it\nis still not significantly enough to matter.",
  "id": "GHSA-qvfw-rqcg-jh9g",
  "modified": "2024-07-29T09:36:13Z",
  "published": "2024-07-17T09:30:47Z",
  "references": [
    {
      "type": "ADVISORY",
      "url": "https://nvd.nist.gov/vuln/detail/CVE-2024-41009"
    },
    {
      "type": "WEB",
      "url": "https://git.kernel.org/stable/c/0f98f40eb1ed52af8b81f61901b6c0289ff59de4"
    },
    {
      "type": "WEB",
      "url": "https://git.kernel.org/stable/c/47416c852f2a04d348ea66ee451cbdcf8119f225"
    },
    {
      "type": "WEB",
      "url": "https://git.kernel.org/stable/c/511804ab701c0503b72eac08217eabfd366ba069"
    },
    {
      "type": "WEB",
      "url": "https://git.kernel.org/stable/c/be35504b959f2749bab280f4671e8df96dcf836f"
    },
    {
      "type": "WEB",
      "url": "https://git.kernel.org/stable/c/cfa1a2329a691ffd991fcf7248a57d752e712881"
    },
    {
      "type": "WEB",
      "url": "https://git.kernel.org/stable/c/d1b9df0435bc61e0b44f578846516df8ef476686"
    }
  ],
  "schema_version": "1.4.0",
  "severity": [
    {
      "score": "CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H",
      "type": "CVSS_V3"
    }
  ]
}


Log in or create an account to share your comment.




Tags
Taxonomy of the tags.


Loading...

Loading...

Loading...

Sightings

Author Source Type Date

Nomenclature

  • Seen: The vulnerability was mentioned, discussed, or seen somewhere by the user.
  • Confirmed: The vulnerability is confirmed from an analyst perspective.
  • Exploited: This vulnerability was exploited and seen by the user reporting the sighting.
  • Patched: This vulnerability was successfully patched by the user reporting the sighting.
  • Not exploited: This vulnerability was not exploited or seen by the user reporting the sighting.
  • Not confirmed: The user expresses doubt about the veracity of the vulnerability.
  • Not patched: This vulnerability was not successfully patched by the user reporting the sighting.