ghsa-p59w-9gqw-wj8r
Vulnerability from github
Published
2024-01-31 18:04
Modified
2024-11-22 17:56
Summary
Label Studio SSRF on Import Bypassing `SSRF_PROTECTION_ENABLED` Protections
Details

Introduction

This write-up describes a vulnerability found in Label Studio, a popular open source data labeling tool. The vulnerability affects all versions of Label Studio prior to 1.11.0 and was tested on version 1.8.2.

Overview

Label Studio's SSRF protections that can be enabled by setting the SSRF_PROTECTION_ENABLED environment variable can be bypassed to access internal web servers. This is because the current SSRF validation is done by executing a single DNS lookup to verify that the IP address is not in an excluded subnet range. This protection can be bypassed by either using HTTP redirection or performing a DNS rebinding attack.

Description

The following tasks_from_url method in label_studio/data_import/uploader.py performs the SSRF validation (validate_upload_url) before sending the request.

```python def tasks_from_url(file_upload_ids, project, user, url, could_be_tasks_list): """Download file using URL and read tasks from it""" # process URL with tasks try: filename = url.rsplit('/', 1)[-1]

    validate_upload_url(url, block_local_urls=settings.SSRF_PROTECTION_ENABLED)
    # Reason for #nosec: url has been validated as SSRF safe by the
    # validation check above.
    response = requests.get(
        url, verify=False, headers={'Accept-Encoding': None}
    )  # nosec
    file_content = response.content
    check_tasks_max_file_size(int(response.headers['content-length']))
    file_upload = create_file_upload(
        user, project, SimpleUploadedFile(filename, file_content)
    )
    if file_upload.format_could_be_tasks_list:
        could_be_tasks_list = True
    file_upload_ids.append(file_upload.id)
    tasks, found_formats, data_keys = FileUpload.load_tasks_from_uploaded_files(
        project, file_upload_ids
    )

except ValidationError as e:
    raise e
except Exception as e:
    raise ValidationError(str(e))
return data_keys, found_formats, tasks, file_upload_ids, could_be_tasks_list

```

The validate_upload_url code in label_studio/core/utils/io.py is shown below.

```python def validate_upload_url(url, block_local_urls=True): """Utility function for defending against SSRF attacks. Raises - InvalidUploadUrlError if the url is not HTTP[S], or if block_local_urls is enabled and the URL resolves to a local address. - LabelStudioApiException if the hostname cannot be resolved

:param url: Url to be checked for validity/safety,
:param block_local_urls: Whether urls that resolve to local/private networks should be allowed.
"""

parsed_url = parse_url(url)

if parsed_url.scheme not in ('http', 'https'):
    raise InvalidUploadUrlError

domain = parsed_url.host
try:
    ip = socket.gethostbyname(domain)
except socket.error:
    from core.utils.exceptions import LabelStudioAPIException
    raise LabelStudioAPIException(f"Can't resolve hostname {domain}")

if not block_local_urls:
    return

if ip == '0.0.0.0':  # nosec
    raise InvalidUploadUrlError
local_subnets = [
    '127.0.0.0/8',
    '10.0.0.0/8',
    '172.16.0.0/12',
    '192.168.0.0/16',
]
for subnet in local_subnets:
    if ipaddress.ip_address(ip) in ipaddress.ip_network(subnet):
        raise InvalidUploadUrlError

```

The issue here is the SSRF validation is only performed before the request is sent, and does not validate the destination IP address. Therefore, an attacker can either redirect the request or perform a DNS rebinding attack to bypass this protection.

Proof of Concept

Both the HTTP redirection and DNS rebinding methods for bypassing Label Studio's SSRF protections are explained below.

HTTP Redirection

The python requests module automatically follows HTTP redirects (eg. response code 301 and 302). Therefore, an attacker could use a URL shortener (eg. https://www.shorturl.at/) or host the following Python code on an external server to redirect request from a Label Studio server to an internal web server.

```python from http.server import BaseHTTPRequestHandler, HTTPServer

class RedirectHandler(BaseHTTPRequestHandler):

def do_GET(self):
    self.send_response(301)
    # skip first slash
    self.send_header('Location', self.path[1:])
    self.end_headers()

HTTPServer(("", 8080), RedirectHandler).serve_forever() ```

DNS Rebinding Attack

DNS rebinding can bypass SSRF protections by resolving to an external IP address for the first resolution, but when the request is sent resolves to an internal IP address that is blocked. For an example, the domain 7f000001.030d1fd6.rbndr.us will randomly switch between the IP address 3.13.31.214 that is not blocked to 127.0.0.1 which is not allowed.

Impact

SSRF vulnerabilities pose a significant risk on cloud environments, since instance credentials are managed by internal web APIs. An attacker can bypass Label Studio's SSRF protections to access internal web servers and partially compromise the confidentiality of those internal servers.

Remediation Advice

  • Before saving any responses, validate the destination IP address is not in the deny list.
  • Consider blocking internal cloud API IP ranges to mitigate the risk of compromising cloud credentials.

Discovered

  • August 2023, Alex Brown, elttam
Show details on source website


{
  "affected": [
    {
      "package": {
        "ecosystem": "PyPI",
        "name": "label-studio"
      },
      "ranges": [
        {
          "events": [
            {
              "introduced": "0"
            },
            {
              "fixed": "1.11.0"
            }
          ],
          "type": "ECOSYSTEM"
        }
      ]
    }
  ],
  "aliases": [
    "CVE-2023-47116"
  ],
  "database_specific": {
    "cwe_ids": [
      "CWE-918"
    ],
    "github_reviewed": true,
    "github_reviewed_at": "2024-01-31T18:04:40Z",
    "nvd_published_at": "2024-01-31T17:15:13Z",
    "severity": "MODERATE"
  },
  "details": "# Introduction\n\nThis write-up describes a vulnerability found in [Label Studio](https://github.com/HumanSignal/label-studio), a popular open source data labeling tool. The vulnerability affects all versions of Label Studio prior to [`1.11.0`](https://github.com/HumanSignal/label-studio/releases/tag/1.11.0) and was tested on version `1.8.2`.\n\n# Overview\n\nLabel Studio\u0027s SSRF protections that can be enabled by setting the `SSRF_PROTECTION_ENABLED` environment variable can be bypassed to access internal web servers. This is because the current SSRF validation is done by executing a single DNS lookup to verify that the IP address is not in an excluded subnet range. This protection can be bypassed by either using HTTP redirection or performing a [DNS rebinding attack](https://en.wikipedia.org/wiki/DNS_rebinding).\n\n# Description\n\nThe following `tasks_from_url` method in [`label_studio/data_import/uploader.py`](https://github.com/HumanSignal/label-studio/blob/1.8.2/label_studio/data_import/uploader.py#L127-L155) performs the SSRF validation (`validate_upload_url`) before sending the request.\n\n```python\ndef tasks_from_url(file_upload_ids, project, user, url, could_be_tasks_list):\n    \"\"\"Download file using URL and read tasks from it\"\"\"\n    # process URL with tasks\n    try:\n        filename = url.rsplit(\u0027/\u0027, 1)[-1]\n\n        validate_upload_url(url, block_local_urls=settings.SSRF_PROTECTION_ENABLED)\n        # Reason for #nosec: url has been validated as SSRF safe by the\n        # validation check above.\n        response = requests.get(\n            url, verify=False, headers={\u0027Accept-Encoding\u0027: None}\n        )  # nosec\n        file_content = response.content\n        check_tasks_max_file_size(int(response.headers[\u0027content-length\u0027]))\n        file_upload = create_file_upload(\n            user, project, SimpleUploadedFile(filename, file_content)\n        )\n        if file_upload.format_could_be_tasks_list:\n            could_be_tasks_list = True\n        file_upload_ids.append(file_upload.id)\n        tasks, found_formats, data_keys = FileUpload.load_tasks_from_uploaded_files(\n            project, file_upload_ids\n        )\n\n    except ValidationError as e:\n        raise e\n    except Exception as e:\n        raise ValidationError(str(e))\n    return data_keys, found_formats, tasks, file_upload_ids, could_be_tasks_list\n```\n\nThe `validate_upload_url` code in [`label_studio/core/utils/io.py`](https://github.com/HumanSignal/label-studio/blob/1.8.2/label_studio/core/utils/io.py#L174-L209) is shown below.\n\n```python\ndef validate_upload_url(url, block_local_urls=True):\n    \"\"\"Utility function for defending against SSRF attacks. Raises\n        - InvalidUploadUrlError if the url is not HTTP[S], or if block_local_urls is enabled\n          and the URL resolves to a local address.\n        - LabelStudioApiException if the hostname cannot be resolved\n\n    :param url: Url to be checked for validity/safety,\n    :param block_local_urls: Whether urls that resolve to local/private networks should be allowed.\n    \"\"\"\n\n    parsed_url = parse_url(url)\n\n    if parsed_url.scheme not in (\u0027http\u0027, \u0027https\u0027):\n        raise InvalidUploadUrlError\n\n    domain = parsed_url.host\n    try:\n        ip = socket.gethostbyname(domain)\n    except socket.error:\n        from core.utils.exceptions import LabelStudioAPIException\n        raise LabelStudioAPIException(f\"Can\u0027t resolve hostname {domain}\")\n\n    if not block_local_urls:\n        return\n\n    if ip == \u00270.0.0.0\u0027:  # nosec\n        raise InvalidUploadUrlError\n    local_subnets = [\n        \u0027127.0.0.0/8\u0027,\n        \u002710.0.0.0/8\u0027,\n        \u0027172.16.0.0/12\u0027,\n        \u0027192.168.0.0/16\u0027,\n    ]\n    for subnet in local_subnets:\n        if ipaddress.ip_address(ip) in ipaddress.ip_network(subnet):\n            raise InvalidUploadUrlError\n```\n\nThe issue here is the SSRF validation is only performed before the request is sent, and does not validate the destination IP address. Therefore, an attacker can either redirect the request or perform a DNS rebinding attack to bypass this protection.\n\n# Proof of Concept\n\nBoth the HTTP redirection and DNS rebinding methods for bypassing Label Studio\u0027s SSRF protections are explained below.\n\n### HTTP Redirection\n\nThe python `requests` module automatically follows HTTP redirects (eg. response code `301` and `302`). Therefore, an attacker could use a URL shortener (eg. `https://www.shorturl.at/`) or host the following Python code on an external server to redirect request from a Label Studio server to an internal web server.\n\n```python\nfrom http.server import BaseHTTPRequestHandler, HTTPServer\n\nclass RedirectHandler(BaseHTTPRequestHandler):\n\n    def do_GET(self):\n        self.send_response(301)\n        # skip first slash\n        self.send_header(\u0027Location\u0027, self.path[1:])\n        self.end_headers()\n\nHTTPServer((\"\", 8080), RedirectHandler).serve_forever()\n```\n\n### DNS Rebinding Attack\n\nDNS rebinding can bypass SSRF protections by resolving to an external IP address for the first resolution, but when the request is sent resolves to an internal IP address that is blocked. For an example, the domain `7f000001.030d1fd6.rbndr.us` will randomly switch between the IP address `3.13.31.214` that is not blocked to `127.0.0.1` which is not allowed.\n\n# Impact\n\nSSRF vulnerabilities pose a significant risk on cloud environments, since instance credentials are managed by internal web APIs. An attacker can bypass Label Studio\u0027s SSRF protections to access internal web servers and partially compromise the confidentiality of those internal servers.\n\n# Remediation Advice\n\n* Before saving any responses, validate the destination IP address is not in the deny list.\n* Consider blocking internal cloud API IP ranges to mitigate the risk of compromising cloud credentials.\n\n# Discovered\n- August 2023, Alex Brown, elttam",
  "id": "GHSA-p59w-9gqw-wj8r",
  "modified": "2024-11-22T17:56:15Z",
  "published": "2024-01-31T18:04:40Z",
  "references": [
    {
      "type": "WEB",
      "url": "https://github.com/HumanSignal/label-studio/security/advisories/GHSA-p59w-9gqw-wj8r"
    },
    {
      "type": "ADVISORY",
      "url": "https://nvd.nist.gov/vuln/detail/CVE-2023-47116"
    },
    {
      "type": "WEB",
      "url": "https://github.com/HumanSignal/label-studio/commit/55dd6af4716b92f2bb213fe461d1ffbc380c6a64"
    },
    {
      "type": "WEB",
      "url": "https://en.wikipedia.org/wiki/DNS_rebinding"
    },
    {
      "type": "PACKAGE",
      "url": "https://github.com/HumanSignal/label-studio"
    },
    {
      "type": "WEB",
      "url": "https://github.com/HumanSignal/label-studio/blob/1.8.2/label_studio/core/utils/io.py#L174-L209"
    },
    {
      "type": "WEB",
      "url": "https://github.com/HumanSignal/label-studio/blob/1.8.2/label_studio/data_import/uploader.py#L127-L155"
    },
    {
      "type": "WEB",
      "url": "https://github.com/HumanSignal/label-studio/releases/tag/1.11.0"
    },
    {
      "type": "WEB",
      "url": "https://github.com/pypa/advisory-database/tree/main/vulns/label-studio/PYSEC-2024-127.yaml"
    }
  ],
  "schema_version": "1.4.0",
  "severity": [
    {
      "score": "CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:L/I:N/A:N",
      "type": "CVSS_V3"
    }
  ],
  "summary": "Label Studio SSRF on Import Bypassing `SSRF_PROTECTION_ENABLED` Protections"
}


Log in or create an account to share your comment.




Tags
Taxonomy of the tags.


Loading...

Loading...

Loading...

Sightings

Author Source Type Date

Nomenclature

  • Seen: The vulnerability was mentioned, discussed, or seen somewhere by the user.
  • Confirmed: The vulnerability is confirmed from an analyst perspective.
  • Exploited: This vulnerability was exploited and seen by the user reporting the sighting.
  • Patched: This vulnerability was successfully patched by the user reporting the sighting.
  • Not exploited: This vulnerability was not exploited or seen by the user reporting the sighting.
  • Not confirmed: The user expresses doubt about the veracity of the vulnerability.
  • Not patched: This vulnerability was not successfully patched by the user reporting the sighting.