GHSA-H36F-RQPX-J5WX

Vulnerability from github – Published: 2026-05-08 20:03 – Updated: 2026-05-15 23:53
Summary
Open WebUI has Unauthorized File and Knowledge Base Content Access via RAG Vector Search
Details

Unauthorized File and Knowledge Base Content Access via RAG Vector Search

Affected Component

RAG source resolution in the chat completion pipeline:

  • backend/open_webui/retrieval/utils.py (lines 963-965, 1063-1068, 1126-1131 in get_sources_from_items)

Affected Versions

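The main branch at the time of the report (commit 6fdd19bf1), and likely all earlier versions with RAG functionality. The advisory metadata lists versions up to and including 0.8.12 as affected, with the fix in 0.9.0.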

Description

The get_sources_from_items function resolves file and knowledge base references into vector search queries during chat completion. Three of the five code paths perform vector store queries without any authorization check, allowing users to extract content from files and knowledge bases they do not have access to.

| Path | Lines | Access Check |
|------|-------|--------------|
| type: "file", full-context | 1044-1050 | ✅ has_access_to_file |
| type: "file", non-full-context (default) | 1063-1068 | ❌ None |
| type: "collection" | 1070-1118 | ✅ Present |
| type: "text" with collection_name | 963-965 | ❌ None |
| Bare collection_name/collection_names | 1126-1131 | ❌ None |

The three unprotected paths pass user-supplied collection names directly to query_collection(), which queries the vector store without any authorization check. Collection names follow predictable formats: file-<file_id> for files and the knowledge base UUID for knowledge bases.
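The missing check can be sketched as a filter applied to collection names before they reach the vector store. This is a minimal illustration that assumes the naming convention described above (file-<file_id> for files, a bare ID for knowledge bases). The function name filter_authorized_collections and both callbacks are hypothetical and not part of Open WebUI; the actual fix in 0.9.0 may be structured differently.

```python
from typing import Callable, Iterable, List

FILE_PREFIX = "file-"


def filter_authorized_collections(
    collection_names: Iterable[str],
    user_id: str,
    can_read_file: Callable[[str, str], bool],
    can_read_knowledge: Callable[[str, str], bool],
) -> List[str]:
    """Return only the collection names the user is allowed to query.

    Hypothetical helper: the two callbacks stand in for whatever access
    checks the application already has for files and knowledge bases.
    Each takes (resource_id, user_id) and returns True if access is allowed.
    """
    allowed = []
    for name in collection_names:
        if name.startswith(FILE_PREFIX):
            # "file-<file_id>" -> check access to the underlying file.
            resource_id = name[len(FILE_PREFIX):]
            permitted = can_read_file(resource_id, user_id)
        else:
            # Assumption: any other name is a knowledge base ID.
            permitted = can_read_knowledge(name, user_id)
        if permitted:
            allowed.append(name)
    return allowed
```

Filtering the names in one place, rather than at each call site, would cover all three unprotected paths with a single check and deny by default when a name maps to no resource the user can read.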

CVSS 3.1 Breakdown

| Metric | Value | Rationale |
|--------|-------|-----------|
| Attack Vector | Network (N) | Exploited remotely via chat completion API |
| Attack Complexity | Low (L) | Single API call with a known resource ID |
| Privileges Required | Low (L) | Requires a valid user account |
| User Interaction | None (N) | No victim interaction required |
| Scope | Unchanged (U) | Impact within the application's data boundary |
| Confidentiality | High (H) | Full content of private files/knowledge bases extractable |
| Integrity | None (N) | No data modification |
| Availability | None (N) | No denial of service |
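For reference, the base score implied by these metrics can be worked out from the CVSS v3.1 base-score equations. The sketch below handles only Scope: Unchanged vectors, which is all this advisory needs; the function names are illustrative.

```python
import math

# Metric weights from the CVSS v3.1 specification (Scope: Unchanged).
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "PR": {"N": 0.85, "L": 0.62, "H": 0.27},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}


def roundup(value: float) -> float:
    """Round up to one decimal place, as defined in the CVSS v3.1 spec."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score_scope_unchanged(vector: str) -> float:
    """Compute the base score for a CVSS:3.1 vector with S:U."""
    # Skip the "CVSS:3.1" prefix and map each metric to its value.
    parts = dict(item.split(":") for item in vector.split("/")[1:])
    if parts["S"] != "U":
        raise ValueError("this sketch only handles Scope: Unchanged")
    w = {metric: WEIGHTS[metric][parts[metric]] for metric in WEIGHTS}
    iss = 1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"])
    impact = 6.42 * iss
    exploitability = 8.22 * w["AV"] * w["AC"] * w["PR"] * w["UI"]
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


print(base_score_scope_unchanged("CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:H/I:N/A:N"))  # 6.5
```

A score of 6.5 falls in the Medium band (4.0 to 6.9), which is consistent with the MODERATE severity recorded in the advisory metadata.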

Attack Scenario

  1. User A uploads a private document and uses it in RAG (the document is embedded into the vector store as collection file-<file_id>).
  2. User A shares a chat or model referencing the file with User B, or User B otherwise obtains the file ID through a legitimate interaction.
  3. User A later revokes User B's access to the file.
  4. User B sends a chat completion request referencing the revoked file:

         POST /api/chat/completions
         {
           "model": "any-accessible-model",
           "messages": [{"role": "user", "content": "What does this document say about pricing?"}],
           "files": [{"type": "file", "id": "<revoked_file_id>"}]
         }

  5. The non-full-context path (default) constructs collection name file-<id> and queries the vector store with no access check.
  6. Matching chunks are injected into the LLM context, and the response contains the victim's private file content.

The same attack works via {"type": "text", "collection_name": "<knowledge_base_id>"} for knowledge bases.
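The revocation scenario can be reproduced in miniature with a toy model. Everything below is a stand-in written for illustration: ToyVectorStore, ToyAcl, resolve_unchecked, and resolve_checked are hypothetical names, and the substring match replaces real vector similarity search. It does not call any Open WebUI code.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class ToyVectorStore:
    """In-memory stand-in for a vector store: collection name -> chunks."""
    collections: Dict[str, List[str]] = field(default_factory=dict)

    def query(self, collection_name: str, term: str) -> List[str]:
        # Substring match stands in for similarity search.
        chunks = self.collections.get(collection_name, [])
        return [c for c in chunks if term.lower() in c.lower()]


@dataclass
class ToyAcl:
    """Set of (user, file_id) grants."""
    grants: Set[Tuple[str, str]] = field(default_factory=set)

    def has_access(self, user: str, file_id: str) -> bool:
        return (user, file_id) in self.grants


def resolve_unchecked(store: ToyVectorStore, file_id: str, term: str) -> List[str]:
    """Mirrors the unprotected path: builds file-<id> and queries directly."""
    return store.query(f"file-{file_id}", term)


def resolve_checked(
    store: ToyVectorStore, acl: ToyAcl, user: str, file_id: str, term: str
) -> List[str]:
    """Same lookup, but refuses when the caller has no grant on the file."""
    if not acl.has_access(user, file_id):
        return []
    return store.query(f"file-{file_id}", term)
```

With a store holding file-abc and a grant for User B that is later discarded, resolve_unchecked still returns the matching chunks for any caller, while resolve_checked returns an empty list for User B and the chunks for User A. This is the behavioral difference a regression test for the fix would need to assert.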

Impact

  • Access revocation is ineffective for RAG content — users who previously had access can continue extracting file and knowledge base content indefinitely
  • Private document content can be systematically extracted through targeted queries
  • Breaks the access control model for files and knowledge bases at the RAG layer

Preconditions

  • Attacker must know the file ID or knowledge base ID (UUID) of the target resource
  • The target file/knowledge base must have been processed into the vector store
  • Attacker must have a valid user account

{
  "affected": [
    {
      "database_specific": {
        "last_known_affected_version_range": "\u003c= 0.8.12"
      },
      "package": {
        "ecosystem": "PyPI",
        "name": "open-webui"
      },
      "ranges": [
        {
          "events": [
            {
              "introduced": "0"
            },
            {
              "fixed": "0.9.0"
            }
          ],
          "type": "ECOSYSTEM"
        }
      ]
    }
  ],
  "aliases": [
    "CVE-2026-44560"
  ],
  "database_specific": {
    "cwe_ids": [
      "CWE-862"
    ],
    "github_reviewed": true,
    "github_reviewed_at": "2026-05-08T20:03:09Z",
    "nvd_published_at": "2026-05-15T20:16:47Z",
    "severity": "MODERATE"
  },
  "details": "# Unauthorized File and Knowledge Base Content Access via RAG Vector Search\n\n## Affected Component\n\nRAG source resolution in chat completion pipeline:\n- `backend/open_webui/retrieval/utils.py` (lines 963-965, 1063-1068, 1126-1131 in `get_sources_from_items`)\n\n## Affected Versions\n\nCurrent main branch (commit `6fdd19bf1`) and likely all versions with RAG functionality.\n\n## Description\n\nThe `get_sources_from_items` function resolves file and knowledge base references into vector search queries during chat completion. Three of the five code paths perform vector store queries without any authorization check, allowing users to extract content from files and knowledge bases they do not have access to.\n\n| Path | Lines | Access Check |\n|------|-------|-------------|\n| `type: \"file\"`, full-context | 1044-1050 | \u2705 `has_access_to_file` |\n| `type: \"file\"`, non-full-context (default) | 1063-1068 | \u274c None |\n| `type: \"collection\"` | 1070-1118 | \u2705 Present |\n| `type: \"text\"` with `collection_name` | 963-965 | \u274c None |\n| Bare `collection_name`/`collection_names` | 1126-1131 | \u274c None |\n\nThe three unprotected paths pass user-supplied collection names directly to `query_collection()`, which queries the vector store without any authorization. Collection names follow predictable formats: `file-\u003cfile_id\u003e` for files and the knowledge base UUID for knowledge bases.\n\n## CVSS 3.1 Breakdown\n\n| Metric | Value | Rationale |\n|--------|-------|-----------|\n| Attack Vector | Network (N) | Exploited remotely via chat completion API |\n| Attack Complexity | Low (L) | Single API call with a known resource ID |\n| Privileges Required | Low (L) | Requires a valid user account |\n| User Interaction | None (N) | No victim interaction required |\n| Scope | Unchanged (U) | Impact within the application\u0027s data boundary |\n| Confidentiality | High (H) | Full content of private files/knowledge bases extractable |\n| Integrity | None (N) | No data modification |\n| Availability | None (N) | No denial of service |\n\n## Attack Scenario\n\n1. User A uploads a private document and uses it in RAG (the document is embedded into the vector store as collection `file-\u003cfile_id\u003e`).\n2. User A shares a chat or model referencing the file with User B, or User B otherwise obtains the file ID through a legitimate interaction.\n3. User A later revokes User B\u0027s access to the file.\n4. User B sends a chat completion request referencing the revoked file:\n   ```json\n   POST /api/chat/completions\n   {\n     \"model\": \"any-accessible-model\",\n     \"messages\": [{\"role\": \"user\", \"content\": \"What does this document say about pricing?\"}],\n     \"files\": [{\"type\": \"file\", \"id\": \"\u003crevoked_file_id\u003e\"}]\n   }\n   ```\n5. The non-full-context path (default) constructs collection name `file-\u003cid\u003e` and queries the vector store with no access check.\n6. Matching chunks are injected into the LLM context, and the response contains the victim\u0027s private file content.\n\nThe same attack works via `{\"type\": \"text\", \"collection_name\": \"\u003cknowledge_base_id\u003e\"}` for knowledge bases.\n\n## Impact\n\n- Access revocation is ineffective for RAG content \u2014 users who previously had access can continue extracting file and knowledge base content indefinitely\n- Private document content can be systematically extracted through targeted queries\n- Breaks the access control model for files and knowledge bases at the RAG layer\n\n## Preconditions\n\n- Attacker must know the file ID or knowledge base ID (UUID) of the target resource\n- The target file/knowledge base must have been processed into the vector store\n- Attacker must have a valid user account",
  "id": "GHSA-h36f-rqpx-j5wx",
  "modified": "2026-05-15T23:53:30Z",
  "published": "2026-05-08T20:03:09Z",
  "references": [
    {
      "type": "WEB",
      "url": "https://github.com/open-webui/open-webui/security/advisories/GHSA-h36f-rqpx-j5wx"
    },
    {
      "type": "ADVISORY",
      "url": "https://nvd.nist.gov/vuln/detail/CVE-2026-44560"
    },
    {
      "type": "PACKAGE",
      "url": "https://github.com/open-webui/open-webui"
    }
  ],
  "schema_version": "1.4.0",
  "severity": [
    {
      "score": "CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:H/I:N/A:N",
      "type": "CVSS_V3"
    }
  ],
  "summary": "Open WebUI has Unauthorized File and Knowledge Base Content Access via RAG Vector Search"
}
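
The "affected" ranges in the record above can be used to check whether an installed open-webui version predates the fix. The sketch below is a simplified reading of the OSV range format: it assumes a single range introduced at "0" and plain dotted numeric versions such as 0.8.12, and it does not handle pre-release or other PEP 440 suffixes. The RECORD dict is a trimmed copy of the record, and the function names are illustrative.

```python
from typing import Dict, List, Tuple

# Trimmed copy of the "affected" section of the OSV record above.
RECORD: Dict = {
    "affected": [
        {
            "package": {"ecosystem": "PyPI", "name": "open-webui"},
            "ranges": [
                {
                    "type": "ECOSYSTEM",
                    "events": [{"introduced": "0"}, {"fixed": "0.9.0"}],
                }
            ],
        }
    ]
}


def parse_version(version: str) -> Tuple[int, ...]:
    """Parse a plain dotted numeric version such as '0.8.12'."""
    return tuple(int(part) for part in version.split("."))


def fixed_versions(record: Dict) -> List[str]:
    """Collect every 'fixed' event from the record's affected ranges."""
    return [
        event["fixed"]
        for affected in record.get("affected", [])
        for version_range in affected.get("ranges", [])
        for event in version_range.get("events", [])
        if "fixed" in event
    ]


def is_affected(installed: str, record: Dict = RECORD) -> bool:
    """True if the installed version is older than a listed fixed version.

    Simplification: assumes each range starts at "0", as in this record.
    """
    current = parse_version(installed)
    return any(current < parse_version(fixed) for fixed in fixed_versions(record))
```

Comparing parsed tuples rather than raw strings matters here: as strings, "0.10.1" sorts before "0.9.0", which would wrongly flag a later release as affected.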

