gsd-2024-0243
Vulnerability from gsd
Modified
2024-01-05 06:02
Details
With the following crawler configuration:
```python
from bs4 import BeautifulSoup as Soup
url = "https://example.com"
loader = RecursiveUrlLoader(
url=url, max_depth=2, extractor=lambda x: Soup(x, "html.parser").text
)
docs = loader.load()
```
An attacker in control of the contents of `https://example.com` could place a malicious HTML file in there with links like "https://example.completely.different/my_file.html" and the crawler would proceed to download that file as well even though `prevent_outside=True`.
https://github.com/langchain-ai/langchain/blob/bf0b3cc0b5ade1fb95a5b1b6fa260e99064c2e22/libs/community/langchain_community/document_loaders/recursive_url_loader.py#L51-L51
Resolved in https://github.com/langchain-ai/langchain/pull/15559
Aliases
{ "gsd": { "metadata": { "exploitCode": "unknown", "remediation": "unknown", "reportConfidence": "confirmed", "type": "vulnerability" }, "osvSchema": { "aliases": [ "CVE-2024-0243" ], "details": "With the following crawler configuration:\n\n```python\nfrom bs4 import BeautifulSoup as Soup\n\nurl = \"https://example.com\"\nloader = RecursiveUrlLoader(\n url=url, max_depth=2, extractor=lambda x: Soup(x, \"html.parser\").text\n)\ndocs = loader.load()\n```\n\nAn attacker in control of the contents of `https://example.com` could place a malicious HTML file in there with links like \"https://example.completely.different/my_file.html\" and the crawler would proceed to download that file as well even though `prevent_outside=True`.\n\nhttps://github.com/langchain-ai/langchain/blob/bf0b3cc0b5ade1fb95a5b1b6fa260e99064c2e22/libs/community/langchain_community/document_loaders/recursive_url_loader.py#L51-L51\n\nResolved in https://github.com/langchain-ai/langchain/pull/15559", "id": "GSD-2024-0243", "modified": "2024-01-05T06:02:19.749927Z", "schema_version": "1.4.0" } }, "namespaces": { "cve.org": { "CVE_data_meta": { "ASSIGNER": "security@huntr.com", "ID": "CVE-2024-0243", "STATE": "PUBLIC" }, "affects": { "vendor": { "vendor_data": [ { "product": { "product_data": [ { "product_name": "langchain-ai/langchain", "version": { "version_data": [ { "version_affected": "\u003c", "version_name": "unspecified", "version_value": "0.1.0" } ] } } ] }, "vendor_name": "langchain-ai" } ] } }, "data_format": "MITRE", "data_type": "CVE", "data_version": "4.0", "description": { "description_data": [ { "lang": "eng", "value": "With the following crawler configuration:\n\n```python\nfrom bs4 import BeautifulSoup as Soup\n\nurl = \"https://example.com\"\nloader = RecursiveUrlLoader(\n url=url, max_depth=2, extractor=lambda x: Soup(x, \"html.parser\").text\n)\ndocs = loader.load()\n```\n\nAn attacker in control of the contents of `https://example.com` could place a malicious HTML file in there with links like \"https://example.completely.different/my_file.html\" and the crawler would proceed to download that file as well even though `prevent_outside=True`.\n\nhttps://github.com/langchain-ai/langchain/blob/bf0b3cc0b5ade1fb95a5b1b6fa260e99064c2e22/libs/community/langchain_community/document_loaders/recursive_url_loader.py#L51-L51\n\nResolved in https://github.com/langchain-ai/langchain/pull/15559" } ] }, "impact": { "cvss": [ { "attackComplexity": "HIGH", "attackVector": "LOCAL", "availabilityImpact": "NONE", "baseScore": 3.7, "baseSeverity": "LOW", "confidentialityImpact": "LOW", "integrityImpact": "LOW", "privilegesRequired": "HIGH", "scope": "CHANGED", "userInteraction": "REQUIRED", "vectorString": "CVSS:3.0/AV:L/AC:H/PR:H/UI:R/S:C/C:L/I:L/A:N", "version": "3.0" } ] }, "problemtype": { "problemtype_data": [ { "description": [ { "cweId": "CWE-918", "lang": "eng", "value": "CWE-918 Server-Side Request Forgery (SSRF)" } ] } ] }, "references": { "reference_data": [ { "name": "https://huntr.com/bounties/370904e7-10ac-40a4-a8d4-e2d16e1ca861", "refsource": "MISC", "url": "https://huntr.com/bounties/370904e7-10ac-40a4-a8d4-e2d16e1ca861" }, { "name": "https://github.com/langchain-ai/langchain/commit/bf0b3cc0b5ade1fb95a5b1b6fa260e99064c2e22", "refsource": "MISC", "url": "https://github.com/langchain-ai/langchain/commit/bf0b3cc0b5ade1fb95a5b1b6fa260e99064c2e22" }, { "name": "https://github.com/langchain-ai/langchain/pull/15559", "refsource": "MISC", "url": "https://github.com/langchain-ai/langchain/pull/15559" } ] }, "source": { "advisory": "370904e7-10ac-40a4-a8d4-e2d16e1ca861", "discovery": "EXTERNAL" } }, "nvd.nist.gov": { "cve": { "descriptions": [ { "lang": "en", "value": "With the following crawler configuration:\n\n```python\nfrom bs4 import BeautifulSoup as Soup\n\nurl = \"https://example.com\"\nloader = RecursiveUrlLoader(\n url=url, max_depth=2, extractor=lambda x: Soup(x, \"html.parser\").text\n)\ndocs = loader.load()\n```\n\nAn attacker in control of the contents of `https://example.com` could place a malicious HTML file in there with links like \"https://example.completely.different/my_file.html\" and the crawler would proceed to download that file as well even though `prevent_outside=True`.\n\nhttps://github.com/langchain-ai/langchain/blob/bf0b3cc0b5ade1fb95a5b1b6fa260e99064c2e22/libs/community/langchain_community/document_loaders/recursive_url_loader.py#L51-L51\n\nResolved in https://github.com/langchain-ai/langchain/pull/15559" }, { "lang": "es", "value": "Con la siguiente configuraci\u00f3n del rastreador: ```python de bs4 import BeautifulSoup as Soup url = \"https://example.com\" loader = RecursiveUrlLoader( url=url, max_ Depth=2, extractor=lambda x: Soup(x, \"html .parser\").text ) docs = loader.load() ``` Un atacante que controle el contenido de `https://example.com` podr\u00eda colocar un archivo HTML malicioso all\u00ed con enlaces como \"https:/example.completely.different/my_file.html\" y el rastreador proceder\u00eda a descargar ese archivo tambi\u00e9n aunque `prevent_outside=True`. https://github.com/langchain-ai/langchain/blob/bf0b3cc0b5ade1fb95a5b1b6fa260e99064c2e22/libs/community/langchain_community/document_loaders/recursive_url_loader.py#L51-L51 Resuelto en https://github.com/langchain-ai/langchain/pull /15559" } ], "id": "CVE-2024-0243", "lastModified": "2024-03-13T21:15:55.173", "metrics": { "cvssMetricV30": [ { "cvssData": { "attackComplexity": "HIGH", "attackVector": "LOCAL", "availabilityImpact": "NONE", "baseScore": 3.7, "baseSeverity": "LOW", "confidentialityImpact": "LOW", "integrityImpact": "LOW", "privilegesRequired": "HIGH", "scope": "CHANGED", "userInteraction": "REQUIRED", "vectorString": "CVSS:3.0/AV:L/AC:H/PR:H/UI:R/S:C/C:L/I:L/A:N", "version": "3.0" }, "exploitabilityScore": 0.6, "impactScore": 2.7, "source": "security@huntr.dev", "type": "Secondary" } ] }, "published": "2024-02-26T16:27:49.670", "references": [ { "source": "security@huntr.dev", "url": "https://github.com/langchain-ai/langchain/commit/bf0b3cc0b5ade1fb95a5b1b6fa260e99064c2e22" }, { "source": "security@huntr.dev", "url": "https://github.com/langchain-ai/langchain/pull/15559" }, { "source": "security@huntr.dev", "url": "https://huntr.com/bounties/370904e7-10ac-40a4-a8d4-e2d16e1ca861" } ], "sourceIdentifier": "security@huntr.dev", "vulnStatus": "Awaiting Analysis", "weaknesses": [ { "description": [ { "lang": "en", "value": "CWE-918" } ], "source": "security@huntr.dev", "type": "Primary" } ] } } } }
Loading...
Loading...
Sightings
Author | Source | Type | Date |
---|
Nomenclature
- Seen: The vulnerability was mentioned, discussed, or seen somewhere by the user.
- Confirmed: The vulnerability is confirmed from an analyst perspective.
- Exploited: This vulnerability was exploited and seen by the user reporting the sighting.
- Patched: This vulnerability was successfully patched by the user reporting the sighting.
- Not exploited: This vulnerability was not exploited or seen by the user reporting the sighting.
- Not confirmed: The user expresses doubt about the veracity of the vulnerability.
- Not patched: This vulnerability was not successfully patched by the user reporting the sighting.