gsd-2024-0243
Vulnerability from gsd
Modified
2024-01-05 06:02
Details
With the following crawler configuration: ```python from bs4 import BeautifulSoup as Soup url = "https://example.com" loader = RecursiveUrlLoader( url=url, max_depth=2, extractor=lambda x: Soup(x, "html.parser").text ) docs = loader.load() ``` An attacker in control of the contents of `https://example.com` could place a malicious HTML file in there with links like "https://example.completely.different/my_file.html" and the crawler would proceed to download that file as well even though `prevent_outside=True`. https://github.com/langchain-ai/langchain/blob/bf0b3cc0b5ade1fb95a5b1b6fa260e99064c2e22/libs/community/langchain_community/document_loaders/recursive_url_loader.py#L51-L51 Resolved in https://github.com/langchain-ai/langchain/pull/15559
Aliases



{
  "gsd": {
    "metadata": {
      "exploitCode": "unknown",
      "remediation": "unknown",
      "reportConfidence": "confirmed",
      "type": "vulnerability"
    },
    "osvSchema": {
      "aliases": [
        "CVE-2024-0243"
      ],
      "details": "With the following crawler configuration:\n\n```python\nfrom bs4 import BeautifulSoup as Soup\n\nurl = \"https://example.com\"\nloader = RecursiveUrlLoader(\n    url=url, max_depth=2, extractor=lambda x: Soup(x, \"html.parser\").text\n)\ndocs = loader.load()\n```\n\nAn attacker in control of the contents of `https://example.com` could place a malicious HTML file in there with links like \"https://example.completely.different/my_file.html\" and the crawler would proceed to download that file as well even though `prevent_outside=True`.\n\nhttps://github.com/langchain-ai/langchain/blob/bf0b3cc0b5ade1fb95a5b1b6fa260e99064c2e22/libs/community/langchain_community/document_loaders/recursive_url_loader.py#L51-L51\n\nResolved in https://github.com/langchain-ai/langchain/pull/15559",
      "id": "GSD-2024-0243",
      "modified": "2024-01-05T06:02:19.749927Z",
      "schema_version": "1.4.0"
    }
  },
  "namespaces": {
    "cve.org": {
      "CVE_data_meta": {
        "ASSIGNER": "security@huntr.com",
        "ID": "CVE-2024-0243",
        "STATE": "PUBLIC"
      },
      "affects": {
        "vendor": {
          "vendor_data": [
            {
              "product": {
                "product_data": [
                  {
                    "product_name": "langchain-ai/langchain",
                    "version": {
                      "version_data": [
                        {
                          "version_affected": "\u003c",
                          "version_name": "unspecified",
                          "version_value": "0.1.0"
                        }
                      ]
                    }
                  }
                ]
              },
              "vendor_name": "langchain-ai"
            }
          ]
        }
      },
      "data_format": "MITRE",
      "data_type": "CVE",
      "data_version": "4.0",
      "description": {
        "description_data": [
          {
            "lang": "eng",
            "value": "With the following crawler configuration:\n\n```python\nfrom bs4 import BeautifulSoup as Soup\n\nurl = \"https://example.com\"\nloader = RecursiveUrlLoader(\n    url=url, max_depth=2, extractor=lambda x: Soup(x, \"html.parser\").text\n)\ndocs = loader.load()\n```\n\nAn attacker in control of the contents of `https://example.com` could place a malicious HTML file in there with links like \"https://example.completely.different/my_file.html\" and the crawler would proceed to download that file as well even though `prevent_outside=True`.\n\nhttps://github.com/langchain-ai/langchain/blob/bf0b3cc0b5ade1fb95a5b1b6fa260e99064c2e22/libs/community/langchain_community/document_loaders/recursive_url_loader.py#L51-L51\n\nResolved in https://github.com/langchain-ai/langchain/pull/15559"
          }
        ]
      },
      "impact": {
        "cvss": [
          {
            "attackComplexity": "HIGH",
            "attackVector": "LOCAL",
            "availabilityImpact": "NONE",
            "baseScore": 3.7,
            "baseSeverity": "LOW",
            "confidentialityImpact": "LOW",
            "integrityImpact": "LOW",
            "privilegesRequired": "HIGH",
            "scope": "CHANGED",
            "userInteraction": "REQUIRED",
            "vectorString": "CVSS:3.0/AV:L/AC:H/PR:H/UI:R/S:C/C:L/I:L/A:N",
            "version": "3.0"
          }
        ]
      },
      "problemtype": {
        "problemtype_data": [
          {
            "description": [
              {
                "cweId": "CWE-918",
                "lang": "eng",
                "value": "CWE-918 Server-Side Request Forgery (SSRF)"
              }
            ]
          }
        ]
      },
      "references": {
        "reference_data": [
          {
            "name": "https://huntr.com/bounties/370904e7-10ac-40a4-a8d4-e2d16e1ca861",
            "refsource": "MISC",
            "url": "https://huntr.com/bounties/370904e7-10ac-40a4-a8d4-e2d16e1ca861"
          },
          {
            "name": "https://github.com/langchain-ai/langchain/commit/bf0b3cc0b5ade1fb95a5b1b6fa260e99064c2e22",
            "refsource": "MISC",
            "url": "https://github.com/langchain-ai/langchain/commit/bf0b3cc0b5ade1fb95a5b1b6fa260e99064c2e22"
          },
          {
            "name": "https://github.com/langchain-ai/langchain/pull/15559",
            "refsource": "MISC",
            "url": "https://github.com/langchain-ai/langchain/pull/15559"
          }
        ]
      },
      "source": {
        "advisory": "370904e7-10ac-40a4-a8d4-e2d16e1ca861",
        "discovery": "EXTERNAL"
      }
    },
    "nvd.nist.gov": {
      "cve": {
        "descriptions": [
          {
            "lang": "en",
            "value": "With the following crawler configuration:\n\n```python\nfrom bs4 import BeautifulSoup as Soup\n\nurl = \"https://example.com\"\nloader = RecursiveUrlLoader(\n    url=url, max_depth=2, extractor=lambda x: Soup(x, \"html.parser\").text\n)\ndocs = loader.load()\n```\n\nAn attacker in control of the contents of `https://example.com` could place a malicious HTML file in there with links like \"https://example.completely.different/my_file.html\" and the crawler would proceed to download that file as well even though `prevent_outside=True`.\n\nhttps://github.com/langchain-ai/langchain/blob/bf0b3cc0b5ade1fb95a5b1b6fa260e99064c2e22/libs/community/langchain_community/document_loaders/recursive_url_loader.py#L51-L51\n\nResolved in https://github.com/langchain-ai/langchain/pull/15559"
          },
          {
            "lang": "es",
            "value": "Con la siguiente configuraci\u00f3n del rastreador: ```python de bs4 import BeautifulSoup as Soup url = \"https://example.com\" loader = RecursiveUrlLoader( url=url, max_ Depth=2, extractor=lambda x: Soup(x, \"html .parser\").text ) docs = loader.load() ``` Un atacante que controle el contenido de `https://example.com` podr\u00eda colocar un archivo HTML malicioso all\u00ed con enlaces como \"https:/example.completely.different/my_file.html\" y el rastreador proceder\u00eda a descargar ese archivo tambi\u00e9n aunque `prevent_outside=True`. https://github.com/langchain-ai/langchain/blob/bf0b3cc0b5ade1fb95a5b1b6fa260e99064c2e22/libs/community/langchain_community/document_loaders/recursive_url_loader.py#L51-L51 Resuelto en https://github.com/langchain-ai/langchain/pull /15559"
          }
        ],
        "id": "CVE-2024-0243",
        "lastModified": "2024-03-13T21:15:55.173",
        "metrics": {
          "cvssMetricV30": [
            {
              "cvssData": {
                "attackComplexity": "HIGH",
                "attackVector": "LOCAL",
                "availabilityImpact": "NONE",
                "baseScore": 3.7,
                "baseSeverity": "LOW",
                "confidentialityImpact": "LOW",
                "integrityImpact": "LOW",
                "privilegesRequired": "HIGH",
                "scope": "CHANGED",
                "userInteraction": "REQUIRED",
                "vectorString": "CVSS:3.0/AV:L/AC:H/PR:H/UI:R/S:C/C:L/I:L/A:N",
                "version": "3.0"
              },
              "exploitabilityScore": 0.6,
              "impactScore": 2.7,
              "source": "security@huntr.dev",
              "type": "Secondary"
            }
          ]
        },
        "published": "2024-02-26T16:27:49.670",
        "references": [
          {
            "source": "security@huntr.dev",
            "url": "https://github.com/langchain-ai/langchain/commit/bf0b3cc0b5ade1fb95a5b1b6fa260e99064c2e22"
          },
          {
            "source": "security@huntr.dev",
            "url": "https://github.com/langchain-ai/langchain/pull/15559"
          },
          {
            "source": "security@huntr.dev",
            "url": "https://huntr.com/bounties/370904e7-10ac-40a4-a8d4-e2d16e1ca861"
          }
        ],
        "sourceIdentifier": "security@huntr.dev",
        "vulnStatus": "Awaiting Analysis",
        "weaknesses": [
          {
            "description": [
              {
                "lang": "en",
                "value": "CWE-918"
              }
            ],
            "source": "security@huntr.dev",
            "type": "Primary"
          }
        ]
      }
    }
  }
}


Log in or create an account to share your comment.




Tags
Taxonomy of the tags.


Loading...

Loading...

Loading...

Sightings

Author Source Type Date

Nomenclature

  • Seen: The vulnerability was mentioned, discussed, or seen somewhere by the user.
  • Confirmed: The vulnerability is confirmed from an analyst perspective.
  • Exploited: This vulnerability was exploited and seen by the user reporting the sighting.
  • Patched: This vulnerability was successfully patched by the user reporting the sighting.
  • Not exploited: This vulnerability was not exploited or seen by the user reporting the sighting.
  • Not confirmed: The user expresses doubt about the veracity of the vulnerability.
  • Not patched: This vulnerability was not successfully patched by the user reporting the sighting.