GHSA-X3M8-F7G5-QHM7
Vulnerability from github – Published: 2025-03-19 15:55 – Updated: 2025-07-02 14:20
Summary
When vLLM is configured to use Mooncake, unsafe deserialization exposed directly over ZMQ/TCP will allow attackers to execute remote code on distributed hosts.
Details
- Pickle deserialization vulnerabilities are well documented.
- The mooncake pipe is exposed over the network (by design, to enable disaggregated prefilling across distributed environments) using ZMQ over TCP, which greatly increases exploitability. ~~Further, the mooncake integration opens these sockets listening on all interfaces on the host, meaning it can not be configured to only use a private, trusted network.~~
  Only sender_socket and receiver_ack are allowed to be accessed publicly, while the data actually deserialized by pickle.loads() comes from recv_bytes. Its interface is defined as self.receiver_socket.connect(f"tcp://{d_host}:{d_rank_offset + 1}"), where d_host is decode_host, a locally defined address, 192.168.0.139, from mooncake.json (https://github.com/kvcache-ai/Mooncake/blob/main/doc/en/vllm-integration-v0.2.md?plain=1#L36).
- The root problem is that recv_tensor() calls _recv_impl, which passes the raw network bytes to pickle.loads(). Additionally, there do not appear to be any controls (network, authentication, etc.) to prevent arbitrary users from sending this payload to the affected service.
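To illustrate why passing untrusted bytes to pickle.loads() amounts to code execution, here is a minimal, harmless sketch. It is not taken from vLLM or Mooncake: the `Payload` class, `wire_bytes`, and `restricted_loads` are hypothetical names, and the "attack" only evaluates an arithmetic expression. The second half follows the "restricting globals" pattern from the Python pickle documentation, shown as one possible hardening rather than as the fix that was adopted.

```python
import io
import pickle


class Payload:
    """Hypothetical stand-in for attacker-controlled data."""

    def __reduce__(self):
        # pickle records "call eval('6 * 7')" and replays it at load time.
        # The expression is harmless here; an attacker picks the callable.
        return (eval, ("6 * 7",))


# Bytes as they might arrive from a socket (assumption: no authentication).
wire_bytes = pickle.dumps(Payload())

# The receiver only wanted data, yet the embedded callable runs on load.
result = pickle.loads(wire_bytes)
print("value produced by code that ran during unpickling:", result)


class NoGlobalsUnpickler(pickle.Unpickler):
    """Unpickler that refuses every global lookup (functions, classes)."""

    def find_class(self, module, name):
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def restricted_loads(data: bytes):
    """Load a pickle that may contain only plain built-in containers and scalars."""
    return NoGlobalsUnpickler(io.BytesIO(data)).load()


try:
    restricted_loads(wire_bytes)
except pickle.UnpicklingError as exc:
    print("blocked:", exc)
```

Restricting globals narrows the attack surface but still leaves the pickle machinery parsing attacker-controlled input, so a data-only wire format is generally the more conservative choice.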
Impact
This is a remote code execution vulnerability impacting any deployments using Mooncake to distribute KV across distributed hosts.
Remediation
This issue is resolved by https://github.com/vllm-project/vllm/pull/14228
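The advisory does not describe how the linked pull request changes the code. As a general illustration only, the sketch below shows one pickle-free way to move array data across a socket: a length-prefixed JSON header followed by raw bytes, with every header field validated before use. The function names, the header fields, and the typecode allow-list are assumptions made for this example and are not vLLM or Mooncake APIs.

```python
import json
import struct
from array import array

# Assumption for this sketch: only a few numeric typecodes are accepted.
ALLOWED_TYPECODES = {"f", "d", "i", "q"}
MAX_HEADER_BYTES = 4096


def encode(values: array, shape: tuple) -> bytes:
    """Frame as: 4-byte little-endian header length, JSON header, raw data."""
    header = json.dumps(
        {"typecode": values.typecode, "shape": list(shape)}
    ).encode("utf-8")
    return struct.pack("<I", len(header)) + header + values.tobytes()


def decode(buf: bytes):
    """Parse a frame produced by encode(); raise ValueError on anything odd."""
    if len(buf) < 4:
        raise ValueError("frame too short")
    (header_len,) = struct.unpack_from("<I", buf, 0)
    if header_len > MAX_HEADER_BYTES or 4 + header_len > len(buf):
        raise ValueError("implausible header length")

    # json.loads only ever yields plain data; it cannot invoke callables.
    header = json.loads(buf[4:4 + header_len].decode("utf-8"))
    if not isinstance(header, dict):
        raise ValueError("header must be a JSON object")

    typecode = header.get("typecode")
    shape = header.get("shape")
    if typecode not in ALLOWED_TYPECODES:
        raise ValueError("unsupported typecode")
    if not isinstance(shape, list) or not all(
        isinstance(dim, int) and dim >= 0 for dim in shape
    ):
        raise ValueError("invalid shape")

    values = array(typecode)
    payload = buf[4 + header_len:]
    expected_items = 1
    for dim in shape:
        expected_items *= dim
    if expected_items * values.itemsize != len(payload):
        raise ValueError("payload size does not match shape")

    # Note: array uses native byte order; both peers must agree on it.
    values.frombytes(payload)
    return values, tuple(shape)


frame = encode(array("f", [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]), (2, 3))
decoded_values, decoded_shape = decode(frame)
```

A data-only format does not replace network controls or authentication; it only ensures that a malformed or hostile frame results in a parse error rather than code execution.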
{
"affected": [
{
"package": {
"ecosystem": "PyPI",
"name": "vllm"
},
"ranges": [
{
"events": [
{
"introduced": "0.6.5"
},
{
"fixed": "0.8.0"
}
],
"type": "ECOSYSTEM"
}
]
}
],
"aliases": [
"CVE-2025-29783"
],
"database_specific": {
"cwe_ids": [
"CWE-502"
],
"github_reviewed": true,
"github_reviewed_at": "2025-03-19T15:55:58Z",
"nvd_published_at": "2025-03-19T16:15:32Z",
"severity": "CRITICAL"
},
"details": "### Summary\nWhen vLLM is configured to use Mooncake, unsafe deserialization exposed directly over ZMQ/TCP will allow attackers to execute remote code on distributed hosts.\n\n### Details\n1. Pickle deserialization vulnerabilities are [well documented](https://docs.python.org/3/library/pickle.html).\n2. The [mooncake pipe](https://github.com/vllm-project/vllm/blob/9bebc9512f9340e94579b9bd69cfdc452c4d5bb0/vllm/distributed/kv_transfer/kv_pipe/mooncake_pipe.py#L206) is exposed over the network (by design to enable disaggregated prefilling across distributed environments) using ZMQ over TCP, greatly increasing exploitability. ~~Further, the mooncake integration opens these sockets listening on all interfaces on the host, meaning it can not be configured to only use a private, trusted network.~~\n\nOnly `sender_socket` and `receiver_ack` are allowed to be accessed publicly, while the data actually decompressed by `pickle.loads()` comes from [recv_bytes](https://github.com/vllm-project/vllm/blob/9bebc9512f9340e94579b9bd69cfdc452c4d5bb0/vllm/distributed/kv_transfer/kv_pipe/mooncake_pipe.py#L257). Its interface is defined as `self.receiver_socket.connect(f\\\"tcp://{d_host}:{d_rank_offset + 1}\\\")`, where `d_host` is `decode_host`, a locally defined address 192.168.0.139,from mooncake.json (https://github.com/kvcache-ai/Mooncake/blob/main/doc/en/vllm-integration-v0.2.md?plain=1#L36).\n\n3. The root problem is [`recv_tensor()`](https://github.com/vllm-project/vllm/blob/9bebc9512f9340e94579b9bd69cfdc452c4d5bb0/vllm/distributed/kv_transfer/kv_pipe/mooncake_pipe.py#L257) calls [`_recv_impl`](https://github.com/vllm-project/vllm/blob/9bebc9512f9340e94579b9bd69cfdc452c4d5bb0/vllm/distributed/kv_transfer/kv_pipe/mooncake_pipe.py#L244) which passes the raw network bytes to `pickle.loads()`. \nAdditionally, it does not appear that there are any controls (network, authentication, etc) to prevent arbitrary users from sending this payload to the affected service.\n\n\n\n### Impact\nThis is a remote code execution vulnerability impacting any deployments using Mooncake to distribute KV across distributed hosts.\n\n### Remediation\nThis issue is resolved by https://github.com/vllm-project/vllm/pull/14228",
"id": "GHSA-x3m8-f7g5-qhm7",
"modified": "2025-07-02T14:20:40Z",
"published": "2025-03-19T15:55:58Z",
"references": [
{
"type": "WEB",
"url": "https://github.com/vllm-project/vllm/security/advisories/GHSA-x3m8-f7g5-qhm7"
},
{
"type": "ADVISORY",
"url": "https://nvd.nist.gov/vuln/detail/CVE-2025-29783"
},
{
"type": "WEB",
"url": "https://github.com/vllm-project/vllm/pull/14228"
},
{
"type": "WEB",
"url": "https://github.com/vllm-project/vllm/commit/288ca110f68d23909728627d3100e5a8db820aa2"
},
{
"type": "WEB",
"url": "https://github.com/pypa/advisory-database/tree/main/vulns/vllm/PYSEC-2025-63.yaml"
},
{
"type": "PACKAGE",
"url": "https://github.com/vllm-project/vllm"
}
],
"schema_version": "1.4.0",
"severity": [
{
"score": "CVSS:3.1/AV:A/AC:L/PR:L/UI:N/S:C/C:H/I:H/A:H",
"type": "CVSS_V3"
}
],
"summary": "vLLM Allows Remote Code Execution via Mooncake Integration"
}
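The affected range in the record above (introduced 0.6.5, fixed 0.8.0) can be checked against an installed version with a few lines of Python. This is a rough sketch with hypothetical helper names; it compares only the numeric release segment, so pre-release or post-release suffixes are ignored. For full PEP 440 ordering, a dedicated parser such as the `packaging` library is a better fit.

```python
import re

# Bounds copied from the "events" list in the advisory record.
INTRODUCED = (0, 6, 5)
FIXED = (0, 8, 0)


def parse_release(version: str) -> tuple:
    """Extract (major, minor, patch) from a version string such as '0.7.3'."""
    match = re.match(r"^v?(\d+)\.(\d+)(?:\.(\d+))?", version.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    # A missing patch component is treated as 0 (assumption of this sketch).
    return tuple(int(part) if part else 0 for part in match.groups())


def is_affected(version: str) -> bool:
    """True when INTRODUCED <= version < FIXED, using the numeric release only."""
    return INTRODUCED <= parse_release(version) < FIXED


for candidate in ("0.6.4", "0.6.5", "0.7.3", "0.8.0"):
    status = "affected" if is_affected(candidate) else "not affected"
    print(candidate, "->", status)
```

Because suffixes are dropped, a string like a release candidate of the fixed version would be reported as not affected here, which may not match ecosystem ordering; treat borderline results with care.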
Sightings
| Author | Source | Type | Date |
|---|---|---|---|
Nomenclature
- Seen: The vulnerability was mentioned, discussed, or observed by the user.
- Confirmed: The vulnerability has been validated from an analyst's perspective.
- Published Proof of Concept: A public proof of concept is available for this vulnerability.
- Exploited: The vulnerability was observed as exploited by the user who reported the sighting.
- Patched: The vulnerability was observed as successfully patched by the user who reported the sighting.
- Not exploited: The vulnerability was not observed as exploited by the user who reported the sighting.
- Not confirmed: The user expressed doubt about the validity of the vulnerability.
- Not patched: The vulnerability was not observed as successfully patched by the user who reported the sighting.