{"uuid": "4a76e606-34ef-4eeb-af68-5e7a7cf64161", "vulnerability_lookup_origin": "1a89b78e-f703-45f3-bb86-59eb712668bd", "author": "9f56dd64-161d-43a6-b9c3-555944290a09", "vulnerability": "CVE-2021-44228", "type": "seen", "source": "https://gist.github.com/ppkarwasz/485782a1a04a1940c47f4fa017d1767f", "content": "= Apache Flume Threat Model\n:toc: macro\n:toclevels: 3\n:sectnums:\n\n[cols=\"1,4\"]\n|===\n| Project          | Apache Flume (https://github.com/apache/logging-flume)\n| Version/commit   | `main` @ `ff739dbe` (2024-10-10, \"Announce dormant status\"); latest release 1.11.0\n| Date             | 2026-05-13\n| Author           | Piotr P. Karwasz (Apache Logging Services PMC), draft v0\n| Status           | *DRAFT* \u2014 pre-PMC review\n| Reporting        | Findings under \u00a74.8 follow the ASF process at https://www.apache.org/security/. Findings under \u00a74.3 / \u00a74.9 / \u00a74.11a are closed citing this document.\n| Provenance       | `(documented)` = stated in project / ASF artifacts (cite source); `(maintainer)` = stated by a Flume PMC member in response to this process; `(inferred)` = reasoned from code or absence \u2014 every such tag has a matching open question in \u00a74.14.\n| Draft confidence | ~24 documented / 0 maintainer / ~35 inferred (counts approximate; update on each revision)\n|===\n\n[abstract]\n.About Flume\nApache Flume is a distributed service for collecting, aggregating, and moving large volumes of event-oriented data (originally log lines) from many *agents* to centralized stores such as HDFS, HBase, Hive, Kafka, Solr, or ElasticSearch. A Flume *agent* is a long-running JVM process configured with a topology of *sources* (which receive events), *channels* (which buffer them), and *sinks* (which forward them, often to another agent or to a terminal store). 
Agents are configured by a properties file and run as a Linux/Windows daemon; they are not embedded in user applications (with the exception of the small *embedded agent* and *log4j appender* clients).\n\nThis threat model describes the *implicit contract* between the project and downstream operators: the assumptions Flume makes about its environment and operators, the security properties it provides, the properties it explicitly does not provide, and the misuses that fall outside its intended use. It is not an audit; it does not enumerate bugs or recommend fixes.\n\ntoc::[]\n\n== Header and scope context\n\n=== Project status (dormant)\n\nThe Apache Logging Services PMC moved Flume to *dormant* status on 2024-10-10. The published policy is: bug reports and feature requests are unlikely to be addressed; *security reports are addressed*; new releases are unlikely. _(documented: README.md; logging-flume issue #423; lists.apache.org thread `dg9wro6dp7w95o1x911lbyqxzl808b3l`)_\n\nThis threat model is written *against the dormant posture*. That has three consequences for triage:\n\n. The *(documented)* \"we will address security reports\" commitment is what makes a threat model worth writing at all. It is the load-bearing claim under \u00a74.13's `VALID` row.\n. \"Fixed in a release\" is no longer a default disposition; many `VALID` outcomes will be addressed by an advisory plus guidance to migrate, rather than a CVE+fix release. The model should not promise what the project will not deliver.\n. Anything the model would have said about *future evolution* (`\u00a74.12 Conditions that would change this model`) collapses: the dominant condition is \"Flume returns to ACTIVE status or new contributors take on a component.\"\n\n[NOTE]\n====\n*Open meta question for the PMC (\u00a74.14, wave 1):* whether the published threat model should explicitly recommend migration away from Flume in \u00a74.10, or remain neutral and let the README banner carry that message. 
Both are coherent; the choice is editorial.\n====\n\n== Coexistence with existing security documentation\n\nFlume has no in-repo `SECURITY.md`, no in-repo threat-model document, and no security-scoped FAQ. _(documented: `find . -iname \"security*\" -o -iname \"threat*\"` returns nothing under the repo root.)_ The closest existing artifacts are:\n\n* The README warning and the linked Logging Services dormant-projects announcement _(documented)_.\n* The ASF-wide security policy at `https://www.apache.org/security/` _(documented)_, which is the reporting channel of record.\n* The Flume User Guide's per-component documentation, which states SSL/Kerberos options inline with each source/sink. The User Guide is a *capability* document, not a policy document \u2014 it says what _can_ be configured, not what the project _commits to_.\n* The CHANGELOG / JIRA history, which records past CVE responses (FLUME-3395 for CVE-2021-44228, FLUME-3356, FLUME-3426, etc.) _(documented)_.\n\nPer \u00a73.1a of the skill, the new threat model becomes the canonical security-policy artifact for the project and should be linked from the README banner alongside the dormant-status link. A short `SECURITY.md` pointing to this document and to `apache.org/security/` would also be appropriate _(open question \u00a74.14)_.\n\n== Component-family table (\u00a74.2)\n\nFlume is not one component \u2014 it is a *runtime* (the agent process) plus an *open-ended catalog* of bundled sources, channels, sinks, interceptors, serializers, and selectors. Triage decisions hinge on which family a finding lands in. The model carves the surface as follows:\n\n[cols=\"2,3,1,1,2\", options=\"header\"]\n|===\n| Family\n| Representative entry point\n| Touches OS / network?\n| In model?\n| Notes\n\n| *Agent runtime* (node, app, configuration provider)\n| `flume-ng-node:Application`\n| process boundary, filesystem, optional HTTP/ZK config fetch\n| *Yes*\n| The core daemon. 
Loads and reloads the topology configuration.\n\n| *Configuration providers*\n| `Properties`, `URI`-based, `HTTP(S)`, `ZooKeeper`\n| filesystem, HTTP(S), ZooKeeper\n| *Yes*\n| `HttpConfigurationSource` introduced in 1.10 (FLUME-3335) supports HTTP(S) with optional basic-auth.\n\n| *Configuration filters*\n| `flume-ng-configfilters/*`\n| env vars, external process, Hadoop credential store\n| *Yes*\n| Resolve indirected secrets at agent start. `external-process` shells out.\n\n| *Core network sources* (Avro, Thrift, HTTP, Syslog TCP/UDP, Netcat, MultiportSyslog)\n| `flume-ng-core/source/*`\n| network listener\n| *Yes*\n| Primary attacker-reachable surface. SSL/TLS support is per-source and optional.\n\n| *Core OS sources* (ExecSource, SpoolDirectorySource, StressSource, SequenceGenerator)\n| `flume-ng-core/source/*`\n| filesystem, child processes\n| *Yes \u2014 but operator-trusted inputs only*\n| `ExecSource` and `SpoolDirectorySource` deliberately consume operator-controlled inputs.\n\n| *Bundled sources (modular)*: JMS, Scribe, Taildir\n| `flume-ng-sources/*`\n| filesystem, JMS broker, network\n| *Yes*\n|\n\n| *Bundled channels*: Memory, File, Spillable-Memory, Kafka, JDBC\n| `flume-ng-core/channel/*`, `flume-ng-channels/*`\n| filesystem, Kafka cluster, JDBC\n| *Yes*\n| File channel optionally provides AES/CTR encryption-at-rest.\n\n| *Bundled sinks (core)*: HDFS, HBase(2), Hive, ElasticSearch, Kafka, Avro, Thrift, RollingFile, Null, Logger\n| `flume-ng-core/sink/*`, `flume-ng-sinks/*` and split modules\n| network egress (mostly to authenticated downstreams)\n| *Yes*\n| Sinks generally talk to trusted, authenticated downstreams configured by the operator.\n\n| *Bundled sinks (modular)*: HTTP, IRC, Morphline-Solr\n| `flume-ng-sinks/*`\n| network egress\n| *Yes*\n|\n\n| *Interceptors, selectors, serializers*\n| `flume-ng-core/interceptor/*`, `serialization/*`\n| in-process only\n| *Yes*\n| Operate on event bodies / headers in memory.\n\n| *RPC client SDK* 
(`flume-ng-sdk`)\n| `org.apache.flume.api.*RpcClient`\n| network egress\n| *Yes \u2014 caller side*\n| Library used by applications and by `AvroSink`/`ThriftSink` to talk to Flume agents.\n\n| *Embedded agent* (`flume-ng-embedded-agent`)\n| `EmbeddedAgent`\n| in-process; channel I/O\n| *Yes*\n| Small library that runs an agent inside a host application.\n\n| *log4j 1.x appender*\n| `flume-ng-clients` (`Log4jAppender`, `LoadBalancingLog4jAppender`)\n| network egress to Avro source\n| *Yes \u2014 caller side*\n| log4j 1.x is itself end-of-life; see \u00a74.5.\n\n| *Monitoring endpoints*: JMX, JSON HTTP, Prometheus HTTP, Ganglia\n| `instrumentation/*`\n| network listener (HTTP) / UDP (Ganglia)\n| *Yes*\n| Unauthenticated by design _(see \u00a74.9)_.\n\n| *`flume-tools`* (CLI utilities)\n| `flume-tools/*`\n| filesystem\n| *Yes*\n| Operator-only CLI; trusted-input surface.\n\n| *`flume-ng-tests`* (integration tests)\n| repo only, not shipped runtime\n| n/a\n| *No (out of scope \u00a74.3)*\n| Test scaffolding.\n\n| Third-party / `vendor` / `contrib` directories\n| n/a\n| n/a\n| *No \u2014 there are none*\n| Flume has no `contrib/` or vendored third-party source tree in the runtime distribution. _(inferred from `ls -la`)_\n|===\n\n== Out of scope (\u00a74.3)\n\nUse cases and threats Flume *does not* attempt to address. Findings in any of these categories are closed `OUT-OF-MODEL` per the row indicated.\n\n. *Flume OG / 0.9.x.* The 1.x line is a complete rewrite; nothing in this model applies to the pre-1.0 codebase. _(documented: README.md)_\n. *Flume as a public-internet-facing service.* Flume agents are intended to run inside a trusted network (e.g., one operator's data-plane) and to be addressable only by other agents, by trusted producers, or by operators. Exposing an Avro, Thrift, HTTP, or Syslog source directly to the public internet is out of model. _(inferred from documentation pattern and the absence of any default authentication on most sources)_\n. 
*Flume as a multi-tenant security boundary.* Within a single agent there is no isolation between sources/sinks/channels: any sink can in principle read events from any channel the configuration wires it to; any interceptor sees every event flowing through. The agent is *one* trust domain. _(inferred from architecture)_\n. *Operator-controlled inputs treated as adversarial.* `ExecSource` runs an operator-specified command. `SpoolDirectorySource` reads files from an operator-specified directory. The agent configuration file itself is operator-controlled. None of these are attacker-controlled in the threat model; a finding that requires an attacker to write to the spool directory, supply the exec command line, or alter `flume.conf` is `OUT-OF-MODEL: trusted-input`. _(inferred \u2014 wave-1 question for the PMC)_\n. *log4j 1.x.* The `Log4jAppender` lives in the repository for backward compatibility with applications that still emit log4j 1.x events; log4j 1.x itself reached EOL in 2015 _(documented: Apache Logging Services 2015-08 announcement)_. CVEs in log4j 1.x are out of scope for this model; CVEs in the Flume *appender code* are in scope.\n. *Supply-chain hygiene of Flume's runtime dependencies.* Flume 1.11.0 / 1.12.0 pin dated versions of Jetty, Hadoop, Kafka client, Jackson, Avro, Thrift, Netty, and many others. The dormant project does not commit to keeping these patched _(documented: dormant-status policy)_. A report that a *transitive* dependency has a CVE, *without* a demonstration that the CVE is reachable from Flume's documented inputs, is `OUT-OF-MODEL`. Reports that a CVE *is* reachable via a documented Flume input remain `VALID` under \u00a74.13.\n. *Build, release, and SDLC hygiene of the project itself* (action pinning, branch protection, reproducible builds, signing of release artifacts beyond ASF defaults). Out of scope per skill \u00a71.\n. 
`flume-ng-tests` *integration test scaffolding*, examples in user-guide snippets, sample configurations under `conf/*.template`. _(inferred)_\n\n== Trust boundaries and data flow (\u00a74.4)\n\n[cols=\"1,2,2\", options=\"header\"]\n|===\n| Boundary\n| Inside (trusted)\n| Outside (in-scope adversary)\n\n| Agent process boundary\n| All in-process code: every source, channel, sink, interceptor, serializer, the configuration provider, and any plugin JAR on the configured classpath\n| Other processes on the host; networks the agent is *not* configured to listen on\n\n| Configuration file\n| The `.properties` agent configuration *as written to disk by the operator*, including secrets resolved via config-filters at startup\n| Anyone who can modify the file is *already trusted* (and a finding requiring this is out of model per \u00a74.3)\n\n| Network source (Avro/Thrift/HTTP/Syslog/Netcat)\n| The framed bytes after deserialization by the source handler, once committed to the channel\n| The bytes-on-the-wire from clients, *whether or not* TLS is configured. If TLS is configured, after the TLS termination the bytes are still attacker-controlled at the application layer.\n\n| File-system source (SpoolDirectory, Taildir)\n| The file *contents*, the directory *path*, and the file *names* \u2014 all assumed operator-managed\n| n/a \u2014 these are operator-trusted inputs per \u00a74.3.\n\n| Channel boundary\n| Events committed to a channel transaction\n| n/a \u2014 internal\n\n| Sink \u2192 external system\n| The sink writes to a downstream that *the operator has authenticated against* (HDFS Kerberos, Hive principal, Kafka SSL/SASL, etc.). The downstream is trusted by the operator's choice, not by Flume.\n| Network attacker between agent and downstream (defended by the downstream's own transport, when the operator enables it).\n\n| Monitoring HTTP endpoint\n| The JSON metrics payload\n| Anyone who can reach the monitoring port. 
This boundary is *intentionally permissive* per \u00a74.9.\n\n| Inter-agent RPC (`AvroSink` / `ThriftSink` \u2192 `AvroSource` / `ThriftSource`)\n| Bytes after the source has deserialized them\n| The wire. Mutual TLS is optional and must be enabled at both ends.\n|===\n\n=== Reachability preconditions per family\n\nThe \u00a74.4 boundary table licenses these per-family tests. A triager applies the relevant one before assigning a \u00a74.13 disposition.\n\n* *Network sources (Avro, Thrift, HTTP, Syslog, Netcat, MultiportSyslog, Scribe).* In-model only if reachable from an unauthenticated remote peer's bytes via the configured handler's parsing path.\n* *JMS, Kafka, ZooKeeper.* In-model only if reachable from broker/peer-supplied bytes; broker administration is operator-trusted.\n* *Taildir, SpoolDirectory, Exec.* In-model only if reachable from the operator-trusted input \u2192 channel pipeline *without* an attacker controlling the file path, command, or contents.\n* *Configuration provider.* In-model only if reachable from the bytes the agent fetches from the configured source (HTTP, ZooKeeper, or file). For the HTTP provider, in-model only when an attacker can serve those bytes; for the file/classpath provider, never in-model (operator-trusted).\n* *Sinks.* In-model only if reachable from an event the channel can deliver (i.e., from one of the in-model source paths above).\n* *RPC SDK / Log4jAppender / embedded agent.* In-model only if reachable from attacker-controlled event data fed through the API by the embedding application *and* the failure mode affects the embedding process, not just the host application logic.\n* *Monitoring (HTTP/JMX/Prometheus/Ganglia).* In-model only as a *resource* / *availability* surface; information disclosure of metrics is explicitly disclaimed in \u00a74.9.\n\n== Environmental assumptions (\u00a74.5)\n\n* *Java*: Oracle/OpenJDK 8 build target _(documented: README.md, `pom.xml`)_. 
Newer JDKs may work at runtime but are not first-class. The wave-3 question is whether the PMC supports running on JDK 11/17/21 for security purposes _(open \u00a74.14)_.\n* *OS*: Linux is first-class; Windows is supported by the bundled `flume-env.ps1.template` and `bin/` scripts but is not the primary test target. _(inferred)_\n* *Threading*: Sources, channels, and sinks run in distinct threads; the channel transaction API is thread-safe by contract. _(inferred from API contract)_\n* *Filesystem*: File-channel and Taildir source assume POSIX semantics (rename-is-atomic, mtime-monotone). On filesystems that do not provide these (NFS, some FUSE filesystems), reliability claims weaken. _(inferred from prior FLUME bug history)_\n* *Clock*: Several interceptors (`TimestampInterceptor`, `HostInterceptor`) and sinks (HDFS bucket selection) depend on the system clock. A skewed clock degrades correctness but is not a security finding under the model. _(inferred)_\n* *Process privilege*: The agent is expected to run as an *unprivileged service account*. Sources that bind privileged ports (Syslog 514) require the operator to grant capabilities; Flume does not acquire them itself. 
_(inferred \u2014 wave-2 question)_\n\n=== What Flume does *not* do to its host\n\nA non-exhaustive negative-claim inventory _(all (inferred); wave-2 questions)_:\n\n* The agent does not install signal handlers beyond JVM defaults.\n* The agent does not `fork()` or `exec()` arbitrary child processes *except* via:\n  ** `ExecSource` (operator-supplied command, by design)\n  ** `ExternalProcessConfigFilter` (operator-supplied command, by design)\n  ** Any sink that delegates to a Hadoop/Hive/HBase client that itself shells out (out of Flume's control)\n* The agent does not modify process-wide JVM state beyond installing JMX MBeans and Jetty MBeans when monitoring is enabled.\n* The agent does not read environment variables *except* via `EnvVarResolverProperties` (when explicitly enabled) and `EnvironmentVariableConfigFilter` (when configured). _(open question whether the agent reads any env var unconditionally at startup)_\n* The agent does not write outside the directories named in its configuration (file-channel data/checkpoint dirs, HDFS local cache if any, log4j2 log directory). _(open question)_\n\n== Build-time and configuration variants (\u00a74.5a)\n\nFlume is shipped as a single binary distribution; there are no compile-time `-D` defines that change security behavior. 
Security-relevant variation is *runtime* configuration:\n\n[cols=\"2,2,1,2,3\", options=\"header\"]\n|===\n| Knob\n| Default\n| Less-secure default?\n| Maintainer stance (PMC ruling needed)\n| Section impact\n\n| Network sources `ssl = false`\n| `false` _(documented: User Guide)_\n| *Yes*\n| _(open \u00a74.14 wave 1)_ \u2014 likely \"operator must enable for production\"\n| \u00a74.10\n\n| HTTP/HTTPS configuration provider `auth = none`\n| `none`\n| *Yes*\n| _(open \u00a74.14 wave 1)_\n| \u00a74.10\n\n| HTTP metrics server (`type = http`)\n| no TLS, no auth available at all\n| *N/A \u2014 by design*\n| `OUT-OF-MODEL: by design` per \u00a74.9\n| \u00a74.9, \u00a74.11a\n\n| File channel `encryption = none`\n| no encryption-at-rest\n| *Possibly*\n| _(open \u00a74.14)_ \u2014 likely \"operator's choice; encryption is for media-loss threats, not in-host adversaries\"\n| \u00a74.9\n\n| ZooKeeper config provider ACLs\n| inherits from cluster\n| ?\n| _(open \u00a74.14)_\n| \u00a74.10\n\n| Kerberos for HDFS / HBase / Hive / Kafka sinks\n| disabled if not configured\n| *Yes* in most production deployments\n| _(open \u00a74.14)_ \u2014 likely \"operator's choice; Flume does not impose\"\n| \u00a74.10\n\n| `LaxHostnameVerifier` (in `flume-ng-node/net`)\n| not the default\n| n/a (the *non*-lax verifier is default)\n| _(inferred \u2014 needs confirmation)_ Is `LaxHostnameVerifier` ever wired in by default? If not, document it as opt-in.\n| \u00a74.9\n\n| `ExternalProcessConfigFilter`\n| not enabled\n| n/a\n| Operator-only by definition; in-model only that the *invocation mechanics* are safe given an operator-trusted command.\n| \u00a74.6\n\n| log4j2 logging configuration (Flume's own logs)\n| `conf/log4j2.xml`\n| n/a\n| Plugin lookups are restricted post-FLUME-3433 _(documented)_; lookup-related findings against Flume's own logs are `KNOWN-NON-FINDING` if the default config is in use.\n| \u00a74.11a\n|===\n\n== Inputs and trust (\u00a74.6)\n\n=== Inputs Flume accepts\n\n. 
*Network bytes* on configured listener ports (one source per port). Format varies by source: Avro IPC, Thrift, HTTP request body, syslog line, netcat newline-terminated, scribe.\n. *Filesystem inputs*: files in the Spool directory, files tailed by Taildir, the agent configuration file, file-channel data and checkpoint files, optional encryption keystores.\n. *Broker inputs*: JMS messages, Kafka records (for Kafka channel and Kafka source), ZooKeeper znodes (for ZK configuration provider).\n. *Process inputs*: stdout/stderr of the command run by `ExecSource`, stdout of the command run by `ExternalProcessConfigFilter`.\n. *In-process API inputs*: `EmbeddedAgent.put(...)` and the RPC SDK's `RpcClient.append(...)` from the embedding application.\n\n=== Per-input trust table\n\nThe table is grouped by component family; one row per attacker-relevant entry point. *Caller* means \"the entity that supplies these bytes.\" A `yes` in the third column means the model treats those bytes as attacker-controlled; a `no` means they are operator/peer-trusted and a finding requiring control of them is out of model.\n\n[cols=\"2,2,1,3,3\", options=\"header\"]\n|===\n| Entry point\n| Parameter\n| Attacker-controllable?\n| Caller must enforce\n| Notes\n\n| `AvroSource` (network)\n| Avro IPC frame body, headers\n| *Yes*\n| TLS termination at source (optional); rate limiting at LB; downstream allow-list\n| Generic Java deserialization is *not* used by AvroSource (Avro has its own schema-driven wire format), but caller-supplied schema lookups should be reviewed.\n\n| `AvroSource`\n| TLS client cert (when mTLS configured)\n| Yes (the cert presented)\n| Trust store curation\n| Without mTLS, no peer authentication.\n\n| `ThriftSource` (network)\n| Thrift TBinaryProtocol message\n| *Yes*\n| Same as Avro\n|\n\n| `HTTPSource` (network)\n| HTTP request body, headers, method, path\n| *Yes*\n| Handler-specific input bounds; Jetty version pinning\n| Default handler is JSON; pluggable.\n\n| 
`HTTPSource`\n| `X-Forwarded-*` / IP-in-header\n| *Yes*\n| Do not trust headers for source-IP attribution without an upstream that strips them\n| Common false-friend; cross-reference \u00a74.9.\n\n| `SyslogTcpSource` / `MultiportSyslogTCPSource`\n| syslog line bytes\n| *Yes*\n| Bound `eventSize`; deprecated single-port variant should not be used (`MultiportSyslogTCPSource` preferred per `@Deprecated` on the class)\n|\n\n| `SyslogUDPSource`\n| UDP packet\n| *Yes*\n| Same as TCP; spoofable source IP is inherent\n|\n\n| `NetcatSource` / `NetcatUdpSource`\n| line bytes\n| *Yes*\n| Documented as a *testing* source; not for production. _(open \u00a74.14 \u2014 confirm)_\n|\n\n| `ExecSource`\n| stdout of the command\n| *No (operator-trusted)*\n| The command line itself is trusted; the command's *output* is trusted via that command\n| If the command shells out to a tool that reads attacker-controlled files, the *operator* has extended trust deliberately.\n\n| `SpoolDirectorySource`\n| File contents in the spool dir, file names\n| *No (operator-trusted)*\n| Path sanitization is operator's job\n|\n\n| `TaildirSource`\n| File contents being tailed, file names\n| *No (operator-trusted)*\n| Same as Spool\n|\n\n| `JMSSource`\n| JMS message body, properties\n| *Peer-authenticated*\n| Broker ACLs and broker TLS are the operator's domain\n| `providerUrl` was validated in FLUME-3437 _(documented: CHANGELOG)_ \u2014 that fix is in-scope and a regression would be `VALID`.\n\n| `ScribeSource`\n| Thrift Scribe message\n| *Yes*\n| Same as Thrift; protocol largely historical\n|\n\n| `KafkaSource` / `KafkaChannel`\n| Kafka record bytes\n| *Peer-authenticated* (if SASL/SSL configured)\n| Kafka cluster ACLs are operator's domain\n|\n\n| HTTP `ConfigurationSource` (FLUME-3335)\n| HTTP response body containing properties\n| *Yes if the URL is reachable to an attacker; otherwise no*\n| TLS, host-pinning, and *authentication* on the config endpoint\n| The most sensitive deserialization surface 
in the runtime \u2014 a poisoned config can pivot to anything Flume can do. Default auth is `none`.\n\n| ZooKeeper `ConfigurationSource`\n| znode contents\n| *Cluster-authenticated*\n| ZK ACLs\n|\n\n| `EnvironmentVariableConfigFilter`\n| env var name as configured\n| *No (operator-trusted)*\n| Env vars are trusted because the operator chose which to read\n|\n\n| `ExternalProcessConfigFilter`\n| stdout of `command + key`\n| *No (operator-trusted)*\n| The command itself is trusted; shell metacharacter handling in the *key* is the question \u2014 see \u00a74.11\n| `Runtime.exec(commandParts)` with the key as a separate argv element; not shell-interpreted, but the key value originates from the agent configuration (operator-trusted by \u00a74.3).\n\n| `RpcClient.append(...)` (`flume-ng-sdk`)\n| event body, headers\n| *Caller-controlled*\n| The *caller application* is responsible for not feeding adversary-controlled headers that downstream sinks would interpret unsafely (e.g., HDFS path headers \u2014 see \u00a74.11)\n|\n\n| `Log4jAppender` (`flume-ng-clients`)\n| log4j 1.x `LoggingEvent` fields\n| *Caller-controlled*\n| log4j 1.x EOL caveat applies\n|\n\n| HTTP metrics endpoint `GET /`\n| URL, method\n| *Yes (whoever can reach the port)*\n| Operator must firewall the port\n| Returns metrics; no auth, no TLS option. By design \u2014 \u00a74.9.\n|===\n\n=== Size and rate assumptions\n\n* Each source has its own batch size and per-event size cap (`maxLineLength`, `eventSize`, etc.) configurable per source. The defaults are *generous* and intended for legitimate logging volumes; the model does not promise survival under an adversary that sustains the maximum rate. _(inferred \u2014 see \u00a74.9 resource property)_\n* Channels have a `capacity` and `transactionCapacity`. 
Exceeding them produces a `ChannelException`, not a crash \u2014 that *is* a model property (\u00a74.8).\n* The bounded-vs-unbounded line on memory/CPU under adversarial input is a *wave-1 question for the PMC* \u2014 see \u00a74.9 and \u00a74.14.\n\n== Adversary model (\u00a74.7)\n\nFlume is a network service; the role split per skill \u00a74.2 applies.\n\n[cols=\"2,3\", options=\"header\"]\n|===\n| Actor\n| Capability and assumed intent (in scope unless noted)\n\n| *Operator / deployer*\n| Trusted. Writes the agent configuration, chooses which sources/sinks to enable, manages keystores and downstream credentials. *Out of scope as an adversary* \u2014 an operator who is hostile to their own deployment has already won.\n\n| *Upstream event producer* (the application emitting events into an Avro/Thrift/HTTP/Syslog source)\n| *Untrusted by default* if the source is exposed to anything beyond a curated set of producer hosts. May send malformed framing, oversized events, adversarial event bodies, and injected headers. May spoof source-IP for UDP sources. Cannot read events sent by other producers (Flume does not echo).\n\n| *Peer Flume agent* (the AvroSink/ThriftSink at the other end of an RPC link)\n| *Authenticated-but-untrusted* in tiered deployments with mTLS. Without mTLS, no more trusted than any other upstream producer. 
Note: Flume is *not* a Byzantine-consensus system \u2014 there is no honest-fraction threshold; each agent independently makes routing decisions.\n\n| *Downstream broker / store* (HDFS NameNode, Kafka broker, Hive metastore, etc.)\n| *Trusted via operator-configured credentials.* A compromised downstream can in principle return malicious responses that the sink's client library handles; that is the downstream's CVE, not Flume's, unless Flume's adapter mishandles a response (in which case it is in scope).\n\n| *Network attacker between agent and downstream*\n| *In scope only as far as TLS / SASL / Kerberos on the relevant link addresses them.* Flume does not provide network-attacker defenses *itself*; it relies on configuration of the underlying library.\n\n| *Co-tenant on the host*\n| *Out of scope.* A process on the same host as the Flume agent already has equivalent or greater capability than reaching the Flume API surface.\n\n| *Attacker with write access to* `flume.conf`\n| *Out of scope* per \u00a74.3. Equivalent to compromise of the operator.\n\n| *Attacker with read access to* `flume.conf`\n| *Out of scope* \u2014 file-system ACLs are the operator's domain, and the configuration contains secrets the operator chose to put there.\n\n| *Side-channel adversary* (timing, memory, cache)\n| *Out of scope.* Flume does not provide any constant-time or otherwise side-channel-resistant primitive. _(see \u00a74.9)_\n\n| *Internet-scale adversary scanning for exposed agent ports*\n| *Out of scope as a Flume responsibility.* The dormant-status policy explicitly does not commit to a hardening posture sufficient to make exposed agents safe. 
The model documents the assumption (\u00a74.3) that agents are not internet-facing.\n|===\n\n== Security properties Flume provides (\u00a74.8)\n\nEach property states: *(P)* the property, *(C)* the condition(s) under which it holds, *(S)* the violation symptom, *(T)* the severity tier (`security-critical` warrants a CVE / coordinated disclosure; `correctness-only` is an ordinary bug), and *(prov)* the provenance tag.\n\n[cols=\"3,2,2,1,1\", options=\"header\"]\n|===\n| Property\n| Condition\n| Violation symptom\n| Tier\n| Prov\n\n| Memory safety of the agent JVM on malformed network input\n| All in-model sources; default JDK; latest Flume release\n| JVM crash (SIGSEGV); OOM via uncontrolled allocation directly traceable to a single small input\n| security-critical\n| _(inferred \u2014 wave 1)_\n\n| At-least-once delivery for events committed to a transactional channel (File channel; Kafka channel)\n| Configured durable channel; sink that completes a `Transaction.commit()`; no operator-induced data loss (disk wipe, ZK loss)\n| Event silently dropped after `commit()`\n| correctness-only (operational); occasionally security-critical if used to defeat audit logging\n| _(documented: User Guide; (inferred) for the security-critical sub-case)_\n\n| No remote code execution from untrusted source bytes\n| All in-model network sources, with their *documented* handlers (i.e., not a user-supplied handler that itself deserializes Java objects)\n| Arbitrary code execution in the agent\n| security-critical\n| _(inferred \u2014 wave 1)_\n\n| Bounded per-event memory given documented per-source caps\n| Operator has configured the cap; source obeys it\n| Single-event memory consumption exceeds the configured cap by more than a constant factor\n| security-critical (DoS)\n| _(inferred \u2014 wave 1 and threshold question)_\n\n| `Application` reloads configuration without restarting in-flight transactions inconsistently\n| Polling configuration providers; the documented reload 
semantics\n| Mid-flight events lost or duplicated *beyond at-least-once tolerance*\n| correctness-only\n| _(inferred \u2014 needs maintainer confirmation that this is even a claimed property)_\n\n| TLS for in-model network sources when `ssl = true`\n| Operator has configured a valid keystore; uses a non-deprecated TLS version (Flume exposes protocol and cipher allow-lists per FLUME-3275/3276)\n| Connection accepted on plaintext when ssl=true; weak protocol negotiation despite allow-list\n| security-critical\n| _(documented: CHANGELOG FLUME-3275/3276; (inferred) for the negative claim)_\n\n| Kerberos authentication on HDFS / HBase / Hive sinks when configured\n| `flume-ng-auth` module wired; valid keytab; UGI lifecycle preserved\n| Sink writes as the wrong principal, or anonymously\n| security-critical\n| _(documented: User Guide; (inferred) for failure semantics)_\n\n| Hostname verification on outbound TLS (sinks, HTTP config provider)\n| `LaxHostnameVerifier` not explicitly selected; default JSSE behavior\n| Sink accepts a server cert for the wrong host\n| security-critical\n| _(documented: FLUME-3315; FLUME-3437 for providerUrl validation)_\n\n| File-channel encryption-at-rest provides confidentiality of channel data against an attacker who later reads the media\n| `encryption.activeKey` set; keystore protected; AES/CTR/NoPadding provider\n| Channel data recoverable from raw files\n| security-critical (in deployments that opted in)\n| _(documented: encryption package)_\n\n| The agent does *not* deserialize untrusted Java `ObjectInputStream` data from any in-model source\n| All in-model sources with their documented handlers\n| Demonstration that any in-model code path reaches `ObjectInputStream.readObject()` on attacker-controlled bytes\n| security-critical\n| _(inferred \u2014 wave 1; if the PMC ratifies, this is the headline \u00a74.8 property)_\n|===\n\n[NOTE]\n.Threshold for resource properties\n====\nThe skill explicitly requires the PMC to draw a 
categorical line on resource consumption. Until the PMC rules in \u00a74.14, this model takes the position: *\"super-linear CPU or memory growth in a single event's size beyond the documented cap is a bug; sustained high request rates from many connections are not.\"* That is the *proposed answer* for the maintainer to confirm, correct, or replace.\n====\n\n== Security properties Flume does *not* provide (\u00a74.9)\n\n. *No authentication on the JSON/Prometheus HTTP metrics endpoint.* The endpoint is by design unauthenticated and unencrypted. _(documented: `HTTPMetricsServer` source has no auth or TLS code paths.)_ Operators must firewall the port.\n. *No authorization model for sources.* Any client that can connect to a source's port (subject to optional TLS / mTLS) can submit events. Flume does not provide a per-tenant identity, a per-source ACL, or rate limiting.\n. *No data-content authentication.* Event bodies and headers are not authenticated end-to-end. A producer that has compromised an intermediate agent can inject events; Flume does not provide a MAC over events.\n. *No defense against decompression bombs or other amplification within event bodies* if a sink/serializer decompresses (e.g., Snappy, Gzip via Hadoop libraries in the HDFS sink path). _(inferred \u2014 wave 2)_\n. *No constant-time primitives.* Flume does not use cryptographic comparisons in any documented authentication path; even if it did, side-channel adversaries would still be out of model per \u00a74.7.\n. 
*No defense against an attacker who controls headers used by sinks to choose output paths.* HDFS `path = /data/%{client}/log` will route based on attacker-supplied `client` header; that is a \u00a74.11 misuse, not a Flume property.\n\n=== False-friend properties\n\nThese features are commonly mistaken for security properties they do not satisfy:\n\n* *\"Reliable delivery\"* \u2014 this is a *reliability* property (no data loss on agent restart or channel failure), not an *integrity* property (no data tampering on the wire). Reliable delivery does not authenticate the sender.\n* *The TLS option on a source* \u2014 this protects the bytes-on-the-wire between *the configured peer and the source*. It does not authenticate the *producer application* upstream of that peer, and unless mTLS is enabled with a curated trust store, it does not authenticate the peer at all.\n* *Kerberos on HDFS/Hive/HBase sinks* \u2014 authenticates the *agent* to the downstream. It says nothing about the event producer's identity.\n* *File-channel encryption* \u2014 protects against *recovery from disk media* later. It does not protect against any in-host adversary, because the key is in memory and the data passes through plaintext channels (interceptors, serializers) before being written.\n* *The agent configuration file* \u2014 is *not* a sandbox. Loading a plugin from the classpath is full code-execution-as-the-agent. There is no signing or verification of plugin JARs.\n\n=== Well-known attack classes Flume does not defend against on its own\n\n. *Log injection / forging downstream log lines.* If event bodies are written to a log-aggregation store that interprets newlines or markers (e.g., a tail-following tool), an attacker who controls event content can inject lines. Operators using Flume as part of an audit pipeline must apply integrity controls at the audit layer.\n. 
*Injection into headers that sinks treat as routing data.* As above; the canonical case is HDFS path placeholders. (\u00a74.11)\n. *Replay attacks.* Re-sending captured Avro/Thrift frames will deliver duplicate events. The downstream is responsible for idempotency or de-duplication.\n. *Source-IP spoofing for UDP-based sources* (`SyslogUDPSource`, `NetcatUdpSource`). Inherent to UDP; cannot be addressed at this layer.\n. *Resource exhaustion via many slow connections* against TCP sources. Mitigations belong at the network / LB layer.\n. *Adversary-influenced timing of Kerberos ticket renewal or TLS handshake* leading to outage. Out of model per \u00a74.7 (side-channel) and \u00a74.9 (no liveness guarantees under load).\n\n== Downstream responsibilities (\u00a74.10)\n\nA short, action-oriented list of what an *operator* of a Flume agent must do for \u00a74.5\u2013\u00a74.7 to hold. This is a *contract*, not a how-to.\n\n. *Do not expose any source's listening port to the public internet, or to any network that includes untrusted producers, unless mTLS is configured and the trust store is curated.*\n. *Enable TLS (`ssl = true`) on every network source that listens outside `localhost`.*\n. *Firewall the monitoring port.* It returns operational metadata with no authentication.\n. *If using the HTTP configuration provider, serve the configuration over HTTPS and set basic-auth credentials.* The configuration the agent fetches is fully privileged.\n. *Run the agent as an unprivileged service account.* Do not run as root, even when binding to syslog port 514; use OS capabilities or a port-forwarder.\n. *Treat `flume.conf` as a secrets-bearing file* (mode 0600, owned by the service account). Use config-filters (`environment-variable`, `external-process`, or Hadoop credential store) to keep raw secrets out of the file.\n. 
*Validate operator-supplied inputs at the operator boundary.* `ExecSource` commands, Spool directory contents, and Taildir glob patterns are operator-trusted by the model \u2014 if the operator's processes accept those inputs from elsewhere, the operator must enforce sanitization upstream.\n. *Pin and review dependency versions when building from source.* The dormant project does not commit to keeping transitive dependencies patched.\n. *Do not use Flume for audit logging without an independent integrity control on the audit pipeline* (downstream signing, append-only storage with hash chaining).\n. *Do not place the agent inside an application that treats events as security-sensitive without independent authentication.*\n. *Plan migration.* Per the dormant-status policy, new releases are unlikely. Operators should treat the current Flume version as the last version they will receive, and plan replacement on a timescale that suits the threats their deployment is exposed to.\n\n== Known misuse patterns (\u00a74.11)\n\n. *Passing attacker-influenced event headers into HDFS / file-roll `path` templates.* Headers like `%{client}` or `%{tenant}` are expanded by sinks; if the attacker controls them, they choose the output path. *Mitigation:* whitelist allowed header values via an interceptor; use only operator-set headers in path templates.\n. *Using `NetcatSource` for production traffic.* Documented as for testing; no protocol framing, no flow control. *Mitigation:* use Avro or Syslog instead.\n. *Using `ExecSource` with `tail -F` on user-writable files.* The source then trusts the contents of an operator-claimed-trusted file that is actually attacker-writable. *Mitigation:* `Taildir` plus filesystem ACLs.\n. *Wiring `MemoryChannel` between sources and sinks that the operator considers \"reliable.\"* Memory channel loses events on JVM crash by design. *Mitigation:* File channel for any reliability requirement.\n. 
*Running multiple agents on the same host with overlapping channel directories.* File channel is single-writer; concurrent access corrupts state.\n. *Putting cleartext credentials in `flume.conf`.* The file is operator-trusted but routinely shipped via configuration management tools that may not protect it as a secret.\n. *Treating the gzip-magic file-extension on rolled HDFS files as an integrity guarantee.* The HDFS sink can write gzipped output; the compression footer is *not* a MAC.\n. *Enabling the HTTP configuration provider without TLS or auth.* The provider will then reload whatever properties an attacker can serve to it.\n. *Replaying a captured Avro frame.* If the downstream is not idempotent and Flume is in the audit path, the attacker has now duplicated an event into the audit record.\n. *Loading a plugin JAR from a writable directory.* The agent classpath is operator-trusted; if the operator allows write access, anyone with that access has code-execution-as-the-agent.\n\n== Known non-findings \u2014 recurring false positives (\u00a74.11a)\n\nThis section is the *highest-leverage* artifact for triage. Entries are findings that tools or researchers raise repeatedly that are *not* bugs under the model.\n\n. *\"`HTTPMetricsServer` has no authentication\"* \u2192 \u00a74.9 / \u00a74.10. By design; operator-firewalled. Disposition: `BY-DESIGN: property-disclaimed`.\n. *\"`ExecSource` allows arbitrary command execution\"* \u2192 \u00a74.6 / \u00a74.3. The command is operator-supplied; reaching it requires write access to `flume.conf`. Disposition: `OUT-OF-MODEL: trusted-input`.\n. *\"`ExternalProcessConfigFilter` enables command injection\"* \u2192 \u00a74.6. Same as above; the command is operator-supplied and the key is operator-supplied (via the configuration). The implementation uses `Runtime.exec(commandParts)` with the key as a separate `argv` element, not shell-interpreted. Disposition: `OUT-OF-MODEL: trusted-input`.\n. 
*\"Default `ssl = false` on `AvroSource` permits MITM\"* \u2192 \u00a74.10. Operator must enable. Disposition: `OUT-OF-MODEL: non-default-build` (with the \u00a74.5a maintainer ruling pending).\n. *\"`MemoryChannel` loses data on crash\"* \u2192 \u00a74.9. By design; channel choice is the operator's. Disposition: `BY-DESIGN: property-disclaimed`.\n. *\"Source-IP header is trusted\"* \u2192 \u00a74.6. Documented false-friend; producer-supplied headers are attacker-controllable. Disposition: `OUT-OF-MODEL: trusted-input` (the field is in the trust-untrusted list).\n. *\"CVE in transitive `` reachable in theory\"* \u2192 \u00a74.3. In-model only with a demonstration of reachability via a documented Flume input. A static-analysis hit on a dependency, on its own, is `OUT-OF-MODEL: unsupported-component` for the dependency.\n. *\"Log4j 1.x appender vulnerable to CVE-X in log4j 1\"* \u2192 \u00a74.3. log4j 1.x is EOL; only bugs in Flume's appender code are in scope.\n. *\"`LaxHostnameVerifier` exists in the codebase\"* \u2192 \u00a74.5a. It is *not* the default; only deployments that opt in are affected, and those report under `OUT-OF-MODEL: non-default-build`.\n. *\"`SyslogUDPSource` accepts spoofed source IPs\"* \u2192 \u00a74.9. Inherent to UDP. Disposition: `BY-DESIGN: property-disclaimed`.\n\n(Expand on each iteration with whatever the next round of scanners or researchers reports.)\n\n== Conditions that would change this model (\u00a74.12)\n\n. *Flume returns to ACTIVE status in Logging Services.* Many of the \"we will not patch dependencies\" assumptions soften.\n. *A new source, sink, or channel is added in core.* New row in the \u00a74.2 table; new rows in the \u00a74.6 trust table.\n. *The default for any \u00a74.5a knob flips* (e.g., `ssl` becomes default-on).\n. *A configuration mechanism gains a remote, network-fetched mode without authentication by default* (already the case for HTTP config provider \u2014 handled).\n. 
*A vulnerability report cannot be cleanly assigned to one of the \u00a74.13 dispositions.* That is evidence of a `MODEL-GAP` and the model must be revised before the report is closed.\n. *A bundled third-party dependency adds a new wire format that Flume now serves* (e.g., a new sink accepts inbound traffic of a new shape).\n\n== Triage dispositions (\u00a74.13)\n\n[cols=\"2,3,2\", options=\"header\"]\n|===\n| Disposition\n| Meaning for Flume\n| Licensed by\n\n| `VALID`\n| Violates a \u00a74.8 property via an in-scope adversary and input. Reported per `apache.org/security/`. Outcome may be: an advisory only (no fix release), a fix on `main`, or in exceptional cases a new release if the PMC decides.\n| \u00a74.8, \u00a74.6, \u00a74.7\n\n| `VALID-HARDENING`\n| No \u00a74.8 property is violated, but a \u00a74.11 misuse is easy enough to warrant a fix at maintainer discretion. Reported privately. Typically no CVE.\n| \u00a74.11\n\n| `OUT-OF-MODEL: trusted-input`\n| Requires attacker control of a parameter the \u00a74.6 table marks trusted. Closed with a citation to the row.\n| \u00a74.6, \u00a74.3\n\n| `OUT-OF-MODEL: adversary-not-in-scope`\n| Requires a capability \u00a74.7 excludes (co-tenant, write-access to `flume.conf`, side-channel).\n| \u00a74.7\n\n| `OUT-OF-MODEL: unsupported-component`\n| Lands in `flume-ng-tests`, in a `*.template` config sample, in log4j 1.x itself, or in a transitive dependency without a demonstrated reachable path.\n| \u00a74.3\n\n| `OUT-OF-MODEL: non-default-build`\n| Only manifests with a knob the PMC has designated dev-only or operator-must-flip (per \u00a74.5a).\n| \u00a74.5a\n\n| `BY-DESIGN: property-disclaimed`\n| Concerns a property \u00a74.9 explicitly does not provide (unauthenticated metrics, no constant-time, no MAC, etc.).\n| \u00a74.9\n\n| `KNOWN-NON-FINDING`\n| Matches a documented entry in \u00a74.11a.\n| \u00a74.11a\n\n| `MODEL-GAP`\n| Cannot be routed to any of the above. 
Triggers a \u00a74.12 revision before the report is closed.\n| \u00a74.12\n|===\n\n== Open questions for the PMC (\u00a74.14)\n\nGrouped into waves of 3\u20137 per skill \u00a73.2. Each question states a *proposed answer* for the PMC to confirm, correct, or strike.\n\n=== Wave 1 \u2014 scope, dormant-posture, security envelope\n\n. *Proposed:* \"An exposed Avro/Thrift/HTTP/Syslog source on the public internet is *out of model* \u2014 Flume was designed for trusted-network deployments.\" \u2192 \u00a74.3, \u00a74.10. *Confirm?*\n. *Proposed:* \"An attacker with write access to `flume.conf` is *out of model* \u2014 equivalent to operator compromise.\" \u2192 \u00a74.3, \u00a74.7. *Confirm?*\n. *Proposed:* \"The headline \u00a74.8 property is: *No documented Flume source on its default handler invokes Java `ObjectInputStream.readObject()` on attacker-controlled bytes.* A counterexample is `VALID` and security-critical.\" \u2192 \u00a74.8. *Confirm; if so, is there any source/handler combination this does **not** cover?*\n. *Proposed:* \"Resource-consumption threshold: *super-linear CPU or memory growth in a single event beyond the documented per-event cap is a bug; sustained high connection rates are not.*\" \u2192 \u00a74.8 resource row, \u00a74.9. *Confirm threshold or supply a different one.*\n. *Proposed:* \"Defaults that are less-secure (`ssl = false`, HTTP config provider `auth = none`, no Kerberos on HDFS sink) are *operator-must-enable*, not *supported production posture* \u2014 therefore findings against pure defaults are `OUT-OF-MODEL: non-default-build`.\" \u2192 \u00a74.5a, \u00a74.10, \u00a74.13. *Confirm? If the PMC disagrees on any of these, they become `VALID`.*\n. *Proposed:* \"The dormant-status policy stands: security reports are triaged and answered, but the typical response is an advisory and migration guidance, not a fix release. The threat model documents this honestly rather than implying a fix pipeline.\" \u2192 \u00a74.1. 
*Confirm?*\n\n=== Wave 2 \u2014 environment, negative claims about host effects\n\n. *Proposed:* \"Flume does not install signal handlers beyond JVM defaults; does not `fork()`/`exec()` except via `ExecSource` / `ExternalProcessConfigFilter` / downstream client libraries; does not read env vars except via the documented filters.\" \u2192 \u00a74.5. *Are any of these wrong? Any unconditional env-var read at startup we should add to the list?*\n. *Proposed:* \"Flume targets Java 8 for build and execution; running on JDK 11/17/21 is best-effort and security claims weaken if a JDK-specific issue is involved.\" \u2192 \u00a74.5. *Confirm or extend.*\n. *Proposed:* \"Flume on Windows is supported but not first-class for security purposes; the wave-1 threat model applies to Linux deployments.\" \u2192 \u00a74.5. *Confirm?*\n. *Proposed:* \"`NetcatSource` is *testing-only*; a finding that requires a NetcatSource in production is `OUT-OF-MODEL: unsupported-component` or `VALID-HARDENING` at best.\" \u2192 \u00a74.11. *Confirm \u2014 or is Netcat considered a real production option somewhere?*\n. *Proposed:* \"POSIX rename/atomicity is assumed by File channel and Taildir; non-POSIX filesystems are best-effort.\" \u2192 \u00a74.5. *Confirm?*\n\n=== Wave 3 \u2014 sinks, deserialization, false friends\n\n. *Proposed:* \"Sinks delegate authentication and transport security to their downstream's client library (HDFS/Kerberos, Kafka/SASL, Hive/Kerberos, HBase/Kerberos). A Flume-specific failure mode is in scope only when Flume's adapter code mishandles a response or credential lifecycle (e.g., FLUME-3049 type).\" \u2192 \u00a74.4. *Confirm?*\n. *Proposed:* \"Default JSON handler on `HTTPSource` does not invoke Java native deserialization; pluggable handlers may, and a user-supplied handler is the user's problem.\" \u2192 \u00a74.8. *Confirm? Is the JSON handler the only one in core that's explicitly safe?*\n. 
*Proposed:* \"Compression handling in HDFS sink (Snappy/Gzip via Hadoop libs) does not include decompression-bomb defenses at the Flume layer; that is `BY-DESIGN: property-disclaimed`.\" \u2192 \u00a74.9. *Confirm or reject?*\n. *Proposed:* \"`LaxHostnameVerifier` is opt-in only and never wired by default; any code path that selects it implicitly is itself a bug.\" \u2192 \u00a74.5a. *Verify against current code.*\n. *Proposed:* \"Header injection into HDFS path templates is a \u00a74.11 misuse, not a \u00a74.8 violation \u2014 the responsibility is on the operator to use only operator-set headers in path templates.\" \u2192 \u00a74.11. *Confirm; should we add an interceptor recommendation or stay descriptive?*\n\n=== Wave 4 \u2014 meta (publication and coexistence)\n\n. *Where does this document live?* Proposed: `docs/threat-model.adoc` in the repo, linked from the README banner above the dormant notice. *Confirm or pick alternative.*\n. *Does a thin `SECURITY.md` get added that points to (a) this file and (b) `apache.org/security/`?* Proposed: *yes.* *Confirm.*\n. *Versioning binding* \u2014 proposed: \"the model is tagged against `main` at the date in the header; a report against an older release is triaged using the model in effect at that release, but practically the dormant-state model is what will be used.\" *Confirm or refine.*\n. *Sign-off process* \u2014 proposed: a LAZY-CONSENSUS vote on `dev@logging.apache.org` once the wave-1/2/3 answers are folded in. *Confirm.*\n\n== Machine-readable companion (\u00a74.15)\n\nA `threat-model.yaml` sidecar will be derived from this document once the PMC ratifies wave 1, so triage tooling (and AI-assisted triage) can consume the \u00a74.2 component-family table, the \u00a74.6 trust table, the \u00a74.8 properties, the \u00a74.9 disclaimers, and the \u00a74.11a known-non-findings without prose parsing. Deferred until the model is stable. 
_(open \u00a74.14)_\n\n== Self-check status\n\n* [x] Every section is substantive or marked N/A with reason. *(N/A used where Flume has no `contrib/`.)*\n* [x] No bullet would be at home in a code review.\n* [x] No bullet restates README prose.\n* [x] Every non-trivial claim carries a `(documented)` / `(maintainer)` / `(inferred)` tag (or is in a clearly-marked \"Proposed\" question).\n* [x] Header reports a draft-confidence count.\n* [x] Every `(inferred)` tag has a matching open question in \u00a74.14.\n* [x] Component families are modeled separately (\u00a74.2 table).\n* [x] Build / configuration variants are enumerated (\u00a74.5a).\n* [x] \u00a74.9 and \u00a74.10 are at least as substantive as \u00a74.8.\n* [x] \u00a74.9 names false-friend properties and well-known attack classes.\n* [x] \u00a74.6 is a table, not prose.\n* [x] Every \u00a74.8 property carries violation symptom and severity tier; the resource property states a threshold (proposed, pending PMC ruling).\n* [x] \u00a74.11a is populated.\n* [x] \u00a74.13 enumerates dispositions with section citations.\n* [ ] *Pending:* a reader who has never seen Flume can answer \"what threats has Flume taken responsibility for, and which have been left to me?\" \u2014 **gated on PMC wave-1 answers.**\n* [ ] *Pending:* a triager can route any finding to exactly one \u00a74.13 row citing a section, without consulting the PMC \u2014 **gated on PMC wave-1 answers.**\n* [x] The document fits in one sitting (eight pages or so). Acceptable.", "creation_timestamp": "2026-05-13T14:44:21.000000Z"}