8 vulnerabilities found for Apache OpenNLP by Apache Software Foundation

CVE-2026-42440 (GCVE-0-2026-42440)

Vulnerability from nvd – Published: 2026-05-04 16:40 – Updated: 2026-05-05 16:03
Title
Apache OpenNLP: OOM DoS via Unbounded Array Allocation in AbstractModelReader
Summary
OOM Denial of Service via Unbounded Array Allocation in Apache OpenNLP AbstractModelReader

Versions Affected: before 2.5.9, before 3.0.0-M3

Description:

The AbstractModelReader methods getOutcomes(), getOutcomePatterns(), and getPredicates() each read a 32-bit signed integer count field from a binary model stream and pass that value directly to an array allocation (new String[numOutcomes], new int[numOCTypes][], new String[NUM_PREDS]) without validating that the value is non-negative or within a reasonable bound. The count is therefore fully attacker-controlled when the model file originates from an untrusted source.

A crafted .bin model file in which any of these count fields is set to Integer.MAX_VALUE (or any value large enough to exhaust the available heap) triggers an OutOfMemoryError at the array allocation itself, before the corresponding label or pattern data is consumed from the stream. The error occurs very early in deserialization: for a GIS model, getOutcomes() is reached after only the model-type string, the correction constant, and the correction parameter have been read; so the attacker pays no meaningful size cost to weaponize a payload, and a single small file can crash a JVM that loads it. Any code path that deserializes a .bin model is affected, including direct use of GenericModelReader and any higher-level component that delegates to it during model load.

The practical impact is denial of service against processes that load model files from untrusted or semi-trusted origins.

Mitigation:

  • 2.x users should upgrade to 2.5.9.
  • 3.x users should upgrade to 3.0.0-M3.

Note: The fix introduces an upper bound on each of the three count fields, checked before array allocation; counts that are negative or exceed the bound cause an IllegalArgumentException to be thrown and the read to fail fast with no large allocation. The default bound is 10,000,000, which is well above the entry counts of legitimate OpenNLP models but far below any value that would threaten heap exhaustion. Deployments that legitimately need to load models with more entries than the default can raise the limit at JVM startup by setting the OPENNLP_MAX_ENTRIES system property to the desired positive integer (e.g. -DOPENNLP_MAX_ENTRIES=50000000); invalid or non-positive values fall back to the default.

Users who cannot upgrade immediately should treat all .bin model files as untrusted input unless their provenance is verified, and should avoid loading models supplied by end users or fetched from third-party repositories without integrity checks.
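The bounds check described in the Note can be illustrated with a short sketch. This is not the actual OpenNLP patch: the class name BoundedCounts and its method names are hypothetical, and the stream layout (an int count followed by UTF strings) is an assumption made for illustration. Only the 10,000,000 default, the OPENNLP_MAX_ENTRIES property name, the fallback behaviour, and the use of IllegalArgumentException come from the advisory.

```java
import java.io.DataInputStream;
import java.io.IOException;

/** Hypothetical helper illustrating the check; not the actual OpenNLP code. */
final class BoundedCounts {

    /** Default bound stated in the advisory. */
    static final int DEFAULT_MAX_ENTRIES = 10_000_000;

    /** Parses a configured limit; invalid or non-positive values fall back to the default. */
    static int parseLimit(String raw) {
        if (raw == null) {
            return DEFAULT_MAX_ENTRIES;
        }
        try {
            int value = Integer.parseInt(raw.trim());
            return value > 0 ? value : DEFAULT_MAX_ENTRIES;
        } catch (NumberFormatException e) {
            return DEFAULT_MAX_ENTRIES;
        }
    }

    /** Reads the limit from the system property named in the advisory. */
    static int configuredLimit() {
        return parseLimit(System.getProperty("OPENNLP_MAX_ENTRIES"));
    }

    /** Rejects negative or oversized counts before any array is allocated. */
    static int checkCount(int count, int limit, String field) {
        if (count < 0 || count > limit) {
            throw new IllegalArgumentException(
                "Invalid " + field + " count " + count + " (limit " + limit + ")");
        }
        return count;
    }

    /** Reads a count-prefixed list of strings, validating the count first (assumed layout). */
    static String[] readLabels(DataInputStream in, int limit) throws IOException {
        int n = checkCount(in.readInt(), limit, "outcome");
        String[] labels = new String[n]; // allocation happens only after validation
        for (int i = 0; i < n; i++) {
            labels[i] = in.readUTF();
        }
        return labels;
    }
}
```

The key design point is ordering: the count is validated before `new String[n]` runs, so a count of Integer.MAX_VALUE fails fast with an exception instead of reaching the allocator.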
Severity
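No CVSS data available from the CNA (Apache describes the severity as "moderate"). The CISA-ADP container in the record below lists a CVSS 3.1 base score of 7.5 (HIGH), vector CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:H.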
CWE
  • CWE-789 - Memory Allocation with Excessive Size Value
Impacted products
Vendor
Apache Software Foundation
Product
Apache OpenNLP
Version
Affected: 0, < 2.5.9 (semver)
Affected: 3.0, < 3.0.0-M3 (semver)
Credits
Subramanian S

{
  "containers": {
    "adp": [
      {
        "providerMetadata": {
          "dateUpdated": "2026-05-04T17:37:00.275Z",
          "orgId": "af854a3a-2127-422b-91ae-364da2661108",
          "shortName": "CVE"
        },
        "references": [
          {
            "url": "http://www.openwall.com/lists/oss-security/2026/05/01/21"
          }
        ],
        "title": "CVE Program Container"
      },
      {
        "metrics": [
          {
            "cvssV3_1": {
              "attackComplexity": "LOW",
              "attackVector": "NETWORK",
              "availabilityImpact": "HIGH",
              "baseScore": 7.5,
              "baseSeverity": "HIGH",
              "confidentialityImpact": "NONE",
              "integrityImpact": "NONE",
              "privilegesRequired": "NONE",
              "scope": "UNCHANGED",
              "userInteraction": "NONE",
              "vectorString": "CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:H",
              "version": "3.1"
            }
          },
          {
            "other": {
              "content": {
                "id": "CVE-2026-42440",
                "options": [
                  {
                    "Exploitation": "none"
                  },
                  {
                    "Automatable": "yes"
                  },
                  {
                    "Technical Impact": "partial"
                  }
                ],
                "role": "CISA Coordinator",
                "timestamp": "2026-05-05T16:00:26.146388Z",
                "version": "2.0.3"
              },
              "type": "ssvc"
            }
          }
        ],
        "providerMetadata": {
          "dateUpdated": "2026-05-05T16:03:03.237Z",
          "orgId": "134c704f-9b21-4f2e-91b3-4a467353bcc0",
          "shortName": "CISA-ADP"
        },
        "title": "CISA ADP Vulnrichment"
      }
    ],
    "cna": {
      "affected": [
        {
          "collectionURL": "https://repo.maven.apache.org/maven2",
          "defaultStatus": "unaffected",
          "packageName": "org.apache.opennlp:opennlp-tools",
          "product": "Apache OpenNLP",
          "vendor": "Apache Software Foundation",
          "versions": [
            {
              "lessThan": "2.5.9",
              "status": "affected",
              "version": "0",
              "versionType": "semver"
            },
            {
              "lessThan": "3.0.0-M3",
              "status": "affected",
              "version": "3.0",
              "versionType": "semver"
            }
          ]
        }
      ],
      "credits": [
        {
          "lang": "en",
          "type": "finder",
          "value": "Subramanian S"
        }
      ],
      "descriptions": [
        {
          "lang": "en",
          "supportingMedia": [
            {
              "base64": false,
              "type": "text/html",
              "value": "\u003cp\u003e\u003cb\u003eOOM Denial of Service via Unbounded Array Allocation in Apache OpenNLP AbstractModelReader\u0026nbsp;\u003c/b\u003e\u003c/p\u003e\u003cp\u003e\u003cb\u003eVersions Affected:\u003c/b\u003e\u0026nbsp;\u003c/p\u003e\u003cp\u003ebefore 2.5.9\u003c/p\u003e\u003cp\u003ebefore 3.0.0-M3\u0026nbsp;\u003c/p\u003e\u003cp\u003e\u003cb\u003eDescription:\u003c/b\u003e\u003c/p\u003e\n\u003cp\u003eThe \u003ccode\u003eAbstractModelReader\u003c/code\u003e methods \u003ccode\u003egetOutcomes()\u003c/code\u003e, \u003ccode\u003egetOutcomePatterns()\u003c/code\u003e, and \u003ccode\u003egetPredicates()\u003c/code\u003e each read a 32-bit signed integer count field from a binary model stream and pass that value directly to an array allocation (\u003ccode\u003enew String[numOutcomes]\u003c/code\u003e, \u003ccode\u003enew int[numOCTypes][]\u003c/code\u003e, \u003ccode\u003enew String[NUM_PREDS]\u003c/code\u003e) without validating that the value is non-negative or within a reasonable bound. The count is therefore fully attacker-controlled when the model file originates from an untrusted source.\u003c/p\u003e\n\u003cp\u003eA crafted \u003ccode\u003e.bin\u003c/code\u003e model file in which any of these count fields is set to \u003ccode\u003eInteger.MAX_VALUE\u003c/code\u003e (or any value large enough to exhaust the available heap) triggers an \u003ccode\u003eOutOfMemoryError\u003c/code\u003e at the array allocation itself, before the corresponding label or pattern data is consumed from the stream. The error occurs very early in deserialization: for a GIS model, \u003ccode\u003egetOutcomes()\u003c/code\u003e is reached after only the model-type string, the correction constant, and the correction parameter have been read; so the attacker pays no meaningful size cost to weaponize a payload, and a single small file can crash a JVM that loads it. 
Any code path that deserializes a \u003ccode\u003e.bin\u003c/code\u003e model is affected, including direct use of \u003ccode\u003eGenericModelReader\u003c/code\u003e and any higher-level component that delegates to it during model load.\u003c/p\u003e\n\u003cp\u003eThe practical impact is denial of service against processes that load model files from untrusted or semi-trusted origins.\u0026nbsp;\u0026nbsp;\u003c/p\u003e\n\u003cp\u003e\u003cb\u003eMitigation:\u003c/b\u003e\u003c/p\u003e\n\u003cul\u003e\n\u003cli\u003e2.x users should upgrade to 2.5.9.\u003c/li\u003e\n\u003cli\u003e3.x users should upgrade to 3.0.0-M3.\u003c/li\u003e\n\u003c/ul\u003e\n\u003cp\u003e\u003cb\u003eNote:\u003c/b\u003e The fix introduces an upper bound on each of the three count fields, checked before array allocation; counts that are negative or exceed the bound cause an \u003ccode\u003eIllegalArgumentException\u003c/code\u003e to be thrown and the read to fail fast with no large allocation. The default bound is 10,000,000, which is well above the entry counts of legitimate OpenNLP models but far below any value that would threaten heap exhaustion. Deployments that legitimately need to load models with more entries than the default can raise the limit at JVM startup by setting the \u003ccode\u003eOPENNLP_MAX_ENTRIES\u003c/code\u003e system property to the desired positive integer (e.g. \u003ccode\u003e-DOPENNLP_MAX_ENTRIES=50000000\u003c/code\u003e); invalid or non-positive values fall back to the default.\u003c/p\u003e\n\u003cp\u003eUsers who cannot upgrade immediately should treat all \u003ccode\u003e.bin\u003c/code\u003e model files as untrusted input unless their provenance is verified, and should avoid loading models supplied by end users or fetched from third-party repositories without integrity checks.\u0026nbsp;\u003c/p\u003e"
            }
          ],
          "value": "OOM Denial of Service via Unbounded Array Allocation in Apache OpenNLP AbstractModelReader\u00a0\n\nVersions Affected:\u00a0\n\nbefore 2.5.9\n\nbefore 3.0.0-M3\u00a0\n\nDescription:\n\n\nThe AbstractModelReader methods getOutcomes(), getOutcomePatterns(), and getPredicates() each read a 32-bit signed integer count field from a binary model stream and pass that value directly to an array allocation (new String[numOutcomes], new int[numOCTypes][], new String[NUM_PREDS]) without validating that the value is non-negative or within a reasonable bound. The count is therefore fully attacker-controlled when the model file originates from an untrusted source.\n\n\nA crafted .bin model file in which any of these count fields is set to Integer.MAX_VALUE (or any value large enough to exhaust the available heap) triggers an OutOfMemoryError at the array allocation itself, before the corresponding label or pattern data is consumed from the stream. The error occurs very early in deserialization: for a GIS model, getOutcomes() is reached after only the model-type string, the correction constant, and the correction parameter have been read; so the attacker pays no meaningful size cost to weaponize a payload, and a single small file can crash a JVM that loads it. Any code path that deserializes a .bin model is affected, including direct use of GenericModelReader and any higher-level component that delegates to it during model load.\n\n\nThe practical impact is denial of service against processes that load model files from untrusted or semi-trusted origins.\u00a0\u00a0\n\n\nMitigation:\n\n\n\n  *  2.x users should upgrade to 2.5.9.\n\n  *  3.x users should upgrade to 3.0.0-M3.\n\n\n\n\nNote: The fix introduces an upper bound on each of the three count fields, checked before array allocation; counts that are negative or exceed the bound cause an IllegalArgumentException to be thrown and the read to fail fast with no large allocation. 
The default bound is 10,000,000, which is well above the entry counts of legitimate OpenNLP models but far below any value that would threaten heap exhaustion. Deployments that legitimately need to load models with more entries than the default can raise the limit at JVM startup by setting the OPENNLP_MAX_ENTRIES system property to the desired positive integer (e.g. -DOPENNLP_MAX_ENTRIES=50000000); invalid or non-positive values fall back to the default.\n\n\nUsers who cannot upgrade immediately should treat all .bin model files as untrusted input unless their provenance is verified, and should avoid loading models supplied by end users or fetched from third-party repositories without integrity checks."
        }
      ],
      "metrics": [
        {
          "other": {
            "content": {
              "text": "moderate"
            },
            "type": "Textual description of severity"
          }
        }
      ],
      "problemTypes": [
        {
          "descriptions": [
            {
              "cweId": "CWE-789",
              "description": "CWE-789: Memory Allocation with Excessive Size Value",
              "lang": "en",
              "type": "CWE"
            }
          ]
        }
      ],
      "providerMetadata": {
        "dateUpdated": "2026-05-04T16:40:32.503Z",
        "orgId": "f0158376-9dc2-43b6-827c-5f631a4d8d09",
        "shortName": "apache"
      },
      "references": [
        {
          "tags": [
            "vendor-advisory"
          ],
          "url": "https://lists.apache.org/thread/s8xlkx1gqbxfsq48py5h6jphjvgqp1jo"
        }
      ],
      "source": {
        "defect": [
          "OPENNLP-1821"
        ],
        "discovery": "UNKNOWN"
      },
      "title": "Apache OpenNLP: OOM DoS via Unbounded Array Allocation in AbstractModelReader",
      "x_generator": {
        "engine": "Vulnogram 0.2.0"
      }
    }
  },
  "cveMetadata": {
    "assignerOrgId": "f0158376-9dc2-43b6-827c-5f631a4d8d09",
    "assignerShortName": "apache",
    "cveId": "CVE-2026-42440",
    "datePublished": "2026-05-04T16:40:32.503Z",
    "dateReserved": "2026-04-27T12:43:14.347Z",
    "dateUpdated": "2026-05-05T16:03:03.237Z",
    "state": "PUBLISHED"
  },
  "dataType": "CVE_RECORD",
  "dataVersion": "5.2"
}

CVE-2026-42027 (GCVE-0-2026-42027)

Vulnerability from nvd – Published: 2026-05-04 16:43 – Updated: 2026-05-05 16:02
Title
Apache OpenNLP: Arbitrary Class Instantiation via Model Manifest in ExtensionLoader
Summary
Arbitrary Class Instantiation via Model Manifest in Apache OpenNLP ExtensionLoader

Versions Affected: before 2.5.9, before 3.0.0-M3

Description:

The ExtensionLoader.instantiateExtension(Class, String) method loads a class by its fully-qualified name via Class.forName() and invokes its no-arg constructor, with the class name sourced from the manifest.properties entry of a model archive. The existing isAssignableFrom check correctly rejects classes that are not subtypes of the expected extension interface (BaseToolFactory for factory=, ArtifactSerializer for serializer-class-*), but the check runs after Class.forName() has already loaded and initialized the named class.

Class.forName() with default initialization semantics executes the target class's static initializer before returning, so an attacker who can supply a crafted model archive can cause the static initializer of any class on the classpath to run during model loading, regardless of whether that class passes the subsequent type check.

Exploitation requires a class with attacker-useful side effects in its static initializer (for example, JNDI lookup, outbound network I/O, or filesystem access) to be present on the classpath, so this is not a drop-in remote code execution; however, the attack surface grows as third-party model distribution becomes more common (community model repositories, Hugging Face-style sharing), where users routinely load model files from origins they do not control. A secondary, narrower vector affects deployments that ship legitimate BaseToolFactory or ArtifactSerializer subclasses with side-effecting no-arg constructors: a malicious manifest can name such a class and force its constructor to run during model load.

Mitigation:

  • 2.x users should upgrade to 2.5.9.
  • 3.x users should upgrade to 3.0.0-M3.

Note: The fix introduces a package-prefix allowlist that is consulted before Class.forName() is invoked, so the static initializer of a disallowed class is never executed. Classes under the opennlp. prefix remain permitted by default. Deployments that load models referencing factories or serializers outside opennlp.* must opt those packages in, either programmatically via ExtensionLoader.registerAllowedPackage(String) before the first model load, or by setting the OPENNLP_EXT_ALLOWED_PACKAGES system property to a comma-separated list of allowed package prefixes.

Users who cannot upgrade immediately should ensure that all model files are sourced from trusted origins and should audit their classpath for classes with side-effecting static initializers or constructors, particularly any that perform JNDI lookups, network requests, or filesystem operations during class initialization.
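The ordering problem described above (type check after class initialization) and the allowlist-first approach in the Note can be sketched as follows. This is an illustration under stated assumptions, not the actual ExtensionLoader code: the class AllowlistedLoader and its methods are hypothetical names. Loading with initialize=false via the three-argument Class.forName is an extra precaution chosen for this sketch; the advisory only states that the allowlist is consulted before Class.forName() is invoked.

```java
import java.util.Set;
import java.util.concurrent.CopyOnWriteArraySet;

/** Hypothetical sketch of allowlist-before-load; not the actual OpenNLP code. */
final class AllowlistedLoader {

    /** Allowed package prefixes; "opennlp." is the default named in the advisory. */
    private static final Set<String> ALLOWED =
        new CopyOnWriteArraySet<>(Set.of("opennlp."));

    /** Opts an additional package prefix in (normalised to end with a dot). */
    static void allowPackage(String prefix) {
        ALLOWED.add(prefix.endsWith(".") ? prefix : prefix + ".");
    }

    /** Pure string check: no class is loaded or initialized here. */
    static boolean isAllowed(String className) {
        for (String prefix : ALLOWED) {
            if (className.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    static <T> T instantiate(Class<T> expected, String className) {
        // 1. Check the name against the allowlist before touching the class loader.
        if (!isAllowed(className)) {
            throw new IllegalArgumentException("Package not allowed: " + className);
        }
        try {
            // 2. Load without initialization (assumption of this sketch), so no
            //    static initializer has run when the type check happens.
            Class<?> loaded = Class.forName(
                className, false, AllowlistedLoader.class.getClassLoader());
            // 3. Type check, then 4. construct; initialization happens only here.
            if (!expected.isAssignableFrom(loaded)) {
                throw new IllegalArgumentException(
                    className + " is not a " + expected.getName());
            }
            return expected.cast(loaded.getDeclaredConstructor().newInstance());
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Could not instantiate " + className, e);
        }
    }
}
```

With this ordering, a manifest naming a class outside the allowed prefixes is rejected on the string alone, and a class inside an allowed prefix but of the wrong type is rejected before its static initializer or constructor runs.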
Severity
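No CVSS data available from the CNA (Apache describes the severity as "moderate"). The CISA-ADP container in the record below lists a CVSS 3.1 base score of 9.8 (CRITICAL), vector CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H.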
CWE
  • CWE-470 - Use of Externally-Controlled Input to Select Classes or Code ('Unsafe Reflection')
Impacted products
Vendor
Apache Software Foundation
Product
Apache OpenNLP
Version
Affected: 0, < 2.5.9 (semver)
Affected: 3.0, < 3.0.0-M3 (semver)
Credits
Subramanian S

{
  "containers": {
    "adp": [
      {
        "providerMetadata": {
          "dateUpdated": "2026-05-04T17:36:56.492Z",
          "orgId": "af854a3a-2127-422b-91ae-364da2661108",
          "shortName": "CVE"
        },
        "references": [
          {
            "url": "http://www.openwall.com/lists/oss-security/2026/05/01/20"
          }
        ],
        "title": "CVE Program Container"
      },
      {
        "metrics": [
          {
            "cvssV3_1": {
              "attackComplexity": "LOW",
              "attackVector": "NETWORK",
              "availabilityImpact": "HIGH",
              "baseScore": 9.8,
              "baseSeverity": "CRITICAL",
              "confidentialityImpact": "HIGH",
              "integrityImpact": "HIGH",
              "privilegesRequired": "NONE",
              "scope": "UNCHANGED",
              "userInteraction": "NONE",
              "vectorString": "CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H",
              "version": "3.1"
            }
          },
          {
            "other": {
              "content": {
                "id": "CVE-2026-42027",
                "options": [
                  {
                    "Exploitation": "none"
                  },
                  {
                    "Automatable": "yes"
                  },
                  {
                    "Technical Impact": "total"
                  }
                ],
                "role": "CISA Coordinator",
                "timestamp": "2026-05-05T16:01:56.421468Z",
                "version": "2.0.3"
              },
              "type": "ssvc"
            }
          }
        ],
        "providerMetadata": {
          "dateUpdated": "2026-05-05T16:02:56.683Z",
          "orgId": "134c704f-9b21-4f2e-91b3-4a467353bcc0",
          "shortName": "CISA-ADP"
        },
        "title": "CISA ADP Vulnrichment"
      }
    ],
    "cna": {
      "affected": [
        {
          "collectionURL": "https://repo.maven.apache.org/maven2/",
          "defaultStatus": "unaffected",
          "packageName": "org.apache.opennlp:opennlp-tools",
          "product": "Apache OpenNLP",
          "vendor": "Apache Software Foundation",
          "versions": [
            {
              "lessThan": "2.5.9",
              "status": "affected",
              "version": "0",
              "versionType": "semver"
            },
            {
              "lessThan": "3.0.0-M3",
              "status": "affected",
              "version": "3.0",
              "versionType": "semver"
            }
          ]
        }
      ],
      "credits": [
        {
          "lang": "en",
          "type": "finder",
          "value": "Subramanian S"
        }
      ],
      "descriptions": [
        {
          "lang": "en",
          "supportingMedia": [
            {
              "base64": false,
              "type": "text/html",
              "value": "\u003cdiv\u003e\u003cdiv\u003e\u003cp\u003e\u003cb\u003eArbitrary Class Instantiation via Model Manifest in Apache OpenNLP ExtensionLoader\u003c/b\u003e\u003c/p\u003e\u003c/div\u003e\u003c/div\u003e\u003cdiv\u003e\u003cdiv\u003e\u003cp\u003e\u003cb\u003eVersions Affected: \u003c/b\u003ebefore 2.5.9, before 3.0.0-M3\u003c/p\u003e\u003c/div\u003e\u003c/div\u003e\u003cdiv\u003e\u003cdiv\u003e\u003cp\u003e\u003cb\u003eDescription:\u003c/b\u003e\u0026nbsp;\u003c/p\u003e\u003cp\u003eThe \u003ccode\u003eExtensionLoader.instantiateExtension(Class, String)\u003c/code\u003e\u0026nbsp;method loads a class by its fully-qualified name via \u003ccode\u003eClass.forName()\u003c/code\u003e\u0026nbsp;and invokes its no-arg constructor, with the class name sourced from the \u003ccode\u003emanifest.properties\u003c/code\u003e\u0026nbsp;entry of a model archive. The existing \u003ccode\u003eisAssignableFrom\u003c/code\u003e\u0026nbsp;check correctly rejects classes that are not subtypes of the expected extension interface (\u003ccode\u003eBaseToolFactory\u003c/code\u003e\u0026nbsp;for \u003ccode\u003efactory=\u003c/code\u003e, \u003ccode\u003eArtifactSerializer\u003c/code\u003e\u0026nbsp;for \u003ccode\u003eserializer-class-*\u003c/code\u003e), but the check runs \u003cem\u003e\u003cb\u003eafter\u003c/b\u003e\u003c/em\u003e\u0026nbsp;\u003ccode\u003eClass.forName()\u003c/code\u003e\u0026nbsp;has already loaded and initialized the named class. \u003c/p\u003e\u003cp\u003e\u003ccode\u003eClass.forName()\u003c/code\u003e\u0026nbsp;with default initialization semantics executes the target class\u0027s static initializer before returning, so an attacker who can supply a crafted model archive can cause the static initializer of any class on the classpath to run during model loading, regardless of whether that class passes the subsequent type check. 
\u003c/p\u003e\u003cp\u003eExploitation requires a class with attacker-useful side effects in its static initializer (for example, JNDI lookup, outbound network I/O, or filesystem access) to be \u003cb\u003epresent on the classpath\u003c/b\u003e, so this is not a drop-in remote code execution; however, the attack surface grows as third-party model distribution becomes more common (community model repositories, Hugging Face-style sharing), where users routinely load model files from origins they do not control. A secondary, narrower vector affects deployments that ship legitimate \u003ccode\u003eBaseToolFactory\u003c/code\u003e\u0026nbsp;or \u003ccode\u003eArtifactSerializer\u003c/code\u003e\u0026nbsp;subclasses with side-effecting no-arg constructors: a malicious manifest can name such a class and force its constructor to run during model load.\u003c/p\u003e\u003c/div\u003e\u003c/div\u003e\u003cdiv\u003e\u003cdiv\u003e\u003cp\u003e\u003cb\u003eMitigation:\u003c/b\u003e\u0026nbsp;\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003cul\u003e\u003cli\u003e2.x users should upgrade to 2.5.9. \u003c/li\u003e\u003cli\u003e3.x users should upgrade to 3.0.0-M3. \u003c/li\u003e\u003c/ul\u003e\u003cp\u003e\u003c/p\u003e\u003cp\u003eNote: The fix introduces a package-prefix allowlist that is consulted before \u003ccode\u003eClass.forName()\u003c/code\u003e\u0026nbsp;is invoked, so the static initializer of a disallowed class is never executed. Classes under the \u003ccode\u003eopennlp.\u003c/code\u003e\u0026nbsp;prefix remain permitted by default. 
Deployments that load models referencing factories or serializers outside \u003ccode\u003eopennlp.*\u003c/code\u003e\u0026nbsp;must opt those packages in, either programmatically via \u003ccode\u003eExtensionLoader.registerAllowedPackage(String)\u003c/code\u003e\u0026nbsp;before the first model load, or by setting the \u003ccode\u003e\u003cb\u003eOPENNLP_EXT_ALLOWED_PACKAGES\u003c/b\u003e\u003c/code\u003e\u0026nbsp;system property to a comma-separated list of allowed package prefixes. \u003c/p\u003e\u003cp\u003eUsers who cannot upgrade immediately should ensure that all model files are sourced from \u003cb\u003etrusted origins\u003c/b\u003e\u0026nbsp;and should audit their classpath for classes with side-effecting static initializers or constructors, particularly any that perform JNDI lookups, network requests, or filesystem operations during class initialization.\u003c/p\u003e\u003c/div\u003e\u003c/div\u003e\u003cbr\u003e\u003cbr\u003e"
            }
          ],
          "value": "Arbitrary Class Instantiation via Model Manifest in Apache OpenNLP ExtensionLoader\n\n\n\n\n\nVersions Affected: before 2.5.9, before 3.0.0-M3\n\n\n\n\n\nDescription:\u00a0\n\nThe ExtensionLoader.instantiateExtension(Class, String)\u00a0method loads a class by its fully-qualified name via Class.forName()\u00a0and invokes its no-arg constructor, with the class name sourced from the manifest.properties\u00a0entry of a model archive. The existing isAssignableFrom\u00a0check correctly rejects classes that are not subtypes of the expected extension interface (BaseToolFactory\u00a0for factory=, ArtifactSerializer\u00a0for serializer-class-*), but the check runs after\u00a0Class.forName()\u00a0has already loaded and initialized the named class. \n\nClass.forName()\u00a0with default initialization semantics executes the target class\u0027s static initializer before returning, so an attacker who can supply a crafted model archive can cause the static initializer of any class on the classpath to run during model loading, regardless of whether that class passes the subsequent type check. \n\nExploitation requires a class with attacker-useful side effects in its static initializer (for example, JNDI lookup, outbound network I/O, or filesystem access) to be present on the classpath, so this is not a drop-in remote code execution; however, the attack surface grows as third-party model distribution becomes more common (community model repositories, Hugging Face-style sharing), where users routinely load model files from origins they do not control. A secondary, narrower vector affects deployments that ship legitimate BaseToolFactory\u00a0or ArtifactSerializer\u00a0subclasses with side-effecting no-arg constructors: a malicious manifest can name such a class and force its constructor to run during model load.\n\n\n\n\n\nMitigation:\u00a0\n\n\n\n  *  2.x users should upgrade to 2.5.9. \n  *  3.x users should upgrade to 3.0.0-M3. 
\n\n\n\n\nNote: The fix introduces a package-prefix allowlist that is consulted before Class.forName()\u00a0is invoked, so the static initializer of a disallowed class is never executed. Classes under the opennlp.\u00a0prefix remain permitted by default. Deployments that load models referencing factories or serializers outside opennlp.*\u00a0must opt those packages in, either programmatically via ExtensionLoader.registerAllowedPackage(String)\u00a0before the first model load, or by setting the OPENNLP_EXT_ALLOWED_PACKAGES\u00a0system property to a comma-separated list of allowed package prefixes. \n\nUsers who cannot upgrade immediately should ensure that all model files are sourced from trusted origins\u00a0and should audit their classpath for classes with side-effecting static initializers or constructors, particularly any that perform JNDI lookups, network requests, or filesystem operations during class initialization."
        }
      ],
      "metrics": [
        {
          "other": {
            "content": {
              "text": "moderate"
            },
            "type": "Textual description of severity"
          }
        }
      ],
      "problemTypes": [
        {
          "descriptions": [
            {
              "cweId": "CWE-470",
              "description": "CWE-470 Use of Externally-Controlled Input to Select Classes or Code (\u0027Unsafe Reflection\u0027)",
              "lang": "en",
              "type": "CWE"
            }
          ]
        }
      ],
      "providerMetadata": {
        "dateUpdated": "2026-05-04T16:43:12.583Z",
        "orgId": "f0158376-9dc2-43b6-827c-5f631a4d8d09",
        "shortName": "apache"
      },
      "references": [
        {
          "tags": [
            "vendor-advisory"
          ],
          "url": "https://lists.apache.org/thread/ltlo4powjfc0w2w2yyl1o5tc7q1gcb2y"
        }
      ],
      "source": {
        "defect": [
          "OPENNLP-1820"
        ],
        "discovery": "UNKNOWN"
      },
      "title": "Apache OpenNLP: Arbitrary Class Instantiation via Model Manifest in ExtensionLoader",
      "x_generator": {
        "engine": "Vulnogram 0.2.0"
      }
    }
  },
  "cveMetadata": {
    "assignerOrgId": "f0158376-9dc2-43b6-827c-5f631a4d8d09",
    "assignerShortName": "apache",
    "cveId": "CVE-2026-42027",
    "datePublished": "2026-05-04T16:43:12.583Z",
    "dateReserved": "2026-04-23T14:21:25.317Z",
    "dateUpdated": "2026-05-05T16:02:56.683Z",
    "state": "PUBLISHED"
  },
  "dataType": "CVE_RECORD",
  "dataVersion": "5.2"
}

CVE-2026-40682 (GCVE-0-2026-40682)

Vulnerability from nvd – Published: 2026-05-04 16:55 – Updated: 2026-05-05 15:02
Title
Apache OpenNLP: XXE via Dictionary Parsing in DictionaryEntryPersistor
Summary
XML External Entity (XXE) via Unsanitized Dictionary Parsing in Apache OpenNLP DictionaryEntryPersistor

Versions Affected: before 2.5.9, before 3.0.0-M3

Description:

The DictionaryEntryPersistor class initializes a static SAXParserFactory at class-load time without enabling FEATURE_SECURE_PROCESSING or disabling DTD processing. When create(InputStream, EntryInserter) is invoked, the only feature set on the XMLReader is namespace support; external entity resolution and DOCTYPE declarations remain fully enabled. An attacker who can supply a crafted dictionary file (e.g., a stop-word list or domain dictionary) containing a malicious DOCTYPE declaration can trigger local file disclosure via file:// entity references or server-side request forgery via http:// entity references during SAX parsing, before the application processes a single dictionary entry.

This is inconsistent with the project's own XmlUtil.createSaxParser() helper, which correctly sets FEATURE_SECURE_PROCESSING and disallow-doctype-decl and is used by all other XML parsing paths in the codebase. The public Dictionary(InputStream) constructor delegates directly to this method and is the documented API for loading user-supplied dictionaries, making untrusted input a realistic scenario.

Mitigation:

  • 2.x users should upgrade to 2.5.9.
  • 3.x users should upgrade to 3.0.0-M3.

Users who cannot upgrade immediately should ensure that all dictionary files are sourced from trusted origins and should consider wrapping the Dictionary(InputStream) constructor with input validation that rejects any XML containing a DOCTYPE declaration before it reaches the parser.
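The two settings the advisory names, FEATURE_SECURE_PROCESSING and disallow-doctype-decl, can be applied to a JAXP SAX parser as in the sketch below. The class HardenedSax and its methods are hypothetical names for illustration; this is neither the OpenNLP fix nor a copy of XmlUtil.createSaxParser(). The feature URI is the Xerces one supported by the JDK's built-in parser.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import javax.xml.XMLConstants;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical sketch of a hardened SAX setup; not the actual OpenNLP code. */
final class HardenedSax {

    private static final String DISALLOW_DOCTYPE =
        "http://apache.org/xml/features/disallow-doctype-decl";

    /** Builds an XMLReader that rejects any document containing a DOCTYPE. */
    static XMLReader newReader() {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
            factory.setFeature(DISALLOW_DOCTYPE, true);
            return factory.newSAXParser().getXMLReader();
        } catch (ParserConfigurationException | SAXException e) {
            throw new IllegalStateException("Could not configure SAX parser", e);
        }
    }

    /** Returns true if the bytes parse under the hardened settings, false otherwise. */
    static boolean isSafeXml(byte[] xml) {
        try {
            XMLReader reader = newReader();
            reader.setContentHandler(new DefaultHandler());
            reader.parse(new InputSource(new ByteArrayInputStream(xml)));
            return true;
        } catch (SAXException | IOException e) {
            // A DOCTYPE declaration (or malformed XML) ends up here.
            return false;
        }
    }
}
```

For deployments that cannot upgrade, an application-side wrapper could buffer the dictionary bytes, run a check like isSafeXml over them, and pass them to Dictionary(InputStream) only if the check succeeds. Parsing with the parser's own DOCTYPE rejection is less fragile than scanning the raw text for the string "DOCTYPE", which can miss alternate encodings.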
Severity
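No CVSS data available from the CNA. The CISA-ADP container in the record below lists a CVSS 3.1 base score of 9.1 (CRITICAL), vector CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:N.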
CWE
  • CWE-611 - Improper Restriction of XML External Entity Reference
Impacted products
Vendor
Apache Software Foundation
Product
Apache OpenNLP
Version
Affected: 0, < 2.5.9 (semver)
Affected: 3.0, < 3.0.0-M3 (semver)
Credits
Subramanian S

{
  "containers": {
    "adp": [
      {
        "providerMetadata": {
          "dateUpdated": "2026-05-04T17:36:52.681Z",
          "orgId": "af854a3a-2127-422b-91ae-364da2661108",
          "shortName": "CVE"
        },
        "references": [
          {
            "url": "http://www.openwall.com/lists/oss-security/2026/05/01/19"
          }
        ],
        "title": "CVE Program Container"
      },
      {
        "metrics": [
          {
            "cvssV3_1": {
              "attackComplexity": "LOW",
              "attackVector": "NETWORK",
              "availabilityImpact": "NONE",
              "baseScore": 9.1,
              "baseSeverity": "CRITICAL",
              "confidentialityImpact": "HIGH",
              "integrityImpact": "HIGH",
              "privilegesRequired": "NONE",
              "scope": "UNCHANGED",
              "userInteraction": "NONE",
              "vectorString": "CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:N",
              "version": "3.1"
            }
          },
          {
            "other": {
              "content": {
                "id": "CVE-2026-40682",
                "options": [
                  {
                    "Exploitation": "none"
                  },
                  {
                    "Automatable": "yes"
                  },
                  {
                    "Technical Impact": "total"
                  }
                ],
                "role": "CISA Coordinator",
                "timestamp": "2026-05-05T15:01:49.614474Z",
                "version": "2.0.3"
              },
              "type": "ssvc"
            }
          }
        ],
        "providerMetadata": {
          "dateUpdated": "2026-05-05T15:02:14.483Z",
          "orgId": "134c704f-9b21-4f2e-91b3-4a467353bcc0",
          "shortName": "CISA-ADP"
        },
        "title": "CISA ADP Vulnrichment"
      }
    ],
    "cna": {
      "affected": [
        {
          "collectionURL": "https://repo.maven.apache.org/maven2/",
          "defaultStatus": "unaffected",
          "packageName": "org.apache.opennlp:opennlp-tools",
          "product": "Apache OpenNLP",
          "vendor": "Apache Software Foundation",
          "versions": [
            {
              "lessThan": "2.5.9",
              "status": "affected",
              "version": "0",
              "versionType": "semver"
            },
            {
              "lessThan": "3.0.0-M3",
              "status": "affected",
              "version": "3.0",
              "versionType": "semver"
            }
          ]
        }
      ],
      "credits": [
        {
          "lang": "en",
          "type": "finder",
          "value": "Subramanian S"
        }
      ],
      "descriptions": [
        {
          "lang": "en",
          "supportingMedia": [
            {
              "base64": false,
              "type": "text/html",
              "value": "\u003cp\u003e\u003cstrong\u003eXML External Entity (XXE) via Unsanitized Dictionary Parsing in Apache OpenNLP DictionaryEntryPersistor\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eVersions Affected:\u003c/strong\u003e before 2.5.9, before 3.0.0-M3\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eDescription:\u003c/strong\u003e The \u003ccode\u003eDictionaryEntryPersistor\u003c/code\u003e class initializes a static \u003ccode\u003eSAXParserFactory\u003c/code\u003e at class-load time without enabling \u003ccode\u003eFEATURE_SECURE_PROCESSING\u003c/code\u003e or disabling DTD processing. When \u003ccode\u003ecreate(InputStream, EntryInserter)\u003c/code\u003e is invoked, the only feature set on the \u003ccode\u003eXMLReader\u003c/code\u003e is namespace support \u2014 external entity resolution and DOCTYPE declarations remain fully enabled. An attacker who can supply a crafted dictionary file (e.g., a stop-word list or domain dictionary) containing a malicious DOCTYPE declaration can trigger local file disclosure via \u003ccode\u003efile://\u003c/code\u003e entity references or server-side request forgery via \u003ccode\u003ehttp://\u003c/code\u003e entity references during SAX parsing, before the application processes a single dictionary entry. This is inconsistent with the project\u0027s own \u003ccode\u003eXmlUtil.createSaxParser()\u003c/code\u003e helper, which correctly sets \u003ccode\u003eFEATURE_SECURE_PROCESSING\u003c/code\u003e and \u003ccode\u003edisallow-doctype-decl\u003c/code\u003e and is used by all other XML parsing paths in the codebase. The public \u003ccode\u003eDictionary(InputStream)\u003c/code\u003e constructor delegates directly to this method and is the documented API for loading user-supplied dictionaries, making untrusted input a realistic scenario.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eMitigation:\u003c/strong\u003e 2.x users should upgrade to 2.5.9. 3.x users should upgrade to 3.0.0-M3. 
Users who cannot upgrade immediately should ensure that all dictionary files are sourced from trusted origins and should consider wrapping the \u003ccode\u003eDictionary(InputStream)\u003c/code\u003e constructor with input validation that rejects any XML containing a DOCTYPE declaration before it reaches the parser.\u003cbr\u003e\u003c/p\u003e"
            }
          ],
          "value": "XML External Entity (XXE) via Unsanitized Dictionary Parsing in Apache OpenNLP DictionaryEntryPersistor\n\n\nVersions Affected: before 2.5.9, before 3.0.0-M3\n\n\nDescription: The DictionaryEntryPersistor class initializes a static SAXParserFactory at class-load time without enabling FEATURE_SECURE_PROCESSING or disabling DTD processing. When create(InputStream, EntryInserter) is invoked, the only feature set on the XMLReader is namespace support \u2014 external entity resolution and DOCTYPE declarations remain fully enabled. An attacker who can supply a crafted dictionary file (e.g., a stop-word list or domain dictionary) containing a malicious DOCTYPE declaration can trigger local file disclosure via file:// entity references or server-side request forgery via http:// entity references during SAX parsing, before the application processes a single dictionary entry. This is inconsistent with the project\u0027s own XmlUtil.createSaxParser() helper, which correctly sets FEATURE_SECURE_PROCESSING and disallow-doctype-decl and is used by all other XML parsing paths in the codebase. The public Dictionary(InputStream) constructor delegates directly to this method and is the documented API for loading user-supplied dictionaries, making untrusted input a realistic scenario.\n\n\nMitigation: 2.x users should upgrade to 2.5.9. 3.x users should upgrade to 3.0.0-M3. Users who cannot upgrade immediately should ensure that all dictionary files are sourced from trusted origins and should consider wrapping the Dictionary(InputStream) constructor with input validation that rejects any XML containing a DOCTYPE declaration before it reaches the parser."
        }
      ],
      "metrics": [
        {
          "other": {
            "content": {
              "text": "moderate"
            },
            "type": "Textual description of severity"
          }
        }
      ],
      "problemTypes": [
        {
          "descriptions": [
            {
              "cweId": "CWE-611",
              "description": "CWE-611 Improper Restriction of XML External Entity Reference",
              "lang": "en",
              "type": "CWE"
            }
          ]
        }
      ],
      "providerMetadata": {
        "dateUpdated": "2026-05-04T16:55:55.834Z",
        "orgId": "f0158376-9dc2-43b6-827c-5f631a4d8d09",
        "shortName": "apache"
      },
      "references": [
        {
          "tags": [
            "vendor-advisory"
          ],
          "url": "https://lists.apache.org/thread/r6jpt0qr9nj67gqhppqg7jxf8vsbo0w6"
        }
      ],
      "source": {
        "discovery": "EXTERNAL"
      },
      "title": "Apache OpenNLP: XXE via Dictionary Parsing in DictionaryEntryPersistor",
      "x_generator": {
        "engine": "Vulnogram 0.2.0"
      }
    }
  },
  "cveMetadata": {
    "assignerOrgId": "f0158376-9dc2-43b6-827c-5f631a4d8d09",
    "assignerShortName": "apache",
    "cveId": "CVE-2026-40682",
    "datePublished": "2026-05-04T16:55:55.834Z",
    "dateReserved": "2026-04-14T17:21:09.189Z",
    "dateUpdated": "2026-05-05T15:02:14.483Z",
    "state": "PUBLISHED"
  },
  "dataType": "CVE_RECORD",
  "dataVersion": "5.2"
}

CVE-2017-12620 (GCVE-0-2017-12620)

Vulnerability from nvd – Published: 2017-10-02 14:00 – Updated: 2024-09-16 19:15
Summary
When loading models or dictionaries that contain XML, it is possible to perform an XXE attack. Since Apache OpenNLP is a library, this only affects applications that load models or dictionaries from untrusted sources. Versions 1.5.0 to 1.5.3, 1.6.0, 1.7.0 to 1.7.2, and 1.8.0 to 1.8.1 of Apache OpenNLP are affected.
Severity
No CVSS data available.
CWE
  • Information Disclosure
Impacted products
Vendor: Apache Software Foundation
Product: Apache OpenNLP
Affected versions: 1.5.0 to 1.5.3; 1.6.0; 1.7.0 to 1.7.2; 1.8.0 to 1.8.1
Date Public
2017-10-02 00:00

{
  "containers": {
    "adp": [
      {
        "providerMetadata": {
          "dateUpdated": "2024-08-05T18:43:56.376Z",
          "orgId": "af854a3a-2127-422b-91ae-364da2661108",
          "shortName": "CVE"
        },
        "references": [
          {
            "tags": [
              "x_refsource_CONFIRM",
              "x_transferred"
            ],
            "url": "http://opennlp.apache.org/news/cve-2017-12620.html"
          }
        ],
        "title": "CVE Program Container"
      }
    ],
    "cna": {
      "affected": [
        {
          "product": "Apache OpenNLP",
          "vendor": "Apache Software Foundation",
          "versions": [
            {
              "status": "affected",
              "version": "1.5.0 to 1.5.3"
            },
            {
              "status": "affected",
              "version": "1.6.0"
            },
            {
              "status": "affected",
              "version": "1.7.0 to 1.7.2"
            },
            {
              "status": "affected",
              "version": "1.8.0 to 1.8.1"
            }
          ]
        }
      ],
      "datePublic": "2017-10-02T00:00:00.000Z",
      "descriptions": [
        {
          "lang": "en",
          "value": "When loading models or dictionaries that contain XML it is possible to perform an XXE attack, since Apache OpenNLP is a library, this only affects applications that load models or dictionaries from untrusted sources. The versions 1.5.0 to 1.5.3, 1.6.0, 1.7.0 to 1.7.2, 1.8.0 to 1.8.1 of Apache OpenNLP are affected."
        }
      ],
      "problemTypes": [
        {
          "descriptions": [
            {
              "description": "Information Disclosure",
              "lang": "en",
              "type": "text"
            }
          ]
        }
      ],
      "providerMetadata": {
        "dateUpdated": "2017-10-02T13:57:02.000Z",
        "orgId": "f0158376-9dc2-43b6-827c-5f631a4d8d09",
        "shortName": "apache"
      },
      "references": [
        {
          "tags": [
            "x_refsource_CONFIRM"
          ],
          "url": "http://opennlp.apache.org/news/cve-2017-12620.html"
        }
      ],
      "x_legacyV4Record": {
        "CVE_data_meta": {
          "ASSIGNER": "security@apache.org",
          "DATE_PUBLIC": "2017-10-02T00:00:00",
          "ID": "CVE-2017-12620",
          "STATE": "PUBLIC"
        },
        "affects": {
          "vendor": {
            "vendor_data": [
              {
                "product": {
                  "product_data": [
                    {
                      "product_name": "Apache OpenNLP",
                      "version": {
                        "version_data": [
                          {
                            "version_value": "1.5.0 to 1.5.3"
                          },
                          {
                            "version_value": "1.6.0"
                          },
                          {
                            "version_value": "1.7.0 to 1.7.2"
                          },
                          {
                            "version_value": "1.8.0 to 1.8.1"
                          }
                        ]
                      }
                    }
                  ]
                },
                "vendor_name": "Apache Software Foundation"
              }
            ]
          }
        },
        "data_format": "MITRE",
        "data_type": "CVE",
        "data_version": "4.0",
        "description": {
          "description_data": [
            {
              "lang": "eng",
              "value": "When loading models or dictionaries that contain XML it is possible to perform an XXE attack, since Apache OpenNLP is a library, this only affects applications that load models or dictionaries from untrusted sources. The versions 1.5.0 to 1.5.3, 1.6.0, 1.7.0 to 1.7.2, 1.8.0 to 1.8.1 of Apache OpenNLP are affected."
            }
          ]
        },
        "problemtype": {
          "problemtype_data": [
            {
              "description": [
                {
                  "lang": "eng",
                  "value": "Information Disclosure"
                }
              ]
            }
          ]
        },
        "references": {
          "reference_data": [
            {
              "name": "http://opennlp.apache.org/news/cve-2017-12620.html",
              "refsource": "CONFIRM",
              "url": "http://opennlp.apache.org/news/cve-2017-12620.html"
            }
          ]
        }
      }
    }
  },
  "cveMetadata": {
    "assignerOrgId": "f0158376-9dc2-43b6-827c-5f631a4d8d09",
    "assignerShortName": "apache",
    "cveId": "CVE-2017-12620",
    "datePublished": "2017-10-02T14:00:00.000Z",
    "dateReserved": "2017-08-07T00:00:00.000Z",
    "dateUpdated": "2024-09-16T19:15:51.072Z",
    "state": "PUBLISHED"
  },
  "dataType": "CVE_RECORD",
  "dataVersion": "5.1"
}

CVE-2026-40682 (GCVE-0-2026-40682)

Vulnerability from cvelistv5 – Published: 2026-05-04 16:55 – Updated: 2026-05-05 15:02
Title
Apache OpenNLP: XXE via Dictionary Parsing in DictionaryEntryPersistor
Summary
XML External Entity (XXE) via Unsanitized Dictionary Parsing in Apache OpenNLP DictionaryEntryPersistor

Versions Affected: before 2.5.9, before 3.0.0-M3

Description: The DictionaryEntryPersistor class initializes a static SAXParserFactory at class-load time without enabling FEATURE_SECURE_PROCESSING or disabling DTD processing. When create(InputStream, EntryInserter) is invoked, the only feature set on the XMLReader is namespace support; external entity resolution and DOCTYPE declarations remain fully enabled. An attacker who can supply a crafted dictionary file (e.g., a stop-word list or domain dictionary) containing a malicious DOCTYPE declaration can trigger local file disclosure via file:// entity references or server-side request forgery via http:// entity references during SAX parsing, before the application processes a single dictionary entry. This is inconsistent with the project's own XmlUtil.createSaxParser() helper, which correctly sets FEATURE_SECURE_PROCESSING and disallow-doctype-decl and is used by all other XML parsing paths in the codebase. The public Dictionary(InputStream) constructor delegates directly to this method and is the documented API for loading user-supplied dictionaries, making untrusted input a realistic scenario.

Mitigation: 2.x users should upgrade to 2.5.9. 3.x users should upgrade to 3.0.0-M3. Users who cannot upgrade immediately should ensure that all dictionary files are sourced from trusted origins and should consider wrapping the Dictionary(InputStream) constructor with input validation that rejects any XML containing a DOCTYPE declaration before it reaches the parser.
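For deployments that cannot upgrade yet, the interim workaround named in the mitigation (rejecting any input that contains a DOCTYPE before it reaches Dictionary(InputStream)) might be sketched as below. Everything here is an assumption made for illustration: the class DoctypeGuardSketch, the MAX_BYTES cap, and the decision to accept only ASCII-compatible encodings. A plain byte scan is a stopgap and can be evaded by exotic encodings, so it is not a substitute for the fixed release. The code uses InputStream.readNBytes(int), which requires Java 11 or later.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Hypothetical sketch class; not part of Apache OpenNLP.
public class DoctypeGuardSketch {

    // Assumed size cap so that the guard itself cannot be used to exhaust memory.
    static final int MAX_BYTES = 16 * 1024 * 1024;

    /**
     * Buffers the stream, refuses it if a DOCTYPE is present, and otherwise
     * returns a fresh stream over the same bytes for the real parser.
     */
    static InputStream rejectDoctype(InputStream in) throws IOException {
        byte[] data = in.readNBytes(MAX_BYTES + 1);
        if (data.length > MAX_BYTES) {
            throw new IOException("dictionary exceeds size limit");
        }
        for (byte b : data) {
            // NUL bytes indicate UTF-16/UTF-32, where an ASCII scan would miss the DOCTYPE.
            if (b == 0) {
                throw new IOException("only ASCII-compatible encodings are accepted");
            }
        }
        // ISO-8859-1 maps every byte to one char, so ASCII markup is preserved as-is.
        String text = new String(data, StandardCharsets.ISO_8859_1).toUpperCase(Locale.ROOT);
        if (text.contains("<!DOCTYPE")) {
            // Conservative: also rejects a DOCTYPE that appears inside a comment or CDATA.
            throw new IOException("DOCTYPE declarations are not allowed");
        }
        return new ByteArrayInputStream(data);
    }

    /** Convenience wrapper for the demo: true if the guard lets the XML through. */
    static boolean accepts(String xml) {
        try {
            rejectDoctype(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(accepts("<dictionary><entry><token>the</token></entry></dictionary>"));
        System.out.println(accepts("<!DOCTYPE d [<!ENTITY x SYSTEM \"file:///etc/hostname\">]><d>&x;</d>"));
    }
}
```

In an application, the stream returned by rejectDoctype(...) would be the one passed to the Dictionary(InputStream) constructor instead of the raw upload.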
Severity
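No CVSS data available from the CNA (Apache), which describes the severity as "moderate". The CISA-ADP container in the record below assigns a CVSS 3.1 base score of 9.1 (CRITICAL), vector CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:N.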
CWE
  • CWE-611 - Improper Restriction of XML External Entity Reference
Impacted products
Vendor: Apache Software Foundation
Product: Apache OpenNLP
Affected versions: from 0 before 2.5.9 (semver); from 3.0 before 3.0.0-M3 (semver)
Credits
Subramanian S

{
  "containers": {
    "adp": [
      {
        "providerMetadata": {
          "dateUpdated": "2026-05-04T17:36:52.681Z",
          "orgId": "af854a3a-2127-422b-91ae-364da2661108",
          "shortName": "CVE"
        },
        "references": [
          {
            "url": "http://www.openwall.com/lists/oss-security/2026/05/01/19"
          }
        ],
        "title": "CVE Program Container"
      },
      {
        "metrics": [
          {
            "cvssV3_1": {
              "attackComplexity": "LOW",
              "attackVector": "NETWORK",
              "availabilityImpact": "NONE",
              "baseScore": 9.1,
              "baseSeverity": "CRITICAL",
              "confidentialityImpact": "HIGH",
              "integrityImpact": "HIGH",
              "privilegesRequired": "NONE",
              "scope": "UNCHANGED",
              "userInteraction": "NONE",
              "vectorString": "CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:N",
              "version": "3.1"
            }
          },
          {
            "other": {
              "content": {
                "id": "CVE-2026-40682",
                "options": [
                  {
                    "Exploitation": "none"
                  },
                  {
                    "Automatable": "yes"
                  },
                  {
                    "Technical Impact": "total"
                  }
                ],
                "role": "CISA Coordinator",
                "timestamp": "2026-05-05T15:01:49.614474Z",
                "version": "2.0.3"
              },
              "type": "ssvc"
            }
          }
        ],
        "providerMetadata": {
          "dateUpdated": "2026-05-05T15:02:14.483Z",
          "orgId": "134c704f-9b21-4f2e-91b3-4a467353bcc0",
          "shortName": "CISA-ADP"
        },
        "title": "CISA ADP Vulnrichment"
      }
    ],
    "cna": {
      "affected": [
        {
          "collectionURL": "https://repo.maven.apache.org/maven2/",
          "defaultStatus": "unaffected",
          "packageName": "org.apache.opennlp:opennlp-tools",
          "product": "Apache OpenNLP",
          "vendor": "Apache Software Foundation",
          "versions": [
            {
              "lessThan": "2.5.9",
              "status": "affected",
              "version": "0",
              "versionType": "semver"
            },
            {
              "lessThan": "3.0.0-M3",
              "status": "affected",
              "version": "3.0",
              "versionType": "semver"
            }
          ]
        }
      ],
      "credits": [
        {
          "lang": "en",
          "type": "finder",
          "value": "Subramanian S"
        }
      ],
      "descriptions": [
        {
          "lang": "en",
          "supportingMedia": [
            {
              "base64": false,
              "type": "text/html",
              "value": "\u003cp\u003e\u003cstrong\u003eXML External Entity (XXE) via Unsanitized Dictionary Parsing in Apache OpenNLP DictionaryEntryPersistor\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eVersions Affected:\u003c/strong\u003e before 2.5.9, before 3.0.0-M3\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eDescription:\u003c/strong\u003e The \u003ccode\u003eDictionaryEntryPersistor\u003c/code\u003e class initializes a static \u003ccode\u003eSAXParserFactory\u003c/code\u003e at class-load time without enabling \u003ccode\u003eFEATURE_SECURE_PROCESSING\u003c/code\u003e or disabling DTD processing. When \u003ccode\u003ecreate(InputStream, EntryInserter)\u003c/code\u003e is invoked, the only feature set on the \u003ccode\u003eXMLReader\u003c/code\u003e is namespace support \u2014 external entity resolution and DOCTYPE declarations remain fully enabled. An attacker who can supply a crafted dictionary file (e.g., a stop-word list or domain dictionary) containing a malicious DOCTYPE declaration can trigger local file disclosure via \u003ccode\u003efile://\u003c/code\u003e entity references or server-side request forgery via \u003ccode\u003ehttp://\u003c/code\u003e entity references during SAX parsing, before the application processes a single dictionary entry. This is inconsistent with the project\u0027s own \u003ccode\u003eXmlUtil.createSaxParser()\u003c/code\u003e helper, which correctly sets \u003ccode\u003eFEATURE_SECURE_PROCESSING\u003c/code\u003e and \u003ccode\u003edisallow-doctype-decl\u003c/code\u003e and is used by all other XML parsing paths in the codebase. The public \u003ccode\u003eDictionary(InputStream)\u003c/code\u003e constructor delegates directly to this method and is the documented API for loading user-supplied dictionaries, making untrusted input a realistic scenario.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eMitigation:\u003c/strong\u003e 2.x users should upgrade to 2.5.9. 3.x users should upgrade to 3.0.0-M3. 
Users who cannot upgrade immediately should ensure that all dictionary files are sourced from trusted origins and should consider wrapping the \u003ccode\u003eDictionary(InputStream)\u003c/code\u003e constructor with input validation that rejects any XML containing a DOCTYPE declaration before it reaches the parser.\u003cbr\u003e\u003c/p\u003e"
            }
          ],
          "value": "XML External Entity (XXE) via Unsanitized Dictionary Parsing in Apache OpenNLP DictionaryEntryPersistor\n\n\nVersions Affected: before 2.5.9, before 3.0.0-M3\n\n\nDescription: The DictionaryEntryPersistor class initializes a static SAXParserFactory at class-load time without enabling FEATURE_SECURE_PROCESSING or disabling DTD processing. When create(InputStream, EntryInserter) is invoked, the only feature set on the XMLReader is namespace support \u2014 external entity resolution and DOCTYPE declarations remain fully enabled. An attacker who can supply a crafted dictionary file (e.g., a stop-word list or domain dictionary) containing a malicious DOCTYPE declaration can trigger local file disclosure via file:// entity references or server-side request forgery via http:// entity references during SAX parsing, before the application processes a single dictionary entry. This is inconsistent with the project\u0027s own XmlUtil.createSaxParser() helper, which correctly sets FEATURE_SECURE_PROCESSING and disallow-doctype-decl and is used by all other XML parsing paths in the codebase. The public Dictionary(InputStream) constructor delegates directly to this method and is the documented API for loading user-supplied dictionaries, making untrusted input a realistic scenario.\n\n\nMitigation: 2.x users should upgrade to 2.5.9. 3.x users should upgrade to 3.0.0-M3. Users who cannot upgrade immediately should ensure that all dictionary files are sourced from trusted origins and should consider wrapping the Dictionary(InputStream) constructor with input validation that rejects any XML containing a DOCTYPE declaration before it reaches the parser."
        }
      ],
      "metrics": [
        {
          "other": {
            "content": {
              "text": "moderate"
            },
            "type": "Textual description of severity"
          }
        }
      ],
      "problemTypes": [
        {
          "descriptions": [
            {
              "cweId": "CWE-611",
              "description": "CWE-611 Improper Restriction of XML External Entity Reference",
              "lang": "en",
              "type": "CWE"
            }
          ]
        }
      ],
      "providerMetadata": {
        "dateUpdated": "2026-05-04T16:55:55.834Z",
        "orgId": "f0158376-9dc2-43b6-827c-5f631a4d8d09",
        "shortName": "apache"
      },
      "references": [
        {
          "tags": [
            "vendor-advisory"
          ],
          "url": "https://lists.apache.org/thread/r6jpt0qr9nj67gqhppqg7jxf8vsbo0w6"
        }
      ],
      "source": {
        "discovery": "EXTERNAL"
      },
      "title": "Apache OpenNLP: XXE via Dictionary Parsing in DictionaryEntryPersistor",
      "x_generator": {
        "engine": "Vulnogram 0.2.0"
      }
    }
  },
  "cveMetadata": {
    "assignerOrgId": "f0158376-9dc2-43b6-827c-5f631a4d8d09",
    "assignerShortName": "apache",
    "cveId": "CVE-2026-40682",
    "datePublished": "2026-05-04T16:55:55.834Z",
    "dateReserved": "2026-04-14T17:21:09.189Z",
    "dateUpdated": "2026-05-05T15:02:14.483Z",
    "state": "PUBLISHED"
  },
  "dataType": "CVE_RECORD",
  "dataVersion": "5.2"
}

CVE-2026-42027 (GCVE-0-2026-42027)

Vulnerability from cvelistv5 – Published: 2026-05-04 16:43 – Updated: 2026-05-05 16:02
Title
Apache OpenNLP: Arbitrary Class Instantiation via Model Manifest in ExtensionLoader
Summary
Arbitrary Class Instantiation via Model Manifest in Apache OpenNLP ExtensionLoader

Versions Affected: before 2.5.9, before 3.0.0-M3

Description: The ExtensionLoader.instantiateExtension(Class, String) method loads a class by its fully-qualified name via Class.forName() and invokes its no-arg constructor, with the class name sourced from the manifest.properties entry of a model archive. The existing isAssignableFrom check correctly rejects classes that are not subtypes of the expected extension interface (BaseToolFactory for factory=, ArtifactSerializer for serializer-class-*), but the check runs after Class.forName() has already loaded and initialized the named class.

Class.forName() with default initialization semantics executes the target class's static initializer before returning, so an attacker who can supply a crafted model archive can cause the static initializer of any class on the classpath to run during model loading, regardless of whether that class passes the subsequent type check.

Exploitation requires a class with attacker-useful side effects in its static initializer (for example, JNDI lookup, outbound network I/O, or filesystem access) to be present on the classpath, so this is not a drop-in remote code execution; however, the attack surface grows as third-party model distribution becomes more common (community model repositories, Hugging Face-style sharing), where users routinely load model files from origins they do not control. A secondary, narrower vector affects deployments that ship legitimate BaseToolFactory or ArtifactSerializer subclasses with side-effecting no-arg constructors: a malicious manifest can name such a class and force its constructor to run during model load.

Mitigation:
* 2.x users should upgrade to 2.5.9.
* 3.x users should upgrade to 3.0.0-M3.

Note: The fix introduces a package-prefix allowlist that is consulted before Class.forName() is invoked, so the static initializer of a disallowed class is never executed. Classes under the opennlp. prefix remain permitted by default. Deployments that load models referencing factories or serializers outside opennlp.* must opt those packages in, either programmatically via ExtensionLoader.registerAllowedPackage(String) before the first model load, or by setting the OPENNLP_EXT_ALLOWED_PACKAGES system property to a comma-separated list of allowed package prefixes.

Users who cannot upgrade immediately should ensure that all model files are sourced from trusted origins and should audit their classpath for classes with side-effecting static initializers or constructors, particularly any that perform JNDI lookups, network requests, or filesystem operations during class initialization.
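The ordering problem described above, and the allowlist-before-load idea behind the fix, can be illustrated with a small self-contained sketch. It is not the actual ExtensionLoader code: the class AllowlistSketch, its ALLOWED_PREFIXES list, and the nested SideEffect class are hypothetical stand-ins. The sketch relies only on documented JDK behaviour: a class literal does not initialize a class, whereas the one-argument Class.forName(String) does.

```java
import java.util.List;

// Hypothetical sketch class; not part of Apache OpenNLP.
public class AllowlistSketch {

    // Flipped by SideEffect's static initializer, so the demo can observe initialization.
    static boolean sideEffectRan = false;

    /** Stand-in for an arbitrary classpath class with a side-effecting static initializer. */
    static class SideEffect {
        static { sideEffectRan = true; }
    }

    // Assumed default allowlist, mirroring the "opennlp." prefix named in the advisory.
    static final List<String> ALLOWED_PREFIXES = List.of("opennlp.");

    static boolean isAllowed(String className) {
        return ALLOWED_PREFIXES.stream().anyMatch(className::startsWith);
    }

    /** Checks the allowlist first, so a disallowed class is never loaded or initialized. */
    static <T> T instantiate(Class<T> expected, String className)
            throws ReflectiveOperationException {
        if (!isAllowed(className)) {
            throw new IllegalArgumentException("package not allowed: " + className);
        }
        Class<?> loaded = Class.forName(className); // runs static initializers
        if (!expected.isAssignableFrom(loaded)) {
            throw new IllegalArgumentException(className + " is not a " + expected.getName());
        }
        return expected.cast(loaded.getDeclaredConstructor().newInstance());
    }

    public static void main(String[] args) throws Exception {
        // A class literal does not trigger initialization, so this is safe to evaluate.
        String name = SideEffect.class.getName();

        try {
            instantiate(Runnable.class, name);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected before load: " + e.getMessage());
        }
        System.out.println("initializer ran after guarded call: " + sideEffectRan);

        // The vulnerable ordering: load first, type-check afterwards.
        Class<?> loaded = Class.forName(name);
        System.out.println("passes type check: " + Runnable.class.isAssignableFrom(loaded));
        System.out.println("initializer ran after Class.forName: " + sideEffectRan);
    }
}
```

An alternative that avoids initialization without an allowlist is Class.forName(name, false, loader), which defers the static initializer until first use; it would not address the secondary constructor vector the description mentions, which may be one reason an allowlist was chosen.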
Severity
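No CVSS data available in this summary view. The CISA-ADP container in the record below assigns a CVSS 3.1 base score of 9.8 (CRITICAL), vector CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H.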
CWE
  • CWE-470 - Use of Externally-Controlled Input to Select Classes or Code ('Unsafe Reflection')
Impacted products
Vendor: Apache Software Foundation
Product: Apache OpenNLP
Affected versions: from 0 before 2.5.9 (semver); from 3.0 before 3.0.0-M3 (semver)
Credits
Subramanian S

{
  "containers": {
    "adp": [
      {
        "providerMetadata": {
          "dateUpdated": "2026-05-04T17:36:56.492Z",
          "orgId": "af854a3a-2127-422b-91ae-364da2661108",
          "shortName": "CVE"
        },
        "references": [
          {
            "url": "http://www.openwall.com/lists/oss-security/2026/05/01/20"
          }
        ],
        "title": "CVE Program Container"
      },
      {
        "metrics": [
          {
            "cvssV3_1": {
              "attackComplexity": "LOW",
              "attackVector": "NETWORK",
              "availabilityImpact": "HIGH",
              "baseScore": 9.8,
              "baseSeverity": "CRITICAL",
              "confidentialityImpact": "HIGH",
              "integrityImpact": "HIGH",
              "privilegesRequired": "NONE",
              "scope": "UNCHANGED",
              "userInteraction": "NONE",
              "vectorString": "CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H",
              "version": "3.1"
            }
          },
          {
            "other": {
              "content": {
                "id": "CVE-2026-42027",
                "options": [
                  {
                    "Exploitation": "none"
                  },
                  {
                    "Automatable": "yes"
                  },
                  {
                    "Technical Impact": "total"
                  }
                ],
                "role": "CISA Coordinator",
                "timestamp": "2026-05-05T16:01:56.421468Z",
                "version": "2.0.3"
              },
              "type": "ssvc"
            }
          }
        ],
        "providerMetadata": {
          "dateUpdated": "2026-05-05T16:02:56.683Z",
          "orgId": "134c704f-9b21-4f2e-91b3-4a467353bcc0",
          "shortName": "CISA-ADP"
        },
        "title": "CISA ADP Vulnrichment"
      }
    ],
    "cna": {
      "affected": [
        {
          "collectionURL": "https://repo.maven.apache.org/maven2/",
          "defaultStatus": "unaffected",
          "packageName": "org.apache.opennlp:opennlp-tools",
          "product": "Apache OpenNLP",
          "vendor": "Apache Software Foundation",
          "versions": [
            {
              "lessThan": "2.5.9",
              "status": "affected",
              "version": "0",
              "versionType": "semver"
            },
            {
              "lessThan": "3.0.0-M3",
              "status": "affected",
              "version": "3.0",
              "versionType": "semver"
            }
          ]
        }
      ],
      "credits": [
        {
          "lang": "en",
          "type": "finder",
          "value": "Subramanian S"
        }
      ],
      "descriptions": [
        {
          "lang": "en",
          "supportingMedia": [
            {
              "base64": false,
              "type": "text/html",
              "value": "\u003cdiv\u003e\u003cdiv\u003e\u003cp\u003e\u003cb\u003eArbitrary Class Instantiation via Model Manifest in Apache OpenNLP ExtensionLoader\u003c/b\u003e\u003c/p\u003e\u003c/div\u003e\u003c/div\u003e\u003cdiv\u003e\u003cdiv\u003e\u003cp\u003e\u003cb\u003eVersions Affected: \u003c/b\u003ebefore 2.5.9, before 3.0.0-M3\u003c/p\u003e\u003c/div\u003e\u003c/div\u003e\u003cdiv\u003e\u003cdiv\u003e\u003cp\u003e\u003cb\u003eDescription:\u003c/b\u003e\u0026nbsp;\u003c/p\u003e\u003cp\u003eThe \u003ccode\u003eExtensionLoader.instantiateExtension(Class, String)\u003c/code\u003e\u0026nbsp;method loads a class by its fully-qualified name via \u003ccode\u003eClass.forName()\u003c/code\u003e\u0026nbsp;and invokes its no-arg constructor, with the class name sourced from the \u003ccode\u003emanifest.properties\u003c/code\u003e\u0026nbsp;entry of a model archive. The existing \u003ccode\u003eisAssignableFrom\u003c/code\u003e\u0026nbsp;check correctly rejects classes that are not subtypes of the expected extension interface (\u003ccode\u003eBaseToolFactory\u003c/code\u003e\u0026nbsp;for \u003ccode\u003efactory=\u003c/code\u003e, \u003ccode\u003eArtifactSerializer\u003c/code\u003e\u0026nbsp;for \u003ccode\u003eserializer-class-*\u003c/code\u003e), but the check runs \u003cem\u003e\u003cb\u003eafter\u003c/b\u003e\u003c/em\u003e\u0026nbsp;\u003ccode\u003eClass.forName()\u003c/code\u003e\u0026nbsp;has already loaded and initialized the named class. \u003c/p\u003e\u003cp\u003e\u003ccode\u003eClass.forName()\u003c/code\u003e\u0026nbsp;with default initialization semantics executes the target class\u0027s static initializer before returning, so an attacker who can supply a crafted model archive can cause the static initializer of any class on the classpath to run during model loading, regardless of whether that class passes the subsequent type check. 
\u003c/p\u003e\u003cp\u003eExploitation requires a class with attacker-useful side effects in its static initializer (for example, JNDI lookup, outbound network I/O, or filesystem access) to be \u003cb\u003epresent on the classpath\u003c/b\u003e, so this is not a drop-in remote code execution; however, the attack surface grows as third-party model distribution becomes more common (community model repositories, Hugging Face-style sharing), where users routinely load model files from origins they do not control. A secondary, narrower vector affects deployments that ship legitimate \u003ccode\u003eBaseToolFactory\u003c/code\u003e\u0026nbsp;or \u003ccode\u003eArtifactSerializer\u003c/code\u003e\u0026nbsp;subclasses with side-effecting no-arg constructors: a malicious manifest can name such a class and force its constructor to run during model load.\u003c/p\u003e\u003c/div\u003e\u003c/div\u003e\u003cdiv\u003e\u003cdiv\u003e\u003cp\u003e\u003cb\u003eMitigation:\u003c/b\u003e\u0026nbsp;\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003cul\u003e\u003cli\u003e2.x users should upgrade to 2.5.9. \u003c/li\u003e\u003cli\u003e3.x users should upgrade to 3.0.0-M3. \u003c/li\u003e\u003c/ul\u003e\u003cp\u003e\u003c/p\u003e\u003cp\u003eNote: The fix introduces a package-prefix allowlist that is consulted before \u003ccode\u003eClass.forName()\u003c/code\u003e\u0026nbsp;is invoked, so the static initializer of a disallowed class is never executed. Classes under the \u003ccode\u003eopennlp.\u003c/code\u003e\u0026nbsp;prefix remain permitted by default. 
Deployments that load models referencing factories or serializers outside \u003ccode\u003eopennlp.*\u003c/code\u003e\u0026nbsp;must opt those packages in, either programmatically via \u003ccode\u003eExtensionLoader.registerAllowedPackage(String)\u003c/code\u003e\u0026nbsp;before the first model load, or by setting the \u003ccode\u003e\u003cb\u003eOPENNLP_EXT_ALLOWED_PACKAGES\u003c/b\u003e\u003c/code\u003e\u0026nbsp;system property to a comma-separated list of allowed package prefixes. \u003c/p\u003e\u003cp\u003eUsers who cannot upgrade immediately should ensure that all model files are sourced from \u003cb\u003etrusted origins\u003c/b\u003e\u0026nbsp;and should audit their classpath for classes with side-effecting static initializers or constructors, particularly any that perform JNDI lookups, network requests, or filesystem operations during class initialization.\u003c/p\u003e\u003c/div\u003e\u003c/div\u003e\u003cbr\u003e\u003cbr\u003e"
            }
          ],
          "value": "Arbitrary Class Instantiation via Model Manifest in Apache OpenNLP ExtensionLoader\n\n\n\n\n\nVersions Affected: before 2.5.9, before 3.0.0-M3\n\n\n\n\n\nDescription:\u00a0\n\nThe ExtensionLoader.instantiateExtension(Class, String)\u00a0method loads a class by its fully-qualified name via Class.forName()\u00a0and invokes its no-arg constructor, with the class name sourced from the manifest.properties\u00a0entry of a model archive. The existing isAssignableFrom\u00a0check correctly rejects classes that are not subtypes of the expected extension interface (BaseToolFactory\u00a0for factory=, ArtifactSerializer\u00a0for serializer-class-*), but the check runs after\u00a0Class.forName()\u00a0has already loaded and initialized the named class. \n\nClass.forName()\u00a0with default initialization semantics executes the target class\u0027s static initializer before returning, so an attacker who can supply a crafted model archive can cause the static initializer of any class on the classpath to run during model loading, regardless of whether that class passes the subsequent type check. \n\nExploitation requires a class with attacker-useful side effects in its static initializer (for example, JNDI lookup, outbound network I/O, or filesystem access) to be present on the classpath, so this is not a drop-in remote code execution; however, the attack surface grows as third-party model distribution becomes more common (community model repositories, Hugging Face-style sharing), where users routinely load model files from origins they do not control. A secondary, narrower vector affects deployments that ship legitimate BaseToolFactory\u00a0or ArtifactSerializer\u00a0subclasses with side-effecting no-arg constructors: a malicious manifest can name such a class and force its constructor to run during model load.\n\n\n\n\n\nMitigation:\u00a0\n\n\n\n  *  2.x users should upgrade to 2.5.9. \n  *  3.x users should upgrade to 3.0.0-M3. 
\n\n\n\n\nNote: The fix introduces a package-prefix allowlist that is consulted before Class.forName()\u00a0is invoked, so the static initializer of a disallowed class is never executed. Classes under the opennlp.\u00a0prefix remain permitted by default. Deployments that load models referencing factories or serializers outside opennlp.*\u00a0must opt those packages in, either programmatically via ExtensionLoader.registerAllowedPackage(String)\u00a0before the first model load, or by setting the OPENNLP_EXT_ALLOWED_PACKAGES\u00a0system property to a comma-separated list of allowed package prefixes. \n\nUsers who cannot upgrade immediately should ensure that all model files are sourced from trusted origins\u00a0and should audit their classpath for classes with side-effecting static initializers or constructors, particularly any that perform JNDI lookups, network requests, or filesystem operations during class initialization."
        }
      ],
      "metrics": [
        {
          "other": {
            "content": {
              "text": "moderate"
            },
            "type": "Textual description of severity"
          }
        }
      ],
      "problemTypes": [
        {
          "descriptions": [
            {
              "cweId": "CWE-470",
              "description": "CWE-470 Use of Externally-Controlled Input to Select Classes or Code (\u0027Unsafe Reflection\u0027)",
              "lang": "en",
              "type": "CWE"
            }
          ]
        }
      ],
      "providerMetadata": {
        "dateUpdated": "2026-05-04T16:43:12.583Z",
        "orgId": "f0158376-9dc2-43b6-827c-5f631a4d8d09",
        "shortName": "apache"
      },
      "references": [
        {
          "tags": [
            "vendor-advisory"
          ],
          "url": "https://lists.apache.org/thread/ltlo4powjfc0w2w2yyl1o5tc7q1gcb2y"
        }
      ],
      "source": {
        "defect": [
          "OPENNLP-1820"
        ],
        "discovery": "UNKNOWN"
      },
      "title": "Apache OpenNLP: Arbitrary Class Instantiation via Model Manifest in ExtensionLoader",
      "x_generator": {
        "engine": "Vulnogram 0.2.0"
      }
    }
  },
  "cveMetadata": {
    "assignerOrgId": "f0158376-9dc2-43b6-827c-5f631a4d8d09",
    "assignerShortName": "apache",
    "cveId": "CVE-2026-42027",
    "datePublished": "2026-05-04T16:43:12.583Z",
    "dateReserved": "2026-04-23T14:21:25.317Z",
    "dateUpdated": "2026-05-05T16:02:56.683Z",
    "state": "PUBLISHED"
  },
  "dataType": "CVE_RECORD",
  "dataVersion": "5.2"
}
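
The CVE-2026-42027 record above says the fix consults a package-prefix allowlist before `Class.forName()` is invoked, so a disallowed class never has its static initializer executed. The following is a minimal sketch of that ordering, not the OpenNLP implementation: the class `SafeExtensionLoaderSketch` and its methods `allowPackage`, `isAllowed`, and `instantiate` are hypothetical names chosen for illustration. Loading with `initialize=false` is an extra defensive step assumed here; the advisory does not state that the fix does this.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.concurrent.CopyOnWriteArraySet;

/** Hypothetical sketch; not the OpenNLP ExtensionLoader code. */
final class SafeExtensionLoaderSketch {

    // Assumption: only the "opennlp." prefix is trusted by default, as the advisory describes.
    private static final Set<String> ALLOWED_PREFIXES =
            new CopyOnWriteArraySet<>(Arrays.asList("opennlp."));

    private SafeExtensionLoaderSketch() {}

    /** Opts an additional package prefix in; a trailing dot is added if missing. */
    static void allowPackage(String prefix) {
        ALLOWED_PREFIXES.add(prefix.endsWith(".") ? prefix : prefix + ".");
    }

    /** Pure string check, so it can run before any class is loaded. */
    static boolean isAllowed(String className) {
        for (String prefix : ALLOWED_PREFIXES) {
            if (className.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    static <T> T instantiate(Class<T> expectedType, String className) {
        // 1. Reject by name first: nothing has been loaded or initialized yet.
        if (!isAllowed(className)) {
            throw new IllegalArgumentException("Class is not in an allowed package: " + className);
        }
        try {
            // 2. Load without initialization, so static initializers do not run here.
            Class<?> candidate = Class.forName(
                    className, false, SafeExtensionLoaderSketch.class.getClassLoader());
            // 3. Type check before any code of the candidate class executes.
            if (!expectedType.isAssignableFrom(candidate)) {
                throw new IllegalArgumentException(
                        className + " is not a subtype of " + expectedType.getName());
            }
            // 4. Only now instantiate; this step initializes the class and runs its constructor.
            return expectedType.cast(candidate.getDeclaredConstructor().newInstance());
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("Cannot instantiate " + className, e);
        }
    }
}
```

With this ordering, a manifest entry naming a class outside the allowed prefixes fails at step 1, and an allowed class of the wrong type fails at step 3 without its static initializer having run. The narrower vector the advisory mentions (legitimate subclasses with side-effecting no-arg constructors) is not addressed by this sketch, because step 4 still runs the constructor of any class that passes both checks.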

CVE-2026-42440 (GCVE-0-2026-42440)

Vulnerability from cvelistv5 – Published: 2026-05-04 16:40 – Updated: 2026-05-05 16:03
Title
Apache OpenNLP: OOM DoS via Unbounded Array Allocation in AbstractModelReader
Summary
OOM Denial of Service via Unbounded Array Allocation in Apache OpenNLP AbstractModelReader
Versions Affected: before 2.5.9; before 3.0.0-M3
Description: The AbstractModelReader methods getOutcomes(), getOutcomePatterns(), and getPredicates() each read a 32-bit signed integer count field from a binary model stream and pass that value directly to an array allocation (new String[numOutcomes], new int[numOCTypes][], new String[NUM_PREDS]) without validating that the value is non-negative or within a reasonable bound. The count is therefore fully attacker-controlled when the model file originates from an untrusted source.
A crafted .bin model file in which any of these count fields is set to Integer.MAX_VALUE (or any value large enough to exhaust the available heap) triggers an OutOfMemoryError at the array allocation itself, before the corresponding label or pattern data is consumed from the stream. The error occurs very early in deserialization: for a GIS model, getOutcomes() is reached after only the model-type string, the correction constant, and the correction parameter have been read; so the attacker pays no meaningful size cost to weaponize a payload, and a single small file can crash a JVM that loads it. Any code path that deserializes a .bin model is affected, including direct use of GenericModelReader and any higher-level component that delegates to it during model load.
The practical impact is denial of service against processes that load model files from untrusted or semi-trusted origins.
Mitigation:
  * 2.x users should upgrade to 2.5.9.
  * 3.x users should upgrade to 3.0.0-M3.
Note: The fix introduces an upper bound on each of the three count fields, checked before array allocation; counts that are negative or exceed the bound cause an IllegalArgumentException to be thrown and the read to fail fast with no large allocation. The default bound is 10,000,000, which is well above the entry counts of legitimate OpenNLP models but far below any value that would threaten heap exhaustion. Deployments that legitimately need to load models with more entries than the default can raise the limit at JVM startup by setting the OPENNLP_MAX_ENTRIES system property to the desired positive integer (e.g. -DOPENNLP_MAX_ENTRIES=50000000); invalid or non-positive values fall back to the default.
Users who cannot upgrade immediately should treat all .bin model files as untrusted input unless their provenance is verified, and should avoid loading models supplied by end users or fetched from third-party repositories without integrity checks.
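
The bound check described in the note can be sketched as follows. This is an illustration under stated assumptions, not the patched AbstractModelReader: the class `BoundedCountReaderSketch` and its methods are hypothetical, and reading labels with `readUTF()` is assumed for the example. The default of 10,000,000, the `OPENNLP_MAX_ENTRIES` property name, and the fallback for invalid or non-positive values are taken from the advisory text.

```java
import java.io.DataInputStream;
import java.io.IOException;

/** Hypothetical sketch; not the patched OpenNLP reader. */
final class BoundedCountReaderSketch {

    // Default upper bound stated in the advisory.
    static final int DEFAULT_MAX_ENTRIES = 10_000_000;

    private BoundedCountReaderSketch() {}

    /** Resolves the limit; invalid or non-positive property values fall back to the default. */
    static int maxEntries() {
        String raw = System.getProperty("OPENNLP_MAX_ENTRIES");
        if (raw != null) {
            try {
                int value = Integer.parseInt(raw.trim());
                if (value > 0) {
                    return value;
                }
            } catch (NumberFormatException ignored) {
                // Not an integer: use the default below.
            }
        }
        return DEFAULT_MAX_ENTRIES;
    }

    /** Reads a 32-bit count and validates it before the caller allocates anything. */
    static int readCount(DataInputStream in, String fieldName) throws IOException {
        int count = in.readInt();
        int max = maxEntries();
        if (count < 0 || count > max) {
            throw new IllegalArgumentException(
                    fieldName + " out of range: " + count + " (allowed 0.." + max + ")");
        }
        return count;
    }

    /** Example caller: the array is allocated only after the count has been validated. */
    static String[] readLabels(DataInputStream in) throws IOException {
        int n = readCount(in, "numOutcomes");
        String[] labels = new String[n];
        for (int i = 0; i < n; i++) {
            labels[i] = in.readUTF();
        }
        return labels;
    }
}
```

A four-byte stream carrying `Integer.MAX_VALUE` or `-1` as the count now fails in `readCount` with an `IllegalArgumentException` instead of reaching `new String[n]`.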
Severity
No CVSS data available.
CWE
  • CWE-789 - Memory Allocation with Excessive Size Value
Assigner
References
Impacted products
Vendor | Product | Version
Apache Software Foundation | Apache OpenNLP | Affected: 0, < 2.5.9 (semver)
Apache Software Foundation | Apache OpenNLP | Affected: 3.0, < 3.0.0-M3 (semver)
Credits
Subramanian S

{
  "containers": {
    "adp": [
      {
        "providerMetadata": {
          "dateUpdated": "2026-05-04T17:37:00.275Z",
          "orgId": "af854a3a-2127-422b-91ae-364da2661108",
          "shortName": "CVE"
        },
        "references": [
          {
            "url": "http://www.openwall.com/lists/oss-security/2026/05/01/21"
          }
        ],
        "title": "CVE Program Container"
      },
      {
        "metrics": [
          {
            "cvssV3_1": {
              "attackComplexity": "LOW",
              "attackVector": "NETWORK",
              "availabilityImpact": "HIGH",
              "baseScore": 7.5,
              "baseSeverity": "HIGH",
              "confidentialityImpact": "NONE",
              "integrityImpact": "NONE",
              "privilegesRequired": "NONE",
              "scope": "UNCHANGED",
              "userInteraction": "NONE",
              "vectorString": "CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:H",
              "version": "3.1"
            }
          },
          {
            "other": {
              "content": {
                "id": "CVE-2026-42440",
                "options": [
                  {
                    "Exploitation": "none"
                  },
                  {
                    "Automatable": "yes"
                  },
                  {
                    "Technical Impact": "partial"
                  }
                ],
                "role": "CISA Coordinator",
                "timestamp": "2026-05-05T16:00:26.146388Z",
                "version": "2.0.3"
              },
              "type": "ssvc"
            }
          }
        ],
        "providerMetadata": {
          "dateUpdated": "2026-05-05T16:03:03.237Z",
          "orgId": "134c704f-9b21-4f2e-91b3-4a467353bcc0",
          "shortName": "CISA-ADP"
        },
        "title": "CISA ADP Vulnrichment"
      }
    ],
    "cna": {
      "affected": [
        {
          "collectionURL": "https://repo.maven.apache.org/maven2",
          "defaultStatus": "unaffected",
          "packageName": "org.apache.opennlp:opennlp-tools",
          "product": "Apache OpenNLP",
          "vendor": "Apache Software Foundation",
          "versions": [
            {
              "lessThan": "2.5.9",
              "status": "affected",
              "version": "0",
              "versionType": "semver"
            },
            {
              "lessThan": "3.0.0-M3",
              "status": "affected",
              "version": "3.0",
              "versionType": "semver"
            }
          ]
        }
      ],
      "credits": [
        {
          "lang": "en",
          "type": "finder",
          "value": "Subramanian S"
        }
      ],
      "descriptions": [
        {
          "lang": "en",
          "supportingMedia": [
            {
              "base64": false,
              "type": "text/html",
              "value": "\u003cp\u003e\u003cb\u003eOOM Denial of Service via Unbounded Array Allocation in Apache OpenNLP AbstractModelReader\u0026nbsp;\u003c/b\u003e\u003c/p\u003e\u003cp\u003e\u003cb\u003eVersions Affected:\u003c/b\u003e\u0026nbsp;\u003c/p\u003e\u003cp\u003ebefore 2.5.9\u003c/p\u003e\u003cp\u003ebefore 3.0.0-M3\u0026nbsp;\u003c/p\u003e\u003cp\u003e\u003cb\u003eDescription:\u003c/b\u003e\u003c/p\u003e\n\u003cp\u003eThe \u003ccode\u003eAbstractModelReader\u003c/code\u003e methods \u003ccode\u003egetOutcomes()\u003c/code\u003e, \u003ccode\u003egetOutcomePatterns()\u003c/code\u003e, and \u003ccode\u003egetPredicates()\u003c/code\u003e each read a 32-bit signed integer count field from a binary model stream and pass that value directly to an array allocation (\u003ccode\u003enew String[numOutcomes]\u003c/code\u003e, \u003ccode\u003enew int[numOCTypes][]\u003c/code\u003e, \u003ccode\u003enew String[NUM_PREDS]\u003c/code\u003e) without validating that the value is non-negative or within a reasonable bound. The count is therefore fully attacker-controlled when the model file originates from an untrusted source.\u003c/p\u003e\n\u003cp\u003eA crafted \u003ccode\u003e.bin\u003c/code\u003e model file in which any of these count fields is set to \u003ccode\u003eInteger.MAX_VALUE\u003c/code\u003e (or any value large enough to exhaust the available heap) triggers an \u003ccode\u003eOutOfMemoryError\u003c/code\u003e at the array allocation itself, before the corresponding label or pattern data is consumed from the stream. The error occurs very early in deserialization: for a GIS model, \u003ccode\u003egetOutcomes()\u003c/code\u003e is reached after only the model-type string, the correction constant, and the correction parameter have been read; so the attacker pays no meaningful size cost to weaponize a payload, and a single small file can crash a JVM that loads it. 
Any code path that deserializes a \u003ccode\u003e.bin\u003c/code\u003e model is affected, including direct use of \u003ccode\u003eGenericModelReader\u003c/code\u003e and any higher-level component that delegates to it during model load.\u003c/p\u003e\n\u003cp\u003eThe practical impact is denial of service against processes that load model files from untrusted or semi-trusted origins.\u0026nbsp;\u0026nbsp;\u003c/p\u003e\n\u003cp\u003e\u003cb\u003eMitigation:\u003c/b\u003e\u003c/p\u003e\n\u003cul\u003e\n\u003cli\u003e2.x users should upgrade to 2.5.9.\u003c/li\u003e\n\u003cli\u003e3.x users should upgrade to 3.0.0-M3.\u003c/li\u003e\n\u003c/ul\u003e\n\u003cp\u003e\u003cb\u003eNote:\u003c/b\u003e The fix introduces an upper bound on each of the three count fields, checked before array allocation; counts that are negative or exceed the bound cause an \u003ccode\u003eIllegalArgumentException\u003c/code\u003e to be thrown and the read to fail fast with no large allocation. The default bound is 10,000,000, which is well above the entry counts of legitimate OpenNLP models but far below any value that would threaten heap exhaustion. Deployments that legitimately need to load models with more entries than the default can raise the limit at JVM startup by setting the \u003ccode\u003eOPENNLP_MAX_ENTRIES\u003c/code\u003e system property to the desired positive integer (e.g. \u003ccode\u003e-DOPENNLP_MAX_ENTRIES=50000000\u003c/code\u003e); invalid or non-positive values fall back to the default.\u003c/p\u003e\n\u003cp\u003eUsers who cannot upgrade immediately should treat all \u003ccode\u003e.bin\u003c/code\u003e model files as untrusted input unless their provenance is verified, and should avoid loading models supplied by end users or fetched from third-party repositories without integrity checks.\u0026nbsp;\u003c/p\u003e"
            }
          ],
          "value": "OOM Denial of Service via Unbounded Array Allocation in Apache OpenNLP AbstractModelReader\u00a0\n\nVersions Affected:\u00a0\n\nbefore 2.5.9\n\nbefore 3.0.0-M3\u00a0\n\nDescription:\n\n\nThe AbstractModelReader methods getOutcomes(), getOutcomePatterns(), and getPredicates() each read a 32-bit signed integer count field from a binary model stream and pass that value directly to an array allocation (new String[numOutcomes], new int[numOCTypes][], new String[NUM_PREDS]) without validating that the value is non-negative or within a reasonable bound. The count is therefore fully attacker-controlled when the model file originates from an untrusted source.\n\n\nA crafted .bin model file in which any of these count fields is set to Integer.MAX_VALUE (or any value large enough to exhaust the available heap) triggers an OutOfMemoryError at the array allocation itself, before the corresponding label or pattern data is consumed from the stream. The error occurs very early in deserialization: for a GIS model, getOutcomes() is reached after only the model-type string, the correction constant, and the correction parameter have been read; so the attacker pays no meaningful size cost to weaponize a payload, and a single small file can crash a JVM that loads it. Any code path that deserializes a .bin model is affected, including direct use of GenericModelReader and any higher-level component that delegates to it during model load.\n\n\nThe practical impact is denial of service against processes that load model files from untrusted or semi-trusted origins.\u00a0\u00a0\n\n\nMitigation:\n\n\n\n  *  2.x users should upgrade to 2.5.9.\n\n  *  3.x users should upgrade to 3.0.0-M3.\n\n\n\n\nNote: The fix introduces an upper bound on each of the three count fields, checked before array allocation; counts that are negative or exceed the bound cause an IllegalArgumentException to be thrown and the read to fail fast with no large allocation. 
The default bound is 10,000,000, which is well above the entry counts of legitimate OpenNLP models but far below any value that would threaten heap exhaustion. Deployments that legitimately need to load models with more entries than the default can raise the limit at JVM startup by setting the OPENNLP_MAX_ENTRIES system property to the desired positive integer (e.g. -DOPENNLP_MAX_ENTRIES=50000000); invalid or non-positive values fall back to the default.\n\n\nUsers who cannot upgrade immediately should treat all .bin model files as untrusted input unless their provenance is verified, and should avoid loading models supplied by end users or fetched from third-party repositories without integrity checks."
        }
      ],
      "metrics": [
        {
          "other": {
            "content": {
              "text": "moderate"
            },
            "type": "Textual description of severity"
          }
        }
      ],
      "problemTypes": [
        {
          "descriptions": [
            {
              "cweId": "CWE-789",
              "description": "CWE-789: Memory Allocation with Excessive Size Value",
              "lang": "en",
              "type": "CWE"
            }
          ]
        }
      ],
      "providerMetadata": {
        "dateUpdated": "2026-05-04T16:40:32.503Z",
        "orgId": "f0158376-9dc2-43b6-827c-5f631a4d8d09",
        "shortName": "apache"
      },
      "references": [
        {
          "tags": [
            "vendor-advisory"
          ],
          "url": "https://lists.apache.org/thread/s8xlkx1gqbxfsq48py5h6jphjvgqp1jo"
        }
      ],
      "source": {
        "defect": [
          "OPENNLP-1821"
        ],
        "discovery": "UNKNOWN"
      },
      "title": "Apache OpenNLP: OOM DoS via Unbounded Array Allocation in AbstractModelReader",
      "x_generator": {
        "engine": "Vulnogram 0.2.0"
      }
    }
  },
  "cveMetadata": {
    "assignerOrgId": "f0158376-9dc2-43b6-827c-5f631a4d8d09",
    "assignerShortName": "apache",
    "cveId": "CVE-2026-42440",
    "datePublished": "2026-05-04T16:40:32.503Z",
    "dateReserved": "2026-04-27T12:43:14.347Z",
    "dateUpdated": "2026-05-05T16:03:03.237Z",
    "state": "PUBLISHED"
  },
  "dataType": "CVE_RECORD",
  "dataVersion": "5.2"
}

CVE-2017-12620 (GCVE-0-2017-12620)

Vulnerability from cvelistv5 – Published: 2017-10-02 14:00 – Updated: 2024-09-16 19:15
Summary
When loading models or dictionaries that contain XML, it is possible to perform an XXE attack. Because Apache OpenNLP is a library, this only affects applications that load models or dictionaries from untrusted sources. Versions 1.5.0 to 1.5.3, 1.6.0, 1.7.0 to 1.7.2, and 1.8.0 to 1.8.1 of Apache OpenNLP are affected.
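
The record does not describe how the issue was fixed. As a general illustration of how XXE is usually prevented in Java, the sketch below configures a JAXP `DocumentBuilderFactory` to refuse DOCTYPE declarations. The class name `HardenedXmlParsingSketch` is hypothetical, and the `disallow-doctype-decl` feature URI is assumed to be supported by the parser in use (it is recognized by the JDK's built-in Xerces-based parser).

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

/** Hypothetical sketch of generic XXE hardening; not the OpenNLP patch. */
final class HardenedXmlParsingSketch {

    private HardenedXmlParsingSketch() {}

    static DocumentBuilder newHardenedBuilder() throws ParserConfigurationException {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // Enable the JAXP secure-processing limits.
        factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        // Refuse any DOCTYPE, which rules out external entity declarations entirely.
        factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        factory.setXIncludeAware(false);
        factory.setExpandEntityReferences(false);
        return factory.newDocumentBuilder();
    }

    /** Parses XML from a string with the hardened settings. */
    static Document parse(String xml)
            throws ParserConfigurationException, SAXException, IOException {
        byte[] bytes = xml.getBytes(StandardCharsets.UTF_8);
        return newHardenedBuilder().parse(new ByteArrayInputStream(bytes));
    }
}
```

With these settings, a document that begins with a `<!DOCTYPE ...>` declaration is rejected with a `SAXException` before any entity is resolved, while plain XML without a DOCTYPE parses normally.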
Severity
No CVSS data available.
CWE
  • Information Disclosure
Assigner
References
Impacted products
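Vendor | Product | Version
Apache Software Foundation | Apache OpenNLP | Affected: 1.5.0 to 1.5.3
Apache Software Foundation | Apache OpenNLP | Affected: 1.6.0
Apache Software Foundation | Apache OpenNLP | Affected: 1.7.0 to 1.7.2
Apache Software Foundation | Apache OpenNLP | Affected: 1.8.0 to 1.8.1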
Date Public
2017-10-02 00:00

{
  "containers": {
    "adp": [
      {
        "providerMetadata": {
          "dateUpdated": "2024-08-05T18:43:56.376Z",
          "orgId": "af854a3a-2127-422b-91ae-364da2661108",
          "shortName": "CVE"
        },
        "references": [
          {
            "tags": [
              "x_refsource_CONFIRM",
              "x_transferred"
            ],
            "url": "http://opennlp.apache.org/news/cve-2017-12620.html"
          }
        ],
        "title": "CVE Program Container"
      }
    ],
    "cna": {
      "affected": [
        {
          "product": "Apache OpenNLP",
          "vendor": "Apache Software Foundation",
          "versions": [
            {
              "status": "affected",
              "version": "1.5.0 to 1.5.3"
            },
            {
              "status": "affected",
              "version": "1.6.0"
            },
            {
              "status": "affected",
              "version": "1.7.0 to 1.7.2"
            },
            {
              "status": "affected",
              "version": "1.8.0 to 1.8.1"
            }
          ]
        }
      ],
      "datePublic": "2017-10-02T00:00:00.000Z",
      "descriptions": [
        {
          "lang": "en",
          "value": "When loading models or dictionaries that contain XML it is possible to perform an XXE attack, since Apache OpenNLP is a library, this only affects applications that load models or dictionaries from untrusted sources. The versions 1.5.0 to 1.5.3, 1.6.0, 1.7.0 to 1.7.2, 1.8.0 to 1.8.1 of Apache OpenNLP are affected."
        }
      ],
      "problemTypes": [
        {
          "descriptions": [
            {
              "description": "Information Disclosure",
              "lang": "en",
              "type": "text"
            }
          ]
        }
      ],
      "providerMetadata": {
        "dateUpdated": "2017-10-02T13:57:02.000Z",
        "orgId": "f0158376-9dc2-43b6-827c-5f631a4d8d09",
        "shortName": "apache"
      },
      "references": [
        {
          "tags": [
            "x_refsource_CONFIRM"
          ],
          "url": "http://opennlp.apache.org/news/cve-2017-12620.html"
        }
      ],
      "x_legacyV4Record": {
        "CVE_data_meta": {
          "ASSIGNER": "security@apache.org",
          "DATE_PUBLIC": "2017-10-02T00:00:00",
          "ID": "CVE-2017-12620",
          "STATE": "PUBLIC"
        },
        "affects": {
          "vendor": {
            "vendor_data": [
              {
                "product": {
                  "product_data": [
                    {
                      "product_name": "Apache OpenNLP",
                      "version": {
                        "version_data": [
                          {
                            "version_value": "1.5.0 to 1.5.3"
                          },
                          {
                            "version_value": "1.6.0"
                          },
                          {
                            "version_value": "1.7.0 to 1.7.2"
                          },
                          {
                            "version_value": "1.8.0 to 1.8.1"
                          }
                        ]
                      }
                    }
                  ]
                },
                "vendor_name": "Apache Software Foundation"
              }
            ]
          }
        },
        "data_format": "MITRE",
        "data_type": "CVE",
        "data_version": "4.0",
        "description": {
          "description_data": [
            {
              "lang": "eng",
              "value": "When loading models or dictionaries that contain XML it is possible to perform an XXE attack, since Apache OpenNLP is a library, this only affects applications that load models or dictionaries from untrusted sources. The versions 1.5.0 to 1.5.3, 1.6.0, 1.7.0 to 1.7.2, 1.8.0 to 1.8.1 of Apache OpenNLP are affected."
            }
          ]
        },
        "problemtype": {
          "problemtype_data": [
            {
              "description": [
                {
                  "lang": "eng",
                  "value": "Information Disclosure"
                }
              ]
            }
          ]
        },
        "references": {
          "reference_data": [
            {
              "name": "http://opennlp.apache.org/news/cve-2017-12620.html",
              "refsource": "CONFIRM",
              "url": "http://opennlp.apache.org/news/cve-2017-12620.html"
            }
          ]
        }
      }
    }
  },
  "cveMetadata": {
    "assignerOrgId": "f0158376-9dc2-43b6-827c-5f631a4d8d09",
    "assignerShortName": "apache",
    "cveId": "CVE-2017-12620",
    "datePublished": "2017-10-02T14:00:00.000Z",
    "dateReserved": "2017-08-07T00:00:00.000Z",
    "dateUpdated": "2024-09-16T19:15:51.072Z",
    "state": "PUBLISHED"
  },
  "dataType": "CVE_RECORD",
  "dataVersion": "5.1"
}