Description
NLTK version 3.9.4 is vulnerable to a path traversal attack due to an incomplete fix for GitHub Issue #3504. The `_UNSAFE_NO_PROTOCOL_RE` regex in `nltk/data.py` checks for literal `../` sequences but fails to account for percent-encoded traversal sequences such as `..%2f`. The `url2pathname()` function decodes these sequences after the validation step, allowing an attacker to bypass the protection. This vulnerability enables an attacker to read arbitrary files accessible to the Python process by controlling the resource name parameter passed to `nltk.data.load()` or `nltk.data.find()`. The issue affects applications that rely on NLTK for resource loading, including NLP web applications, Jupyter notebooks, and CLI tools. The default `pathsec.ENFORCE=False` setting exacerbates the impact by not blocking the file read at the `open()` stage.
Published: 2026-06-30
Score: 7.5 High
EPSS: < 1% Very Low
KEV: No
Impact: n/a
Action: n/a
AI Analysis

Analysis and contextual insights are available on OpenCVE Cloud.

Remediation

No vendor fix or workaround currently provided.

Additional remediation guidance may be available on OpenCVE Cloud.

Tracking

Sign in to view the affected projects.

Advisories

No advisories yet.

History

Tue, 30 Jun 2026 14:30:00 +0000

Type Values Removed Values Added
Metrics ssvc

{'options': {'Automatable': 'yes', 'Exploitation': 'poc', 'Technical Impact': 'partial'}, 'version': '2.0.3'}


Tue, 30 Jun 2026 05:15:00 +0000

Type Values Removed Values Added
First Time appeared Nltk
Nltk nltk/nltk
Vendors & Products Nltk
Nltk nltk/nltk

Tue, 30 Jun 2026 01:15:00 +0000

Type Values Removed Values Added
Description NLTK version 3.9.4 is vulnerable to a path traversal attack due to an incomplete fix for GitHub Issue #3504. The `_UNSAFE_NO_PROTOCOL_RE` regex in `nltk/data.py` checks for literal `../` sequences but fails to account for percent-encoded traversal sequences such as `..%2f`. The `url2pathname()` function decodes these sequences after the validation step, allowing an attacker to bypass the protection. This vulnerability enables an attacker to read arbitrary files accessible to the Python process by controlling the resource name parameter passed to `nltk.data.load()` or `nltk.data.find()`. The issue affects applications that rely on NLTK for resource loading, including NLP web applications, Jupyter notebooks, and CLI tools. The default `pathsec.ENFORCE=False` setting exacerbates the impact by not blocking the file read at the `open()` stage.
Title Path Traversal via Percent-Encoding in nltk.data.find() and nltk.data.load()
Weaknesses CWE-22
References
Metrics cvssV3_0

{'score': 7.5, 'vector': 'CVSS:3.0/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:N/A:N'}


cve-icon MITRE

Status: PUBLISHED

Assigner: @huntr_ai

Published:

Updated: 2026-06-30T13:14:15.150Z

Reserved: 2026-06-15T06:24:30.096Z

Link: CVE-2026-12243

cve-icon Vulnrichment

Updated: 2026-06-30T13:13:52.380Z

cve-icon NVD

No data.

cve-icon Redhat

No data.

cve-icon OpenCVE Enrichment

Updated: 2026-06-30T05:00:03Z

Weaknesses