Description
NLTK (Natural Language Toolkit) is a suite of open source Python modules, data sets, and tutorials supporting research and development in Natural Language Processing. Prior to 3.10.0-rc1, nltk.data.load() in NLTK is vulnerable to path traversal via URL-encoded path separators and traversal segments when using the nltk: URL scheme. The unsafe-path regex check is performed before url2pathname() decodes the %xx sequences (a classic decode-after-check / TOCTOU-style flaw), allowing an attacker to bypass the protection documented in NLTK's SECURITY.md and read arbitrary files from the filesystem. While literal traversal strings such as ../../../etc/passwd are correctly blocked, encoded variants such as %2fetc%2fpasswd, %2e%2e%2f..., and ..%2f..%2f slip past the regex and are subsequently decoded into a real filesystem path. This vulnerability is fixed in 3.10.0-rc1.
Published: 2026-06-22
Score: 7.5 High
EPSS: < 1% Very Low
KEV: No
Impact: n/a
Action: n/a
AI Analysis

Analysis and contextual insights are available on OpenCVE Cloud.

Remediation

No vendor fix or workaround currently provided.

Additional remediation guidance may be available on OpenCVE Cloud.

Tracking

Sign in to view the affected projects.

Advisories
Source ID Title
Github GHSA Github GHSA GHSA-p4gq-832x-fm9v Natural Language Toolkit (NLTK): URL-Encoded Path Traversal in nltk.data.load() Allows Arbitrary Local File Read
History

Fri, 26 Jun 2026 00:15:00 +0000

Type Values Removed Values Added
References
Metrics threat_severity

None

threat_severity

Important


Tue, 23 Jun 2026 02:30:00 +0000

Type Values Removed Values Added
Metrics ssvc

{'options': {'Automatable': 'yes', 'Exploitation': 'poc', 'Technical Impact': 'partial'}, 'version': '2.0.3'}


Mon, 22 Jun 2026 21:30:00 +0000

Type Values Removed Values Added
First Time appeared Nltk
Nltk nltk
Vendors & Products Nltk
Nltk nltk

Mon, 22 Jun 2026 18:45:00 +0000

Type Values Removed Values Added
Description NLTK (Natural Language Toolkit) is a suite of open source Python modules, data sets, and tutorials supporting research and development in Natural Language Processing. Prior to 3.10.0-rc1, nltk.data.load() in NLTK is vulnerable to path traversal via URL-encoded path separators and traversal segments when using the nltk: URL scheme. The unsafe-path regex check is performed before url2pathname() decodes the %xx sequences (a classic decode-after-check / TOCTOU-style flaw), allowing an attacker to bypass the protection documented in NLTK's SECURITY.md and read arbitrary files from the filesystem. While literal traversal strings such as ../../../etc/passwd are correctly blocked, encoded variants such as %2fetc%2fpasswd, %2e%2e%2f..., and ..%2f..%2f slip past the regex and are subsequently decoded into a real filesystem path. This vulnerability is fixed in 3.10.0-rc1.
Title NLTK: URL-Encoded Path Traversal in nltk.data.load() Allows Arbitrary Local File Read
Weaknesses CWE-22
References
Metrics cvssV3_1

{'score': 7.5, 'vector': 'CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:N/A:N'}


cve-icon MITRE

Status: PUBLISHED

Assigner: GitHub_M

Published:

Updated: 2026-06-30T12:05:59.360Z

Reserved: 2026-06-12T17:46:37.293Z

Link: CVE-2026-54293

cve-icon Vulnrichment

Updated: 2026-06-30T03:17:11.282Z

cve-icon NVD

No data.

cve-icon Redhat

Severity : Important

Publid Date: 2026-06-22T17:25:05Z

Links: CVE-2026-54293 - Bugzilla

cve-icon OpenCVE Enrichment

Updated: 2026-06-22T21:15:04Z

Weaknesses