#36991: LookupError crash (HTTP 500) in parse_header_parameters() when
Content-Type header contains RFC 2231 parameter with invalid encoding name
-------------------------------+-----------------------------------------
Reporter: claok                | Owner: Dinesh Thumma
Type: Bug                      | Status: assigned
Component: HTTP handling       | Version: 5.1
Severity: Normal               | Resolution:
Keywords:                      | Triage Stage: Accepted
Has patch: 1                   | Needs documentation: 0
Needs tests: 0                 | Patch needs improvement: 1
Easy pickings: 0               | UI/UX: 0
-------------------------------+-----------------------------------------
Changes (by Jake Howard):
* needs_better_patch: 0 => 1
Old description:
> **Component:** HTTP handling
> **Type:** Bug
> **Version:** 5.1 (also affects 4.2, 5.2, main)
> **Severity:** Normal
> **Keywords:** parse_header_parameters, Content-Type, LookupError, urllib,
> unquote
>
> -------------------------
>
> **Description:**
>
> parse_header_parameters() in django/utils/http.py crashes with an
> unhandled LookupError when it receives a Content-Type header containing
> an RFC 2231 encoded parameter (e.g. charset*=) where the encoding portion
> is an invalid codec name. This causes Django's WSGI request
> initialization to raise an uncaught exception, resulting in HTTP 500
> instead of HTTP 400.
>
> **Security note:** This crash can be triggered by any unauthenticated
> request. The crash occurs inside WSGIRequest.__init__() during WSGI
> request construction — before Django processes the Authorization header,
> before authentication middleware runs, and before any view-level access
> control is evaluated. No valid credentials are required to trigger the
> 500 response.
>
> **Minimal reproduction:**
>
> Request:
>
> {{{
> GET /api/v1/ HTTP/2
> Host: host.com
> Content-Type: ;*=''%
> }}}
>
> {{{
> from django.utils.http import parse_header_parameters
> # The value needs a percent-escape; unquote() returns early without one.
> parse_header_parameters("text/plain; charset*=BOGUS''%76alue")
> # → LookupError: unknown encoding: BOGUS
> }}}
>
> **Full traceback (from production, Python 3.13, Django 5.1.x):**
>
> {{{
>   File "django/core/handlers/wsgi.py", line 73, in __init__
>     self._set_content_type_params(environ)
>   File "django/http/request.py", line 102, in _set_content_type_params
>     self.content_type, self.content_params = parse_header_parameters(
>         meta.get("CONTENT_TYPE", "")
>     )
>   File "django/utils/http.py", line 356, in parse_header_parameters
>     value = unquote(value, encoding=encoding)
>   File "urllib/parse.py", line 712, in unquote
>     return ''.join(_generate_unquoted_parts(string, encoding, errors))
>   File "urllib/parse.py", line 688, in _generate_unquoted_parts
>     yield _unquote_impl(ascii_match[1]).decode(encoding, errors)
> LookupError: unknown encoding: <garbage value from Content-Type header>
> }}}
>
> **Root cause:**
>
> In parse_header_parameters(), when a parameter name ends with * and the
> value contains exactly 2 single quotes, Django treats it as an RFC 2231
> encoded parameter and extracts the encoding name from the value before
> passing it to urllib.parse.unquote():
>
> {{{
> if has_encoding:
>     encoding, lang, value = value.split("'")
>     value = unquote(value, encoding=encoding)  # no validation of 'encoding'
> }}}
>
> If encoding is not a valid Python codec name and the value contains at
> least one percent sign (without one, unquote() returns the string without
> decoding it), bytes.decode(encoding) inside urllib.parse.unquote() raises
> LookupError. This is not caught anywhere in the call stack. Since the
> crash happens inside WSGIRequest.__init__(), no Django middleware or DRF
> parser can intercept it.
>
> **Expected behavior:**
>
> Invalid encoding names in RFC 2231 Content-Type parameters should result
> in an HTTP 400 Bad Request, not an HTTP 500 Internal Server Error.
>
> **Proposed fix:**
>
> Wrap the unquote() call in a try/except (LookupError, UnicodeDecodeError)
> and raise ValueError (which callers already handle) or
> django.core.exceptions.BadRequest:
>
> {{{
> if has_encoding:
>     encoding, lang, value = value.split("'")
>     try:
>         value = unquote(value, encoding=encoding)
>     except (LookupError, UnicodeDecodeError):
>         raise ValueError(
>             f"Invalid encoding '{encoding}' in Content-Type parameter."
>         )
> }}}
>
> **Notes:**
>
> - This code area was reviewed following ticket #35440 (security report,
> concluded non-security). The rewrite using email.Message was attempted
> and reverted in #36520 due to performance regression. Neither addressed
> this specific LookupError path.
> - urllib.parse.unquote() is behaving correctly — the bug is that Django
> passes an unvalidated, user-controlled encoding name to it.
> - Discoverable via API fuzzing tools (e.g. Mayhem4API).
>
> -------------------------
New description:
parse_header_parameters() in django/utils/http.py crashes with an
unhandled LookupError when it receives a Content-Type header containing an
RFC 2231 encoded parameter (e.g. charset*=) where the encoding portion is
an invalid codec name. This causes Django's WSGI request initialization to
raise an uncaught exception, resulting in HTTP 500 instead of HTTP 400.

**Security note:** This crash can be triggered by any unauthenticated
request. The crash occurs inside WSGIRequest.__init__() during WSGI
request construction: before Django processes the Authorization header,
before authentication middleware runs, and before any view-level access
control is evaluated. No valid credentials are required to trigger the 500
response.

**Minimal reproduction:**

Request:

{{{
GET /api/v1/ HTTP/2
Host: host.com
Content-Type: ;*=''%
}}}

Python:

{{{
from django.utils.http import parse_header_parameters
# The value needs a percent-escape; unquote() returns early without one.
parse_header_parameters("text/plain; charset*=BOGUS''%76alue")
# → LookupError: unknown encoding: BOGUS
}}}
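
The underlying behaviour can be checked without Django, using only urllib.parse.unquote(). This is a minimal sketch; the helper name raises_lookup_error is hypothetical and not part of Django or the standard library. It also illustrates that unquote() only attempts to decode when the value contains a percent sign:

```python
from urllib.parse import unquote


def raises_lookup_error(value, encoding):
    """Return True if unquote() raises LookupError for this value/encoding."""
    try:
        unquote(value, encoding=encoding)
    except LookupError:
        return True
    return False


# A percent-escape forces unquote() to decode bytes with the given codec.
print(raises_lookup_error("%41", "BOGUS"))    # True
# Without a percent sign, unquote() returns the input without decoding it.
print(raises_lookup_error("value", "BOGUS"))  # False
# A valid codec name decodes normally.
print(raises_lookup_error("%41", "utf-8"))    # False
```

The second call suggests that a reproduction string needs at least one percent sign in the value portion to reach the failing decode.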

**Full traceback (from production, Python 3.13, Django 5.1.x):**

{{{
  File "django/core/handlers/wsgi.py", line 73, in __init__
    self._set_content_type_params(environ)
  File "django/http/request.py", line 102, in _set_content_type_params
    self.content_type, self.content_params = parse_header_parameters(
        meta.get("CONTENT_TYPE", "")
    )
  File "django/utils/http.py", line 356, in parse_header_parameters
    value = unquote(value, encoding=encoding)
  File "urllib/parse.py", line 712, in unquote
    return ''.join(_generate_unquoted_parts(string, encoding, errors))
  File "urllib/parse.py", line 688, in _generate_unquoted_parts
    yield _unquote_impl(ascii_match[1]).decode(encoding, errors)
LookupError: unknown encoding: <garbage value from Content-Type header>
}}}

**Root cause:**

In parse_header_parameters(), when a parameter name ends with * and the
value contains exactly 2 single quotes, Django treats it as an RFC 2231
encoded parameter and extracts the encoding name from the value before
passing it to urllib.parse.unquote():

{{{
if has_encoding:
    encoding, lang, value = value.split("'")
    value = unquote(value, encoding=encoding)  # no validation of 'encoding'
}}}

If encoding is not a valid Python codec name and the value contains at
least one percent sign (without one, unquote() returns the string without
decoding it), bytes.decode(encoding) inside urllib.parse.unquote() raises
LookupError. This is not caught anywhere in the call stack. Since the
crash happens inside WSGIRequest.__init__(), no Django middleware or DRF
parser can intercept it.
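
To illustrate how the codec name ends up under the client's control, the following sketch mimics the split described in the root cause. It is a simplified stand-in, not Django's implementation; the function name split_extended_parameter and its return shape are assumptions made for illustration:

```python
def split_extended_parameter(param):
    """Split a single 'name*=charset'lang'value' header parameter.

    Returns (name, encoding, lang, value). For parameters that are not
    RFC 2231 extended values, encoding and lang are None.
    """
    name, _, value = param.partition("=")
    name = name.strip().lower()
    value = value.strip()
    # Trailing '*' plus exactly two single quotes marks an extended value.
    if name.endswith("*") and param.count("'") == 2:
        encoding, lang, text = value.split("'")
        return name[:-1], encoding, lang, text
    return name, None, None, value


# The encoding field is whatever the client sent, including an empty string.
print(split_extended_parameter("charset*=BOGUS''%41"))
print(split_extended_parameter("*=''%"))
print(split_extended_parameter("charset=utf-8"))
```

In this sketch the second element of the tuple is taken verbatim from the header, which is the value that would be handed to unquote() as the codec name.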

**Expected behavior:**

Invalid encoding names in RFC 2231 Content-Type parameters should result
in an HTTP 400 Bad Request, not an HTTP 500 Internal Server Error.

**Proposed fix:**

Wrap the unquote() call in a try/except (LookupError, UnicodeDecodeError)
and raise ValueError (which callers already handle) or
django.core.exceptions.BadRequest:

{{{
if has_encoding:
    encoding, lang, value = value.split("'")
    try:
        value = unquote(value, encoding=encoding)
    except (LookupError, UnicodeDecodeError):
        raise ValueError(
            f"Invalid encoding '{encoding}' in Content-Type parameter."
        )
}}}
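
A self-contained sketch of the same idea, outside Django, might look like the following. The helper decode_extended_value is hypothetical; it assumes the input has already been identified as an RFC 2231 extended value with exactly two single quotes, and it is not the patch under review:

```python
from urllib.parse import unquote


def decode_extended_value(raw):
    """Decode "charset'lang'percent-encoded-text" or raise ValueError."""
    encoding, _lang, text = raw.split("'")
    try:
        return unquote(text, encoding=encoding)
    except (LookupError, UnicodeDecodeError):
        # Convert codec failures into an error type a caller can map to 400.
        raise ValueError(
            f"Invalid encoding {encoding!r} in Content-Type parameter."
        ) from None


print(decode_extended_value("UTF-8''caf%C3%A9"))

try:
    decode_extended_value("BOGUS''%41")
except ValueError as exc:
    print(exc)
```

Catching the exception around the unquote() call, rather than pre-validating the codec name, also covers names that exist in Python's codec registry but are not text encodings.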

**Notes:**

- This code area was reviewed following ticket #35440 (security report,
  concluded non-security). The rewrite using email.Message was attempted and
  reverted in #36520 due to a performance regression. Neither addressed this
  specific LookupError path.
- urllib.parse.unquote() is behaving correctly; the bug is that Django
  passes an unvalidated, user-controlled encoding name to it.
- Discoverable via API fuzzing tools (e.g. Mayhem4API).
--
Comment:
[https://github.com/django/django/pull/20962 PR]
--
Ticket URL: <https://code.djangoproject.com/ticket/36991#comment:5>