> On 7 Jan 2017, at 02:18, Tristan Seligmann <mithra...@mithrandi.net> wrote:
> 
> On Sat, 7 Jan 2017 at 03:23 Glyph Lefkowitz <gl...@twistedmatrix.com> wrote:
> 
> > Maybe we should support unicode for the body as well.  We can set the charset 
> > in the mime-type and everything so that it will be properly intelligible by 
> > the server, which doesn't happen if the user manually encodes like this.
> 
> Oh, forgot to comment on this point; in the specific case of JSON, it isn't 
> necessary to specify UTF-8 in Content-Type[1], but for HTML or XML it's a 
> pretty good idea. However, I'm not sure if it's possible to modify 
> Content-Type in a generic fashion to make this sort of thing work; for 
> example, "Content-Type: application/octet-stream; charset=UTF-8" is nonsense. 
> I'll defer to some HTTP experts here ;)

This is really not simple, because many MIME types do not define a charset 
parameter. In the case of JSON, it’s not merely unnecessary to specify UTF-8 
in Content-Type: the standard explicitly defines no charset parameter for the 
JSON content type[0]:

> Note:  No "charset" parameter is defined for this registration. Adding one 
> really has no effect on compliant recipients.

Strictly speaking, a fully compliant implementation would not emit charset 
details for content types that have no charset registration. That is pretty 
tricky to do, since it requires knowing which media types define the 
parameter. Knowing that, it’s probably best to YOLO your way through, or 
forbid unicode in bodies.
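
To illustrate why the strict approach is tricky, here is a minimal Python 
sketch that only emits a charset for media types known to define one. The 
names encode_body and CHARSET_AWARE_TYPES are hypothetical and not part of 
Twisted or any other library, the function assumes it is given a bare media 
type with no parameters, and the allowlist is deliberately incomplete:

```python
# Hypothetical allowlist of media types that define a "charset" parameter.
# Illustrative only: a real implementation would need a far larger list
# and would have to keep it up to date.
CHARSET_AWARE_TYPES = {"text/plain", "text/html", "text/xml", "application/xml"}


def encode_body(body, media_type):
    """Return (body_bytes, content_type_header) for a request body.

    Assumes media_type is a bare type such as "text/html", with no
    parameters already attached.
    """
    if isinstance(body, bytes):
        # The caller already encoded the body: pass everything through.
        return body, media_type
    encoded = body.encode("utf-8")
    if media_type.lower() in CHARSET_AWARE_TYPES:
        # This type defines charset, so advertise the encoding we used.
        return encoded, media_type + "; charset=utf-8"
    # No charset registration (e.g. application/json): emit the type as-is.
    return encoded, media_type
```

Every media type missing from that set silently gets no charset, which is 
the maintenance burden that makes the strict option unattractive.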

Cory

[0]: https://tools.ietf.org/html/rfc7159#section-11
_______________________________________________
Twisted-Python mailing list
Twisted-Python@twistedmatrix.com
http://twistedmatrix.com/cgi-bin/mailman/listinfo/twisted-python
