> On 7 Jan 2017, at 02:18, Tristan Seligmann <mithra...@mithrandi.net> wrote:
>
> On Sat, 7 Jan 2017 at 03:23 Glyph Lefkowitz <gl...@twistedmatrix.com> wrote:
>
> Maybe we should support unicode for the body as well. We can set the charset
> in the mime-type and everything so that it will be properly intelligible by
> the server, which doesn't happen if the user manually encodes like this.
>
> Oh, forgot to comment on this point; in the specific case of JSON, it isn't
> necessary to specify UTF-8 in Content-Type[1], but for HTML or XML it's a
> pretty good idea. However, I'm not sure if it's possible to modify
> Content-Type in a generic fashion to make this sort of thing work; for
> example, "Content-Type: application/octet-stream; charset=UTF-8" is nonsense.
> I'll defer to some HTTP experts here ;)

This is really not simple, because many MIME types do not define a charset parameter. In the case of JSON, it’s not just unnecessary to specify UTF-8 in Content-Type: the standard explicitly defines no charset parameter for the JSON content type[0]:

> Note: No "charset" parameter is defined for this registration. Adding one
> really has no effect on compliant recipients.

Strictly, a fully compliant implementation would not emit a charset for content types that have no charset registration. Doing that generically is pretty tricky. Knowing that, it’s probably best to either YOLO your way through or forbid unicode in bodies.

Cory

[0]: https://tools.ietf.org/html/rfc7159#section-11
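
For illustration only, here is a minimal Python sketch of the "strictly compliant" option: encode a unicode body and add a charset only for media types that are assumed to define one. The helper name `encode_text_body` and the `CHARSET_AWARE_TYPES` allowlist are hypothetical and not part of Twisted or any other library, and the allowlist is deliberately tiny. Keeping such a list correct is exactly the tricky part.

```python
# Hypothetical allowlist: media types assumed here to define a "charset"
# parameter. Illustrative only, not exhaustive and not taken from any library.
CHARSET_AWARE_TYPES = {"text/html", "text/plain", "text/xml", "application/xml"}


def encode_text_body(content_type, body):
    """Return (content_type, body_bytes) for a unicode request body.

    A charset the caller already supplied is respected. Otherwise the body is
    encoded as UTF-8, and "charset=utf-8" is appended only when the media type
    is on the allowlist above.
    """
    media_type, _, params = content_type.partition(";")
    media_type = media_type.strip().lower()

    # If the caller already named a charset, encode with it and change nothing.
    for param in params.split(";"):
        name, _, value = param.partition("=")
        if name.strip().lower() == "charset":
            return content_type, body.encode(value.strip().strip('"'))

    if media_type in CHARSET_AWARE_TYPES:
        content_type = content_type.rstrip().rstrip(";") + "; charset=utf-8"
    # Types off the allowlist (JSON, octet-stream, ...) keep their header as-is.
    return content_type, body.encode("utf-8")
```

With this sketch, `encode_text_body("application/json", ...)` leaves the header untouched, while `encode_text_body("text/html", ...)` yields `text/html; charset=utf-8`. Forbidding unicode bodies outright avoids the need for the allowlist altogether.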