On 24 Feb 2012, at 15:50, Bill Moseley wrote:

> When using Catalyst::Action::REST the content-type response never includes a 
> charset.  JSON seems to be handled correctly in code -- JSON strings are 
> always UTF-8.  Does that mean there is no need to specify a charset on 
> responses?


Theoretically you don't need to, but I think we should. Specifically, I've 
heard reports of encoding issues when talking to some other stacks, which were 
fixed by setting the charset explicitly.
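
As a rough sketch of what "doing this explicitly" could look like outside of
Catalyst, the snippet below pairs a UTF-8 encoded JSON body with a Content-Type
value that names the charset. The helper name json_response is made up for
illustration and is not part of Catalyst::Action::REST; only core JSON::PP is
assumed.

```perl
use strict;
use warnings;
use JSON::PP ();

# Hypothetical helper (not part of Catalyst::Action::REST): returns a
# Content-Type value that names the charset, plus the encoded body.
sub json_response {
    my ($data) = @_;

    # ->utf8 makes encode() return UTF-8 octets rather than a character string.
    my $body = JSON::PP->new->utf8->encode($data);

    return ('application/json; charset=utf-8', $body);
}

my ($type, $body) = json_response({ face => "\x{263A}" });
printf "%s (%d octets)\n", $type, length $body;
```

In a Catalyst action the same two values would presumably go to
$c->response->content_type and $c->response->body.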

> And what if a JSON request comes in with a non-UTF8 charset?  Should that be 
> ignored?  It's application/json, not text/json, so maybe there are no 
> encoding issues?

I thought that JSON was always UTF-8, but I read the spec recently, and whilst 
it's always Unicode, it can also be encoded as UTF-16 or UTF-32:

   JSON text SHALL be encoded in Unicode.  The default encoding is UTF-8.

   Since the first two characters of a JSON text will always be ASCII
   characters [RFC0020], it is possible to determine whether an octet
   stream is UTF-8, UTF-16 (BE or LE), or UTF-32 (BE or LE) by looking
   at the pattern of nulls in the first four octets.
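
The null-pattern check in the passage above (which matches RFC 4627, section 3)
can be sketched roughly as follows. The function name guess_json_encoding is
hypothetical, and the fallback to UTF-8 for inputs shorter than four octets is
my own assumption rather than something the spec spells out.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: guess the encoding of a JSON octet string from the
# pattern of null octets in its first four octets.
sub guess_json_encoding {
    my ($octets) = @_;

    # Assumption: too short to inspect, so fall back to the default encoding.
    return 'UTF-8' if length($octets) < 4;

    # Map each of the first four octets to '0' (null) or 'x' (non-null).
    my $pattern = join '', map { $_ ? 'x' : '0' } unpack 'C4', $octets;

    my %encoding_for = (
        '000x' => 'UTF-32BE',
        '0x0x' => 'UTF-16BE',
        'x000' => 'UTF-32LE',
        'x0x0' => 'UTF-16LE',
    );

    # Any other pattern is treated as UTF-8.
    return $encoding_for{$pattern} // 'UTF-8';
}

for my $name (qw(UTF-8 UTF-16LE UTF-16BE UTF-32LE UTF-32BE)) {
    my $octets = encode($name, '["a"]');
    printf "%-8s detected as %s\n", $name, guess_json_encoding($octets);
}
```

The explicit LE/BE encoding names are used so that Encode does not prepend a
byte order mark, which would change the first four octets.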


> What about other serializations?  YAML is UTF-8 or UTF-16.  Does that mean 
> the charset needs to be included in response?  And again, if a request comes 
> in with UTF-16 does it need to be decoded or does that happen in YAML::Syck?
> 

The latter, but yes I think the charset should also be included.

> Even text/html doesn't include a charset in the "serialized" response.

I would think that text/html should still be handled by 
Catalyst::Plugin::Unicode::Encoding, if that is loaded?

> Does there need to be an additional decoding and encoding layer when using 
> Catalyst::Action::REST?  

I'm of the opinion there shouldn't need to be.

> Should I force a charset on all responses

I think we should fix this, at least for JSON and YAML, where the right thing 
to do is entirely clear.

> BTW -- doesn't seem like YAML survives a round trip like JSON does:
> <snip>
> But YAML drops the utf8 flag:
> 
> $ perl -MYAML::Syck  -MEncode -wle 'print 
> length(YAML::Syck::Load(YAML::Syck::Dump( ["\x{263A}"]) )->[0])'
> 3

Eugh. This works as expected with YAML and YAML::XS. I vote that we stop using 
YAML::Syck, as it's less maintained (and clearly has encoding issues).

Anyone have strong reasons for not doing this?
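
For anyone puzzled by the length of 3 in the one-liner: it suggests the string
came back as UTF-8 octets rather than as a single character. The sketch below
reproduces that symptom with core Encode only (no YAML module involved), on the
assumption that this is what the YAML::Syck round trip hands back.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

my $smiley = "\x{263A}";                 # one character (WHITE SMILING FACE)

# Assumption: an undecoded round trip returns the UTF-8 octets of the string.
my $octets = encode('UTF-8', $smiley);

# An explicit decode turns the octets back into a single character.
my $chars = decode('UTF-8', $octets);

printf "original=%d undecoded=%d decoded=%d\n",
    length $smiley, length $octets, length $chars;
```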

Cheers
t0m
