On Wednesday, June 13, 2012 5:11:49 PM UTC-5, Chris Price wrote:
>
> [...] Due to limitations in Puppet's representation of strings (character
> encoding is not explicitly specified), it's not possible for us to do
> anything too fancy when we encounter a byte sequence that is not directly
> representable in UTF-8.
>

Is Puppet's representation of strings distinct from Ruby's representation? In any case, it seems like a fundamental problem that Puppet is working with a bunch of strings whose encoding is uncertain. Why can't that be tackled further upstream, with a mechanism for ensuring that Puppet uses a consistent and known encoding for strings? Or even that it uses UTF-8 internally, so that no transcoding is needed when sending data to puppetdb?

Furthermore, what do you mean by "a byte sequence that is not directly representable in UTF-8"? UTF-8 encodes characters as bytes, not bytes as bytes. No byte sequence is inherently non-representable. For example, you can encode any byte sequence in UTF-8 by assuming that it represents a sequence of Latin-1 (ISO-8859-1) encoded characters, so that each byte value is also the corresponding character's Unicode scalar value. Do you perhaps mean "a byte sequence that isn't already valid UTF-8"?

I understand that Ruby 1.8 has pretty dismal character encoding support, but there are ways to deal with it. Surely you can do better than just an improved warning and a "don't do that". At the very least, there is room for some user guidance. For example, would the problem be adequately addressed if all manifests and data were encoded in UTF-8 and the agent were ensured to run in a UTF-8-based locale?

John
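
P.S. To illustrate the Latin-1 point, here is a minimal sketch. It assumes Ruby 1.9 or later, where String#force_encoding and String#encode are available (Ruby 1.8 would need Iconv or similar instead). The helper names are made up for illustration and are not part of Puppet or puppetdb.

```ruby
# Assumes Ruby 1.9+; the helper names below are hypothetical.

# Interpret arbitrary bytes as Latin-1 characters and transcode them to UTF-8.
# Every byte value 0x00..0xFF maps to the code point with the same number,
# so this succeeds for any input.
def bytes_to_utf8_via_latin1(bytes)
  bytes.dup.force_encoding('ISO-8859-1').encode('UTF-8')
end

# Reverse the mapping to recover the original bytes.
def utf8_to_bytes_via_latin1(text)
  text.encode('ISO-8859-1').force_encoding('BINARY')
end

# 0xFF and 0xFE can never appear in valid UTF-8.
raw = [0xFF, 0xFE, 0x63, 0xE9].pack('C*')
already_utf8 = raw.dup.force_encoding('UTF-8').valid_encoding?

utf8 = bytes_to_utf8_via_latin1(raw)
round_trip = utf8_to_bytes_via_latin1(utf8)

puts "raw bytes already valid UTF-8? #{already_utf8}"
puts "transcoded string valid UTF-8? #{utf8.valid_encoding?}"
puts "round trip preserves bytes?    #{round_trip == raw}"
```

The result is valid UTF-8 and the original bytes can be recovered, but any text that was really in some other encoding will display as mojibake, so this is a way to preserve the data, not to guess its meaning.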