@ALL: Isn't it possible and wise to include an (optional) encoder in pgsql?
We're importing a lot of data from text files, which are not UTF-8. We always have to change the encoding in another tool before using COPY.

2011/2/28 Craig Ringer <cr...@postnewspapers.com.au>

> On 27/02/11 20:47, AI Rumman wrote:
> > I am getting error in Postgresql 9.0.1.
> >
> > update import_details_test
> > set data_row = '["4","1 Monor JoÃ\u083ão S. AntÃ\u0083ão
>                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>
> Because your email client may have transformed the text encoding, I
> can't make any certain conclusions about what you're actually sending to
> the database, but it's highly likely that you're sending latin-1 encoded
> text to the database while your client_encoding is set to 'utf8'.
>
> The marked text is most likely the problem... but I think there's more
> wrong with it than just being latin-1 encoded. That kind of mangling
> often comes about when utf-8 text has been incorrectly interpreted as
> latin-1 and modified, or when something has incorrectly tried to do
> utf8<->latin-1 conversions more than once. You really need to figure out
> what encoding your input is in, convert it to a known encoding like
> utf-8 *once*, and keep it that way.
>
> If you're using Python, which I suspect you might be, the "".decode()
> function is useful. For example, I can convert a latin-1 encoded byte
> string to a python Unicode string with:
>
> "somelatin1string".decode("latin-1")
>
> Sometimes you can get away with just "SET client_encoding=latin-1" but
> in this case your string data looks like it's been mangled by more than
> just a single encoding mis-interpretation, so you'll probably just
> silently insert corrupt data by doing that. Don't. Fix your code so it
> knows what the text encoding of the input is.
>
> If you are, in fact, using Python, it's a really good idea to always
> "".decode() all your inputs so your internal processing is done in
> Unicode (UTF-16, in fact). Similarly, Qt programmers should convert
> everything to unicode QString as soon as possible and use that for all
> internal manipulation. It'll save a lot of pain.
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>
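
For what it's worth, the "decode once, keep it as Unicode" advice quoted above can be sketched roughly as follows. This is only an illustration in Python 3 syntax, where the method lives on bytes objects rather than on Python 2 strings. The helper names to_utf8 and undo_double_encoding are made up for this example, and the assumption that the input is latin-1 has to be checked against the real data. The second helper reverses only a single round of "UTF-8 bytes misread as latin-1" and will not repair text that has been modified after mangling.

```python
# Sketch only: helper names are hypothetical, and latin-1 as the source
# encoding is an assumption that must be verified for the real input.

def to_utf8(raw: bytes, source_encoding: str = "latin-1") -> bytes:
    """Decode bytes from a known source encoding, then encode as UTF-8 once."""
    text = raw.decode(source_encoding)   # bytes -> Unicode str
    return text.encode("utf-8")          # Unicode str -> UTF-8 bytes


def undo_double_encoding(mangled: str) -> str:
    """Reverse one round of UTF-8 bytes that were misread as latin-1."""
    return mangled.encode("latin-1").decode("utf-8")


# "Jo\u00e3o" is "Joao" with a tilde on the a, written with an escape so the
# example does not depend on the encoding of this source file.
name = "Jo\u00e3o"

latin1_bytes = name.encode("latin-1")     # what a latin-1 text file would contain
utf8_bytes = to_utf8(latin1_bytes)        # safe to send with client_encoding 'utf8'

# Simulate the kind of mangling described in the quoted reply:
mangled = name.encode("utf-8").decode("latin-1")
repaired = undo_double_encoding(mangled)
```

The point of the first helper is that the conversion happens exactly once, at the boundary where the input encoding is known; everything after that works with Unicode strings or UTF-8 bytes.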