Is there a way to get utf-8 out of a Unicode string?

thebjorn Sun, 29 Oct 2006 23:26:02 -0800

I've got a database (MS SQL Server) that's (way) out of my control,
where someone has stored UTF-8 encoded Unicode data in regular varchar
fields, so that e.g. the string 'Blåbærsyltetøy' is in the database
as 'Bl\xc3\xa5b\xc3\xa6rsyltet\xc3\xb8y' :-/


Then I read the data out using adodbapi (which returns all strings as
Unicode) and I get u'Bl\xc3\xa5b\xc3\xa6rsyltet\xc3\xb8y'. I couldn't
find any way to get back to the original short of:

  def unfk(s):
      return eval(repr(s)[1:]).decode('utf-8')

i.e. chopping the leading u off the repr of the unicode string, and
relying on eval to turn the \xHH sequences back into a byte string that
can then be decoded as UTF-8.

Is there a less hackish way to do this?

-- bjorn
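
For comparison, here is a minimal sketch of one commonly used way to do the same round trip without eval. It assumes every character in the string has a code point below U+0100, as in the example above, so that the latin-1 codec can map each code point back to the byte with the same value. The name fix_mojibake is hypothetical, and whether the driver really produced the string this way is an assumption, not something confirmed here.

```python
def fix_mojibake(s):
    """Re-interpret a unicode string whose code points are really UTF-8 bytes.

    Hypothetical helper. Assumes every character in s is below U+0100;
    otherwise encode() raises UnicodeEncodeError.
    """
    # latin-1 maps code points 0-255 one-to-one onto byte values 0-255,
    # so this recovers the original byte string without eval/repr.
    raw = s.encode('latin-1')
    # The recovered bytes are the UTF-8 data stored in the varchar field.
    return raw.decode('utf-8')


broken = u'Bl\xc3\xa5b\xc3\xa6rsyltet\xc3\xb8y'
fixed = fix_mojibake(broken)
# Escapes are used so the file needs no source-encoding declaration.
assert fixed == u'Bl\xe5b\xe6rsyltet\xf8y'
```

If the data ever contains characters outside that range, or bytes that are not valid UTF-8, the call raises an exception rather than silently returning garbage, which is arguably safer than the eval approach.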

-- 
http://mail.python.org/mailman/listinfo/python-list
