Re: Unicode/utf-8 data in SQL Server

John Machin Wed, 09 Aug 2006 02:25:50 -0700

Laurent Pointal wrote:
> John Machin wrote:
> > The customer should be very happy if you do
> > text.decode('utf-8').encode('cp1252') -- not only should the file
> > import into Excel OK, he should be able to view it in
> > Word/Notepad/whatever.
>
> +
> text.decode('utf-8').encode('cp1252',errors='replace')
>
> since cp1252 cannot represent every character that UTF-8 can encode.


In that case, the OP may well want to use 'xmlcharrefreplace' or
'backslashreplace' instead: both stand out more than '?' *and* leave the
original Unicode recoverable if necessary, e.g.:

#>>> msg = u'\u0124\u0114\u0139\u013B\u0150'
#>>> print msg
ĤĔĹĻŐ
#>>> msg.encode('cp1252', 'replace')
'?????'
#>>> msg.encode('cp1252', 'xmlcharrefreplace')
'&#292;&#276;&#313;&#315;&#336;'
#>>> msg.encode('cp1252', 'backslashreplace')
'\\u0124\\u0114\\u0139\\u013b\\u0150'
#>>> 
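
For anyone who wants to check the "recoverable" part, here is a minimal
sketch. It is written for Python 3 rather than the Python 2 session above,
so encode() returns bytes and the literals need no u prefix. The helper
name utf8_to_cp1252 is made up for illustration, and the recovery step
assumes the original text contained no literal backslashes or
character-reference-like strings of its own, which would be ambiguous.

```python
import html


def utf8_to_cp1252(data, errors="xmlcharrefreplace"):
    """Hypothetical helper: transcode UTF-8 bytes to cp1252 bytes.

    Characters that cp1252 cannot represent are handled by `errors`.
    """
    return data.decode("utf-8").encode("cp1252", errors)


msg = "\u0124\u0114\u0139\u013B\u0150"
utf8_bytes = msg.encode("utf-8")

# 'replace' is lossy: every unencodable character collapses to '?'.
replaced = utf8_to_cp1252(utf8_bytes, "replace")

# These two handlers keep the code point of each character in the output.
xml_refs = utf8_to_cp1252(utf8_bytes, "xmlcharrefreplace")
escaped = utf8_to_cp1252(utf8_bytes, "backslashreplace")

# Recover the original text from the two non-lossy forms.
# The unicode_escape step is safe here only because `escaped` is pure ASCII.
recovered_from_xml = html.unescape(xml_refs.decode("cp1252"))
recovered_from_escapes = escaped.decode("unicode_escape")

print(replaced)
print(xml_refs)
print(escaped)
print(recovered_from_xml == msg, recovered_from_escapes == msg)
```

Nothing comparable is possible with the 'replace' output, since all five
characters have become the same '?'.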

Cheers,
John

-- 
http://mail.python.org/mailman/listinfo/python-list
