On Tue, 01 Jan 2013 03:35:56 -0800, anilkumar.dannina wrote:

> I am facing one issue in my module. I am gathering data from sql server
> database. In the data that I got from db contains special characters
> like "endash". Python was taking it as "\x96". I require the same
> character(endash). How can I perform that. Can you please help me in
> resolving this issue.

"endash" is not a character, it is six characters. On the other hand, "\x96" is a single byte: py> c = u"\x96" py> assert len(c) == 1 But it is not a legal Unicode character: py> import unicodedata py> unicodedata.name(c) Traceback (most recent call last): File "<stdin>", line 1, in <module> ValueError: no such name So if it is not a Unicode character, it is probably a byte. py> c = "\x96" py> print c � To convert byte 0x96 to an n-dash character, you need to identify the encoding to use. (Aside: and *stop* using it. It is 2013 now, anyone who is not using UTF-8 is doing it wrong. Legacy encodings are still necessary for legacy data, but any new data should always using UTF-8.) CP 1252 is one possible encoding, but there may be others: py> uc = c.decode('cp1252') py> unicodedata.name(uc) 'EN DASH' -- Steven -- http://mail.python.org/mailman/listinfo/python-list