On 02/14/2019 12:02 PM, vergos.niko...@gmail.com wrote:
> On Thursday, 14 February 2019 at 8:16:40 PM UTC+2, Calvin Spealman wrote:
>> If you see something like this
>>
>> '\xce\x86\xce\xba\xce\xb7\xcf\x82 \xce\xa4\xcf\x83\xce\xb9\xce\xac\xce\xbc\xce\xb7\xcf\x82'
>>
>> then you don't have a string, you have raw bytes. You don't "encode" bytes,
>> you decode them. If you know this is already encoded as UTF-8 then you just
>> need the decode('utf8') part and *not* the encode('latin1') step.
>>
>> encode() is something that turns text into bytes
>> decode() is something that turns bytes into text
>>
>> So, if you already have bytes and you need text, you should only want to be
>> doing a decode() and you just need to specify the correct encoding.
>
> I agree, but I don't know which encoding the string is encoded in.
>
> I just tried
>
> names = tuple( [s.decode('utf8') for s in names] )
>
> but I get this error:
>
> AttributeError("'str' object has no attribute 'decode'",)

Strictly speaking, that's correct. A Python 3 string object is already
decoded Unicode. It cannot be decoded again.

> but why does it say s is a string object? Since we have names in raw bytes,
> it should be a bytes object?

It's clearly not raw bytes.

> How can I turn names from raw bytes to utf-8 strings?

They apparently aren't raw bytes. If they were, you could use .decode().

> ps. Who encoded them in raw bytes anyways? Since they are fetched directly
> from the database, shouldn't python3 have them stored in names as utf-8
> strings? Why raw bytes instead?

Something very strange is going on with your database and/or your queries.
The pymysql API should already be decoding the UTF-8 bytes for you and
returning a Unicode string. I have no idea why you're getting a Unicode
string that consists of code points that are the same as the UTF-8 bytes.

You'll have to post a little more of your code, like a simple, complete
query example (a few lines of code) that shows absolutely everything you're
doing to the string.

You will also want to use the mysql command-line utilities to try your
queries and see what kind of data you're getting out. If mysql is told to
use UTF-8 for varchar, and you're inserting the data as correctly formed
UTF-8 encoded byte strings, it should come back out in Python as Unicode.

-- 
https://mail.python.org/mailman/listinfo/python-list
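
For illustration, here is a minimal sketch of the difference between real bytes and a str whose code points happen to equal the UTF-8 byte values. The sample value is built from the escapes quoted earlier in this thread. The variable names are made up, and the latin1 round trip at the end assumes the string was mis-decoded as latin1 somewhere along the way, which I can't confirm from what has been posted so far.

```python
# Sample data: the byte escapes quoted earlier in the thread, as a real bytes object.
raw = b'\xce\x86\xce\xba\xce\xb7\xcf\x82 \xce\xa4\xcf\x83\xce\xb9\xce\xac\xce\xbc\xce\xb7\xcf\x82'

# Case 1: genuine bytes. decode() exists and turns them into text.
text = raw.decode('utf8')

# Case 2: a str whose code points equal the UTF-8 byte values.
# Simulated here (an assumption) by decoding the same bytes as latin1.
mangled = raw.decode('latin1')

# A Python 3 str has no decode() method, hence the AttributeError.
print(type(mangled).__name__, hasattr(mangled, 'decode'))

# latin1 maps code points 0-255 one-to-one onto byte values, so encoding
# the mangled str as latin1 recovers the original bytes, which can then
# be decoded as UTF-8.
repaired = mangled.encode('latin1').decode('utf8')

print(repaired == text)
print(len(raw), len(mangled), len(text))
```

If that round trip works on your data, it only confirms the data was mis-decoded somewhere upstream. The character set of the connection or the column would still be the thing to check.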