On 12-09-2010 00:07, Robert Kern wrote:
> On 9/11/10 4:45 PM, Stef Mientki wrote:
>> On 11-09-2010 21:11, Robert Kern wrote:
>>> SQLite internally stores its strings as UTF-8 or UTF-16 encoded Unicode. So
>>> it's not clear what you mean when you say the database is "windows-1252".
>>> Can you be more specific?
>> I doubt that, but I'm not sure ...
>
> From the documentation, it looks like SQLite does not attempt to validate the
> input as UTF-8 encoded, so it is possible that someone pushed in raw bytes.
> See "Support for UTF-8 and UTF-16" in the following page:
>
> http://www.sqlite.org/version3.html
>
>> For some databases written by other programs and
>> written with Python, with
>>     cursor = self.conn.cursor ()
>>     self.conn.text_factory = str
>>
>> Can only be read back with text_factory = str
>> then the resulting string columns contain normal strings with windows 1252
>> coding, like character 0xC3
>
> You can probably use
>
>     self.conn.text_factory = lambda x: x.decode('windows-1252')
>
> to read the data, though I've never tried to use that API myself.
>
> You will need to write a program yourself that opens one connection to your
> existing database for reading and another connection to another database
> (using the defaults) for writing. Then iterate over your tables and copy data
> from one database to the other.
>
> You may also be able to simply dump the database to a text file using
> "sqlite3 bad-database.db .dump > bad-sql.sql",
> read the text file into Python as a string, decode it from windows-1252 to
> unicode and then encode it as utf-8 and write it back out. Then use
> "sqlite3 good-database.db .read good-sql.sql"
> to create the new database. I've never tried such a thing, so it may not work.

Yes, I think I have to do something like that. Conserving the structure and
field types as well makes it even more complex.
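
For reference, a minimal sketch of the two-connection copy described above,
assuming Python 3's sqlite3 module (where text_factory receives bytes). The
function name copy_database and its parameters are hypothetical and untested
against a real database. It assumes every TEXT value really is windows-1252
and that there are no virtual tables. The structure and field types are kept
by replaying the CREATE statements stored in sqlite_master:

```python
import sqlite3


def copy_database(src_path, dst_path, encoding="windows-1252"):
    """Copy schema and rows from src_path to dst_path, re-decoding TEXT values.

    Hypothetical helper: assumes all TEXT in the source is `encoding`.
    """
    src = sqlite3.connect(src_path)
    # TEXT values arrive as raw bytes; decode them explicitly instead of
    # letting sqlite3 assume UTF-8. Undecodable bytes raise UnicodeDecodeError.
    src.text_factory = lambda raw: raw.decode(encoding)
    # Default text_factory on the target: str values are stored as UTF-8.
    dst = sqlite3.connect(dst_path)
    try:
        schema = src.execute(
            "SELECT type, name, sql FROM sqlite_master "
            "WHERE sql IS NOT NULL AND name NOT LIKE 'sqlite_%'"
        ).fetchall()
        tables = [(name, sql) for kind, name, sql in schema if kind == "table"]
        others = [sql for kind, name, sql in schema if kind != "table"]

        for name, create_sql in tables:
            # Replaying the original CREATE TABLE keeps the declared types.
            dst.execute(create_sql)
            quoted = '"%s"' % name.replace('"', '""')
            rows = src.execute("SELECT * FROM %s" % quoted)
            placeholders = ", ".join("?" * len(rows.description))
            dst.executemany(
                "INSERT INTO %s VALUES (%s)" % (quoted, placeholders), rows
            )

        # Indexes, views and triggers come last so that triggers do not fire
        # while the rows are being copied.
        for create_sql in others:
            dst.execute(create_sql)
        dst.commit()
    finally:
        src.close()
        dst.close()
```

Usage would be something like copy_database("bad-database.db",
"good-database.db"), with the target file not existing beforehand. Bytes that
are undefined in windows-1252 (0x81, for example) make the decode fail, which
at least shows which rows need a closer look.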

cheers,
Stef
-- 
http://mail.python.org/mailman/listinfo/python-list