Re: Reading Windows CSV file with LCID entries under Linux.

Tim Golden Mon, 22 Sep 2008 09:00:16 -0700

Thomas Troeger wrote:

I've stumbled over a problem with Windows Locale ID information and codepages. I'm writing a Python application that parses a CSV file, the format of a line in this file is "LCID;Text1;Text2". Each line can contain a different locale id (LCID) and the text fields contain data that is encoded in some codepage which is associated with this LCID. My current data file contains the codes 1033 for English US and 1031 for German (as listed in http://www.microsoft.com/globaldev/reference/lcid-all.mspx). Unfortunately, I cannot find out which codepage (like cp-1252 or whatever) belongs to which LCID.
My question is: How can I convert this data into something more reasonable like unicode? Basically, what I want is something like "Text1;Text2", both fields encoded as UTF-8. Can this be done with Python? How can I find out which codepage I have to use for 1033 and 1031?



The GetLocaleInfo API call can look up the code page for an LCID
(ask it for LOCALE_IDEFAULTANSICODEPAGE):

http://msdn.microsoft.com/en-us/library/ms776270(VS.85).aspx

You'll need to use ctypes (or write a C extension) to
use it. Be aware that if it doesn't succeed you may need
to fall back on code page 65001 (UTF-8).
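
For what it's worth, here is a rough sketch of how that lookup and the
conversion could fit together. It is not tested against real data and
rests on a few assumptions: GetLocaleInfoW only exists on Windows, so on
Linux the sketch relies on a small hand-written table (KNOWN_CODEPAGES, a
hypothetical name) that maps both 1033 and 1031 to cp1252; the helper
names are made up for illustration; the Python codec is assumed to be
"cp" plus the number Windows reports; and the line is split naively on
the first ";" with no handling of CSV quoting.

```python
import sys

LOCALE_IDEFAULTANSICODEPAGE = 0x1004  # LCType constant from the Windows SDK

# Hand-maintained table (an assumption, extend as needed): both LCIDs
# from the question use the Western European ANSI code page.
KNOWN_CODEPAGES = {1033: "cp1252", 1031: "cp1252"}


def ansi_codepage_from_windows(lcid):
    """Ask GetLocaleInfoW for the ANSI code page; None if unavailable."""
    if sys.platform != "win32":
        return None  # the API does not exist outside Windows
    import ctypes
    buf = ctypes.create_unicode_buffer(8)
    written = ctypes.windll.kernel32.GetLocaleInfoW(
        lcid, LOCALE_IDEFAULTANSICODEPAGE, buf, len(buf))
    if written == 0 or buf.value == "0":
        return None  # call failed, or the locale is Unicode-only
    return "cp" + buf.value  # assumed to match a Python codec name


def encoding_for_lcid(lcid):
    """Table first, then Windows, then the UTF-8 fallback."""
    return (KNOWN_CODEPAGES.get(lcid)
            or ansi_codepage_from_windows(lcid)
            or "utf-8")


def convert_line(raw_line):
    """Turn b'LCID;Text1;Text2' into the UTF-8 bytes of 'Text1;Text2'."""
    lcid_field, _, rest = raw_line.rstrip(b"\r\n").partition(b";")
    encoding = encoding_for_lcid(int(lcid_field.decode("ascii")))
    return rest.decode(encoding).encode("utf-8")
```

The file has to be opened in binary mode ("rb") so that each line
arrives as undecoded bytes, e.g. convert_line(b"1031;Gr\xfc\xdfe;M\xfcnchen").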

TJG
