I'm working with a MS SQL Server database created by a program from a fine US company who seems to have gotten run over by the Unicode truck. In their infinite wisdom they've decided to store Unicode data directly in regular varchar fields, utf-8 encoded! (on the bright side, it is properly utf-8 encoded). One of our customers then wants to use a csv file created from a report to import in Excel and is getting an attitude when the text shows up "all garbled" (which I can understand...)
One method that works is to use Python to pull down the result set from the database, accumulate the entire result text as a big unicode string (while decode('utf-8') all text fields in the process) separating each field with a tab, before encode('utf-16') the result string and writing it to a file opened in binary mode. This ensures that the file gets a bom, that it's in a format (utf-16) that Excel can import, and hopefully tabs are less common than commas in the source data :-( The csv module doesn't support Unicode. The customer is of the firm belief that our national characters (æøå) are part of ascii, presumably because they're single-byte-encoded in iso-8859-1. He has no understanding for the issues (either by choice or experience) so there is no purpose to trying to explain the differences... Be that as it may, he might be satisfied with a csv file in that (iso-8859-1) encoding since the local version of Excel can import it transparently (with significant behind-the-scenes magic I believe...?) The Python script mentioned above has to be run on the server, since it doesn't accept remote connections, I'm of course the only one with access, and I'd like to remove myself from the loop. I've looked at creating a view on the database that would cast or convert the data, but all I've run into are vague references to StrConv, which seems to be a VB function. Giving the customer a macro that he could run in Excel after importing the data would probably be ok as well, so I also tried creating an Excel VB macro using the StrConv function, but (a) it isn't entirely clear to me that this function can do this, and (b) the third argument to the function is an LCID, a Locale ID, which is numeric and not defined anywhere I can find it... Anyone have any advice? tia, -- bjorn -- http://mail.python.org/mailman/listinfo/python-list