On Dec 13, 12:58 am, Neil Cerutti <[EMAIL PROTECTED]> wrote: > On 2007-12-12, John Machin <[EMAIL PROTECTED]> wrote: > > >> It's clear that I am thinking to completely different usages > >> for CSV than what most people in this thread. I use csv to > >> export and import numerical data columns to and from > >> spreadsheets. > > > For that purpose, CSV files are the utter pox and then some. > > Consider using xlrd and xlwt (nee pyexcelerator) to read (resp. > > write) XLS files directly. > > I can vouch for that advice. I was exporting .xls files to csv > text files for over a year before I tried the xlrd solution--the > whole process is less cumbersome now, though it was bewildering > at first working with Excel in Python. Actually, surprises still > crop up now and then, mostly to do with cell types.
Hi Neil, I'd be interested in hearing from you what caused the initial bewilderment with xlrd, and could it have been reduced by better documentation? What kinds of surprises? > The advantage > of working with csv was that everything was a string. It depends of your point of view. I'd regard that as a DISadvantage :-) With xlrd, if you have no expectation about the type of data in a cell, but need/want to know, xlrd will tell you. If you do have an expectation, you can check if actual == expected. Here's an example. Create a tiny csv file with dates in a format that's NOT appropriate to your locale (e.g. if you are in the USA, like the ddmmyyyy_dates.csv below). Open it with Excel by double- clicking on the name in Windows Explorer. Make the column twice its initial width. You'll notice some of the data (about 60% in a large dataset with approx. uniform distribution e.g. birth-dates) is LEFT- justified (text) and the remainder is RIGHT-justified (date). If you were given the xls file in that state, using xlrd the problem could be detected and worked around. Alternatively, go into user emulation mode: "fix" the problem with some formulas, forget immediately what you did, save the result as a new csv file, pass that on to the next user without a murmur, and delete the original csv file and the xls file. This is based on a true story, with one difference: the dates in the original csv file were formatted correctly for the DD/MM/YYYY-using locale; however Excel 97 would ignore the locale and assume MM/DD/YYYY if you opened the file from within Excel instead of double-clicking in Explorer (or vice versa; I forget which). 8<== ddmmyyyy_dates.csv 01/01/2007 31/01/2007 01/12/2007 31/12/2007 8<== mmddyyyy_dates.csv 01/01/2007 01/31/2007 12/01/2007 12/31/2007 8<== Cheers, John -- http://mail.python.org/mailman/listinfo/python-list