On 11 July 2012 19:15, <subhabangal...@gmail.com> wrote:

> On Tuesday, July 10, 2012 11:16:08 PM UTC+5:30, Subhabrata wrote:
> > Dear Group,
> >
> > I kept a good number of files in a folder. Now I want to read all of
> > them. They are in different formats and different encoding. Using
> > listdir/glob.glob I am able to find the list but how to open/read or
> > process them for different encodings?
> >
> > If any one can help me out.I am using Python3.2 on Windows.
> >
> > Regards,
> > Subhabrata Banerjee.
> Dear Group,
>
> No generally I know the glob.glob or the encodings as I work lot on
> non-ASCII stuff, but I recently found an interesting issue, suppose there
> are .doc,.docx,.txt,.xls,.pdf files with different encodings.

Some of the formats you have listed are not text-based. What do you mean by the encoding of e.g. a .doc or .xls file? My understanding is that these are binary files. You won't be able to read them without the help of a special module (I don't know of one that can).

> 1) First I have to determine on the fly the file type.
> 2) I can not assign encoding="..." whatever be the encoding I have to read
> it.
>

Perhaps you just want to open the file as binary? The following will read the contents of any file, binary or text, regardless of encoding or anything else:

    with open('spreadsheet.xls', 'rb') as f:
        data = f.read()   # returns binary data (bytes) rather than text

>
> Any idea. Thinking.
>
> Thanks in Advance,
> Regards,
> Subhabrata Banerjee.
>
> --
> http://mail.python.org/mailman/listinfo/python-list
>
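
If you also need to tell the formats apart on the fly, here is a rough sketch of one way to do it: read every file as bytes, look at the leading "magic" bytes to spot the common binary containers, and only attempt to decode what is left as text. The helper names (sniff_type, try_decode, read_folder) and the list of candidate encodings are my own assumptions for illustration, not anything standard, and a decode that succeeds does not prove the guessed encoding is the right one.

```python
import glob
import os

# Leading "magic" bytes of a few common binary containers.
SIGNATURES = [
    (b"%PDF", "pdf"),
    (b"PK\x03\x04", "zip-based (e.g. .docx, .xlsx)"),
    (b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1", "OLE2 (e.g. legacy .doc, .xls)"),
]

# Assumption: encodings worth trying, in order. Adjust for your data.
CANDIDATE_ENCODINGS = ("utf-8", "cp1252")


def sniff_type(data):
    """Return a label for known binary signatures, else 'unknown'."""
    for magic, label in SIGNATURES:
        if data.startswith(magic):
            return label
    return "unknown"


def try_decode(data, encodings=CANDIDATE_ENCODINGS):
    """Return (text, encoding) for the first encoding that decodes
    without error, or (None, None) if none of them does."""
    for enc in encodings:
        try:
            return data.decode(enc), enc
        except UnicodeDecodeError:
            continue
    return None, None


def read_folder(folder):
    """Map each file path in folder to (kind, text_or_bytes, encoding)."""
    results = {}
    for path in glob.glob(os.path.join(folder, "*")):
        if not os.path.isfile(path):
            continue
        with open(path, "rb") as f:
            data = f.read()
        kind = sniff_type(data)
        if kind == "unknown":
            text, enc = try_decode(data)
            if text is not None:
                results[path] = ("text", text, enc)
                continue
        # Known binary container, or bytes that did not decode:
        # keep the raw bytes for a format-specific module to handle.
        results[path] = (kind, data, None)
    return results
```

Calling read_folder on a directory gives you decoded text for the plain-text files and raw bytes for everything else, so the binary formats can then be handed to whatever format-specific module you end up using.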