Hi all, I had a filesystem crash and when I retrieved the data back the files had random names without extension. I decided to write a script to determine the file extension and create a newfile with extension. --- method 1: # File extension utility.
import os import mimetypes import shutil def main(): for root,dirs,files in os.walk(r'C:\Senthil\test'): for each in files: fname = os.path.join(root,each) print fname mtype,entype = mimetypes.guess_type(fname) fext = mimetypes.guess_extension(mtype) if fext is not None: try: newname = fname + fext print newname shutil.copyfile(fname,newname) except (IOError,os.error), why: print "Can't copy %s to %s: %s" % (fname,newname,str(why)) if __name__ == "__main__": main() ---- The problem I faced with this script is. if the filename did not have any extension, the mimetypes.guess_type(filename) failed!!! How do I get around this problem. As it was a linux box, I tried using file command to get the work done. ---- Method 2: import os import shutil import re def detext(filename): cin,cout,cerr = os.popen3('file ' + filename) fileoutput = cout.read() rtf = re.compile('Rich Text Format data') # doc = re.compile('Microsoft Office Document') pdf = re.compile('PDF') if rtf.search(fileoutput) is not None: shutil.copyfile(filename,filename + '.rtf') if doc.search(fileoutput) is not None: shutil.copyfile(filename,filename + '.doc') if pdf.search(fileoutput) is not None: shutil.copyfile(filename,filename + '.pdf') def main(): for root,dirs,files in os.walk(os.getcwd()): for each in files: fname = os.path.join(root,each) detext(fname) if __name__ == '__main__': main() ---- but the problem with using file was it recognized both .xls (MS Excel) and .doc ( MS Doc) as Microsoft Word Document only. I need to separate the .xls and .doc files, I dont know if file will be helpful here. -- If the first approach of mimetypes works, it would be great! Has anyone faced this problem? How did you solve it? thanks, Senthil http://phoe6.livejournal.com -- http://mail.python.org/mailman/listinfo/python-list