> To: python-list@python.org
> From: wlfr...@ix.netcom.com
> Subject: Re: Checking Common File Types
> Date: Sun, 1 Dec 2013 18:23:22 -0500
>
> On Sun, 1 Dec 2013 18:27:16 +0000, jade <jadec...@msn.com> declaimed the
> following:
>
> >Hello,
> >I'm trying to create a script that checks all the files in my 'downloaded'
> >directory against common file types and then tells me how many of the files
> >in that directory aren't either a GIF or a JPG file. I'm familiar with basic
> >Python but this is the first time I've attempted anything like this and I'm
> >looking for a little help or a point in the right direction?
> >
> >file_sigs = {'\xFF\xD8\xFF':('JPEG','jpg'), '\x47\x49\x46':('GIF','gif')}
>
> Apparently you presume the file extensions are inaccurate, as you are
> digging into the files for signatures.
>
> >def readFile():
> >    filename = r'c:/temp/downloads'
> >    fh = open(filename, 'r')
> >    file_sig = fh.read(4)
> >    print '[*] check_sig() File:',filename #, 'Hash Sig:', binascii.hexlify(file_sig)
>
> Note: if you are hardcoding forward slashes, you don't need the raw
> indicator...
>
> That said, what is "c:/temp/downloads"? You apparently are opening IT
> as the file to be examined. Is it supposed to be a directory containing
> many files, a file containing a list of files, ???
>
> What is "check_sig" -- it looks like a function you haven't defined --
> but it's inside the quotes making a string literal that will never be
> called anyway.
>
> If you are just concerned with one directory of files, you might want
> to read the help file on the glob module, along with os.path
> (join/splitext/etc). Or just string methods...
>
> >>> import glob
> >>> import os.path
> >>> TARGET = os.path.join(os.environ["USERPROFILE"],
> ...                       "documents/BW-conversion/*")
> >>> files = glob.glob(TARGET)
> >>> for fn in files:
> ...     fp, fx = os.path.splitext(fn)
> ...     print "File %s purports to be of type %s" % (fn, fx.upper())
> ...
> File C:\Users\Wulfraed\documents/BW-conversion\BW-1.jpg purports to be of type .JPG
> File C:\Users\Wulfraed\documents/BW-conversion\BW-2.jpg purports to be of type .JPG
> File C:\Users\Wulfraed\documents/BW-conversion\BW-3.jpg purports to be of type .JPG
> File C:\Users\Wulfraed\documents/BW-conversion\BW-4.jpg purports to be of type .JPG
> File C:\Users\Wulfraed\documents/BW-conversion\BWConv.html purports to be of type .HTML
> File C:\Users\Wulfraed\documents/BW-conversion\roo_b1.jpg purports to be of type .JPG
> File C:\Users\Wulfraed\documents/BW-conversion\roo_b2.jpg purports to be of type .JPG
> File C:\Users\Wulfraed\documents/BW-conversion\roo_b3.jpg purports to be of type .JPG
> File C:\Users\Wulfraed\documents/BW-conversion\roo_b4.jpg purports to be of type .JPG
> File C:\Users\Wulfraed\documents/BW-conversion\roo_b5.jpg purports to be of type .JPG
> File C:\Users\Wulfraed\documents/BW-conversion\roo_b6.jpg purports to be of type .JPG
> File C:\Users\Wulfraed\documents/BW-conversion\roo_col.jpg purports to be of type .JPG
> >>>
> --
> Wulfraed Dennis Lee Bieber AF6VN
> wlfr...@ix.netcom.com HTTP://wlfraed.home.netcom.com/
>
> --
> https://mail.python.org/mailman/listinfo/python-list

Hi, thanks for all your replies.

I realised pretty soon after I asked for help that I was trying to read the wrong number of bytes, and set about completely rewriting my code (after a coffee break):

    import sys, os, binascii

    def readfile():
        dictionary = {'474946':('GIF', 'gif'), 'ffd8ff':('JPEG', 'jpeg')}
        try:
            files = os.listdir('C:\\Temp\\downloads')
            for item in files:
                f = open('C:\\Temp\\downloads\\'+ item, 'r')
                file_sig = f.read(3)
                file_sig_hex = binascii.hexlify(file_sig)
                if file_sig_hex in dictionary:
                    print item + ' is a image file, it is a ' + file_sig
                else:
                    print item + ' is not an image file, it is' +file_sig
                print file_sig_hex
        except:
            print 'Error. Try again'
        finally:
            if 'f' in locals():
                f.close()

    def main():
        readfile()

    if __name__ == '__main__':
        main()

As of right now my script prints out 'Error. Try again', but when I comment out this part of the code:

    if file_sig_hex in dictionary:
        print item + ' is a image file' + dictionary
    else:
        print item + ' is not an image file, is it' +dictionary

it prints the file signatures to the screen. What I'm trying to do with the if statement is have it tell me if the file is an image and give me its signature; if it is not an image, I want it to say so, still give me its signature, and tell me what type of file it is.

Can anyone point out an obvious error?

Regards,
Jade
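
A likely cause of the 'Error. Try again' message is the expression item + ' is a image file' + dictionary: adding a string to a dict raises TypeError, and the bare except: swallows it, so the real error is never shown. The check described above can be sketched as follows. This is only an illustration, not code from the thread: it assumes Python 3 (print as a function), and the names SIGNATURES, identify and report are made up for the example. It opens files in binary mode, skips sub-directories, and looks the type name up in the dictionary rather than concatenating the dictionary itself.

```python
import binascii
import os

# Signature table from the post: hex of the first three bytes -> (name, extension).
SIGNATURES = {'474946': ('GIF', 'gif'), 'ffd8ff': ('JPEG', 'jpeg')}


def identify(path):
    """Return (hex_signature, type_name_or_None) for one file. Hypothetical helper."""
    with open(path, 'rb') as f:  # binary mode, so the bytes are read untranslated
        sig_hex = binascii.hexlify(f.read(3)).decode('ascii')
    entry = SIGNATURES.get(sig_hex)
    return sig_hex, (entry[0] if entry else None)


def report(directory):
    """Print one line per regular file; return how many are not GIF or JPEG."""
    not_images = 0
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if not os.path.isfile(path):  # skip sub-directories, which open() cannot read
            continue
        sig_hex, kind = identify(path)
        if kind is not None:
            print('%s is an image file (%s), signature %s' % (name, kind, sig_hex))
        else:
            not_images += 1
            print('%s is not an image file, signature %s' % (name, sig_hex))
    return not_images
```

Calling report(r'C:\Temp\downloads') would print one line per file and return the count of files that are neither GIF nor JPEG. Dropping the bare except: (or catching only IOError/OSError) lets any other exception show its traceback, which makes mistakes like the one above much easier to spot.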