> To: python-list@python.org
> From: wlfr...@ix.netcom.com
> Subject: Re: Checking Common File Types
> Date: Sun, 1 Dec 2013 18:23:22 -0500
> 
> On Sun, 1 Dec 2013 18:27:16 +0000, jade <jadec...@msn.com> declaimed the
> following:
> 
> >Hello, 
> >I'm trying to create a script that checks all the files in my 'downloaded' 
> >directory against common file types and then tells me how many of the files 
> >in that directory aren't either a GIF or a JPG file. I'm familiar with basic 
> >Python but this is the first time I've attempted anything like this and I'm 
> >looking for a little help or a point in the right direction? 
> >
> >file_sigs = {'\xFF\xD8\xFF':('JPEG','jpg'),  '\x47\x49\x46':('GIF','gif')}
> 
>       Apparently you presume the file extensions are inaccurate, as you are
> digging into the files for signatures.
> 
> >def readFile():
> >    filename = r'c:/temp/downloads'
> >    fh = open(filename, 'r')
> >    file_sig = fh.read(4)
> >    print '[*] check_sig() File:',filename #, 'Hash Sig:', binascii.hexlify(file_sig)
> 
>       Note: if you are hardcoding forward slashes, you don't need the raw
> indicator...
> 
>       That said, what is "c:/temp/downloads"? You apparently are opening IT
> as the file to be examined. Is it supposed to be a directory containing
> many files, a file containing a list of files, ???
> 
>       What is "check_sig" -- it looks like a function you haven't defined --
> but it's inside the quotes making a string literal that will never be
> called anyway.
> 
>       If you are just concerned with one directory of files, you might want
> to read the help file on the glob module, along with os.path
> (join/splitext/etc). Or just string methods...
> 
> >>> import glob
> >>> import os.path
> >>> TARGET = os.path.join(os.environ["USERPROFILE"],
> ...   "documents/BW-conversion/*")
> >>> files = glob.glob(TARGET)
> >>> for fn in files:
> ...   fp, fx = os.path.splitext(fn)
> ...   print "File %s purports to be of type %s" % (fn, fx.upper())
> ... 
> File C:\Users\Wulfraed\documents/BW-conversion\BW-1.jpg purports to be of
> type .JPG
> File C:\Users\Wulfraed\documents/BW-conversion\BW-2.jpg purports to be of
> type .JPG
> File C:\Users\Wulfraed\documents/BW-conversion\BW-3.jpg purports to be of
> type .JPG
> File C:\Users\Wulfraed\documents/BW-conversion\BW-4.jpg purports to be of
> type .JPG
> File C:\Users\Wulfraed\documents/BW-conversion\BWConv.html purports to be
> of type .HTML
> File C:\Users\Wulfraed\documents/BW-conversion\roo_b1.jpg purports to be of
> type .JPG
> File C:\Users\Wulfraed\documents/BW-conversion\roo_b2.jpg purports to be of
> type .JPG
> File C:\Users\Wulfraed\documents/BW-conversion\roo_b3.jpg purports to be of
> type .JPG
> File C:\Users\Wulfraed\documents/BW-conversion\roo_b4.jpg purports to be of
> type .JPG
> File C:\Users\Wulfraed\documents/BW-conversion\roo_b5.jpg purports to be of
> type .JPG
> File C:\Users\Wulfraed\documents/BW-conversion\roo_b6.jpg purports to be of
> type .JPG
> File C:\Users\Wulfraed\documents/BW-conversion\roo_col.jpg purports to be
> of type .JPG
> >>> 
> -- 
>       Wulfraed                 Dennis Lee Bieber         AF6VN
>     wlfr...@ix.netcom.com    HTTP://wlfraed.home.netcom.com/
> 
> -- 
> https://mail.python.org/mailman/listinfo/python-list
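
The glob/os.path suggestion above can be taken one step further to answer the original question by file extension alone. What follows is only a minimal sketch: the helper name count_non_images and the tuple IMAGE_EXTENSIONS are made up for illustration, and it is written so that it should run under both Python 2.7 and Python 3. It trusts the extension and does not look inside the files.

```python
import glob
import os.path

# Assumption: these are the extensions that count as "GIF or JPG".
IMAGE_EXTENSIONS = ('.gif', '.jpg', '.jpeg')

def count_non_images(directory):
    """Count regular files in `directory` whose extension is not GIF/JPG."""
    count = 0
    # Note: the '*' pattern does not match names that start with a dot.
    for fn in glob.glob(os.path.join(directory, '*')):
        if not os.path.isfile(fn):
            continue  # skip sub-directories
        ext = os.path.splitext(fn)[1].lower()  # e.g. '.JPG' -> '.jpg'
        if ext not in IMAGE_EXTENSIONS:
            count += 1
    return count
```

Calling count_non_images('c:/temp/downloads') would return the number of files there that do not claim to be GIF or JPG.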



Hi, thanks for all your replies. I realised pretty soon after I asked for help 
that I was trying to read the wrong number of bytes and set about completely 
rewriting my code (after a coffee break):

import sys, os, binascii

def readfile():
    dictionary = {'474946':('GIF', 'gif'), 'ffd8ff':('JPEG', 'jpeg')}
    try:
        files = os.listdir('C:\\Temp\\downloads')
        for item in files:
            f = open('C:\\Temp\\downloads\\'+ item, 'r')
            file_sig = f.read(3)
            file_sig_hex = binascii.hexlify(file_sig)
            if file_sig_hex in dictionary:
                print item + ' is a image file, it is a ' + file_sig
            else:
                print item + ' is not an image file, it is' + file_sig
            print file_sig_hex
    except:
        print 'Error. Try again'
    finally:
        if 'f' in locals():
            f.close()

def main():
    readfile()

if __name__ == '__main__':
    main()

As of right now my script prints 'Error. Try again', but when I comment out 
this part of the code:

            if file_sig_hex in dictionary:
                print item + ' is a image file' + dictionary
            else:
                print item + ' is not an image file, is it' + dictionary

it prints the file signatures to the screen. What I'm trying to do with the 
if statement is tell me whether the file is an image and give me its signature; 
if it is not, I want it to tell me so, still give me its signature, and 
tell me what type of file it is. Can anyone point out an obvious error?

Regards,
Jade
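
One likely cause of the 'Error. Try again' message, judging only from the snippet quoted in the message: `item + '...' + dictionary` tries to concatenate a string with a dict, which raises a TypeError, and the bare `except:` swallows it. Opening the files with 'r' instead of 'rb' is a second thing to watch on Windows. The following is a minimal sketch of the behaviour described, not a definitive fix. The names sniff, report and SIGNATURES are made up for illustration, and the code is written so that it should run under both Python 2.7 and Python 3. With only two signatures in the table it can report the signature of an unrecognised file, but it cannot name that file's type.

```python
import binascii
import os

# Same table as in the post: hex of the first three bytes -> (name, extension).
SIGNATURES = {'474946': ('GIF', 'gif'), 'ffd8ff': ('JPEG', 'jpeg')}

def sniff(path):
    """Return (signature_hex, type_name); type_name is None if unrecognised."""
    with open(path, 'rb') as f:  # 'rb': read raw bytes, no text-mode translation
        sig_hex = binascii.hexlify(f.read(3)).decode('ascii')
    entry = SIGNATURES.get(sig_hex)
    return sig_hex, (entry[0] if entry else None)

def report(directory):
    """Print one line per file and return how many are not GIF/JPEG."""
    not_images = 0
    for item in sorted(os.listdir(directory)):
        path = os.path.join(directory, item)
        if not os.path.isfile(path):
            continue  # skip sub-directories
        sig_hex, kind = sniff(path)
        if kind is not None:
            # Concatenate the type name (a string), not the dictionary itself.
            print(item + ' is an image file, it is a ' + kind
                  + ' (signature ' + sig_hex + ')')
        else:
            not_images += 1
            print(item + ' is not a GIF or JPEG file (signature ' + sig_hex + ')')
    return not_images
```

Calling report('C:\\Temp\\downloads') prints one line per file and returns the count of files that are neither GIF nor JPEG. While debugging, removing the bare `except:` (or re-raising inside it) lets the real traceback show instead of the generic message.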
