Even though I am starting to get the hang of Python, I keep running into
problems that I cannot solve.
I had never used dictionaries before, and I feel that they really help
improve efficiency when analyzing huge amounts of data (compared with
nested loops).
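
To show what I mean by that, here is a small sketch with made-up data (the
lists and names are only for illustration, they are not from my real
program). Both approaches should find the same items; the dictionary version
just avoids rescanning the first list for every item of the second.

```python
# Hypothetical sample data, only to compare the two approaches.
first = ["apple", "pear", "plum"]
second = ["plum", "kiwi", "apple"]

# Nested loops: each item of second is compared against the items of first.
nested = []
for item in second:
    for other in first:
        if item == other:
            nested.append(item)
            break

# Dictionary: store the keys once, then each membership test is a hash lookup.
seen = {}
for item in first:
    seen[item] = None
hashed = [item for item in second if item in seen]

print(nested == hashed)
```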

Basically, what I have is 2 different files containing data.  My program
takes each entry in one file and checks whether it exists in the other file.
If it finds a match, it writes the data to an output file.
---------------
Right now, the code opens file1 and stores all of its contents in a list.
Then it does the same thing with file2.  Then it loops over list1 and
inserts each item into a hash table.  I am trying to find a way to make this
code more efficient.  Here is what I would rather have: when I open file1,
send its contents directly to the hash table, bypassing the insertion into
the list entirely.  Is this possible?

def fcompare(f1name, f2name):
   import re
   mailsrch = re.compile(r'[EMAIL PROTECTED],4}')
   f1 = fopen(f1name)
   f2 = fopen(f2name)
   if not f1 or not f2:
       return 0
   a = f1.readlines(); f1.close()
   b = f2.readlines(); f2.close()
   file1List= []
   print "starting list 1"
   for c in a:
      file1List.extend(mailsrch.findall(c))
   print "storing File1 in dictionary."

   d1 = {}
   for item in file1List :
      d1[item] = None
   print "finished storing information in lists."

   print "starting list 2"
   file2List = []
   for d in b:
      file2List.extend(mailsrch.findall(d))

   utp = open("match.txt","w")
   for item in file2List :
      if d1.has_key( item ) :
         utp.write(item +  '\n')

   utp.close()
   #del file1List
   #del file2List
   print "finished comparing 2 lists."
   #return 1
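
To make the question concrete, here is a rough sketch of the direct approach
I have in mind, where each match goes straight into the dictionary without
passing through a list first. It is only a sketch under assumptions: the
names build_table and matches_in are hypothetical, the sample lines are made
up, and the pattern is a simplified stand-in rather than the expression used
in fcompare above.

```python
import re

# Stand-in pattern for illustration only; NOT the expression from fcompare.
MAILSRCH = re.compile(r'[\w.+-]+@[\w-]+\.[A-Za-z]{2,4}')

def build_table(lines, pattern=MAILSRCH):
    """Store every match as a dictionary key, with no intermediate list."""
    table = {}
    for line in lines:  # any iterable of strings, e.g. an open file
        for match in pattern.findall(line):
            table[match] = None
    return table

def matches_in(lines, table, pattern=MAILSRCH):
    """Yield each match found in lines that is also a key of table."""
    for line in lines:
        for match in pattern.findall(line):
            if match in table:
                yield match

# Made-up sample lines standing in for the contents of the two files.
file1_lines = ["alice@example.com wrote\n", "no address here\n",
               "bob@example.org\n"]
file2_lines = ["cc: bob@example.org\n", "carol@example.net\n"]

table = build_table(file1_lines)
common = list(matches_in(file2_lines, table))
print(common)
```

If this is on the right track, I assume the open file objects could be
passed in place of the sample lists, since iterating over a file yields one
line at a time, so readlines() would not be needed either.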