On Tue, Apr 24, 2018 at 3:24 AM, Hac4u <samakshkaus...@gmail.com> wrote:
> I have a raw data of size nearly 10GB. I would like to find a text string and
> print the memory address at which it is stored.
>
> This is my code
>
> import os
> import re
> filename="filename.dmp"
> read_data=2**24
> searchtext="bd:mongo:"
> he=searchtext.encode('hex')
Why encode it as hex?

> with open(filename, 'rb') as f:
>     while True:
>         data= f.read(read_data)
>         if not data:
>             break
>         elif searchtext in data:
>             print "Found"
>             try:
>                 offset=hex(data.index(searchtext))
>                 print offset
>             except ValueError:
>                 print 'Not Found'
>         else:
>             continue

You have a loop that reads a slab of data from a file, then searches
the current data only. Then you search that again for the actual
index, and print it - but you're printing the offset within the
current chunk only. You'll need to maintain a chunk position in order
to get the actual offset.

Also, you're not going to find this if it spans across a chunk
boundary. May need to cope with that.

ChrisA
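
For illustration, here is a rough sketch of the two fixes described above: keeping a running chunk position so the printed offset is relative to the whole file, and carrying the last len(needle) - 1 bytes of each chunk into the next one so that a match spanning a chunk boundary is still found. It assumes Python 3 and a bytes search string (the quoted code is Python 2), and the names find_offsets and chunk_size are made up for this example. Treat it as one possible approach rather than a definitive implementation.

```python
def find_offsets(path, needle, chunk_size=2**24):
    """Yield the absolute file offset of every occurrence of `needle` (bytes).

    Hypothetical helper: the name and parameters are illustrative only.
    """
    if not needle:
        raise ValueError("needle must be non-empty")

    overlap = len(needle) - 1  # bytes carried over from one chunk to the next
    tail = b""                 # trailing bytes of the previous buffer
    pos = 0                    # absolute file offset of the next byte to read

    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break

            buf = tail + chunk
            base = pos - len(tail)  # absolute file offset of buf[0]

            # Report every match in this buffer, including overlapping ones.
            i = buf.find(needle)
            while i != -1:
                yield base + i
                i = buf.find(needle, i + 1)

            pos += len(chunk)
            # The tail is shorter than the needle, so a match that was already
            # reported can never be found a second time in the next buffer.
            tail = buf[-overlap:] if overlap else b""
```

With the file name and search text from the original post, it could be used like this:

    for offset in find_offsets("filename.dmp", b"bd:mongo:"):
        print(hex(offset))

Note that these are offsets into the dump file, which is not necessarily the same thing as the memory address the data had in the original process.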