Hi there, Python newbie here.
I am working with large files. For this reason I figured I would capture the large input into a list and serialize it with pickle for later (faster) use. Everything has worked beautifully until today, when a large (1 GB) data file caused a MemoryError :(
Question for the experts: is there a way to refactor this so that the data can be filled/written/released as the script goes, to avoid the problem?
Code below. Thanks.

import sys
import pickle

data = list()
for line in sys.stdin:
    try:
        parts = line.strip().split("\t")
        t = parts[0]
        w = parts[1]
        u = parts[2]
        # let's retain an in-memory copy of the data
        data.append({"ta": t, "wa": w, "ua": u})
    except IndexError:
        print("Problem with line: " + line, file=sys.stderr)

# time to save the data object into a pickle file
# ('filename' is set earlier in the script)
fileObject = open(filename, "wb")
pickle.dump(data, fileObject)
fileObject.close()
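Is something along these lines what I should be doing? Just a rough sketch of what I have in mind (untested, and 'filename' is the same output path as above): each record gets pickled to the file as soon as it is parsed, so the full list never has to sit in memory.

import sys
import pickle

with open(filename, "wb") as fileObject:
    for line in sys.stdin:
        try:
            parts = line.strip().split("\t")
            record = {"ta": parts[0], "wa": parts[1], "ua": parts[2]}
        except IndexError:
            print("Problem with line: " + line, file=sys.stderr)
            continue
        # write each record immediately instead of accumulating a list
        pickle.dump(record, fileObject)

# Reading back later would then mean calling pickle.load(fileObject)
# in a loop until it raises EOFError, rather than a single load().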