Hi everyone, I have a few questions about my implementation, which I'm not totally happy with.
Suppose I have a very long-running process which logs things during its execution, and the logs end up in n different files in the same directory. Meanwhile I want to be able to do realtime analysis in Python, so this is what I've done (simplifying):

    def main():
        from multiprocessing import Value, Process
        is_over = Value('h', 0)
        Process(target=run, args=(conf, is_over)).start()
        # should also pass the directory with the results
        Process(target=analyze, args=(is_over, network, events, res_dir)).start()

    def run(conf, is_over):
        sim = subprocess.Popen(TEST_PAD, shell=True,
                               stdout=subprocess.PIPE, stderr=subprocess.PIPE)
        out, err = sim.communicate()
        # communicate() already waits for the process to finish, so this
        # wait() just picks up the return code
        ret = sim.wait()
        # at this point the simulation is over, independently of the result
        print "simulation over, can also stop the others"
        is_over.value = 1

    def analyze(is_over, network, events, res_dir):
        ...

First of all, does it make sense to use multiprocessing and a short Value as a boolean to check whether the simulation is over or not? (An Event-based variant I'm considering is sketched at the end of this mail.)

Then the other problem is that I need to read many files, and the idea was a sort of "tail -f", but on all of them at the same time. Since I have to keep track of the read position in each of them, I ended up with something like this:

    class LogFileReader(object):
        def __init__(self, log_file):
            self.log_file = log_file
            self.pos = 0

        def get_line(self):
            # reopen the file, pick up where the last call left off and
            # remember the new position for the next call
            src = open(self.log_file)
            src.seek(self.pos)
            lines = src.readlines()
            self.pos = src.tell()
            src.close()
            return lines

which I'm also not really sure is the best way. Then in analyze() I have a dictionary which keeps track of all the "readers":

    log_readers = {}
    for out in glob(res_dir + "/*.out"):
        node = FILE_OUT.match(out).group(1)
        nodeid = hw_to_nodeid(node)
        log_readers[nodeid] = LogFileReader(out)

Maybe having more separate processes would be cleaner, but since I have to merge the data it might turn into a mess... (The polling loop I have in mind for these readers is also sketched at the end.)

As a last thing, to know when to start analyzing the data I thought about this:

    while len(listdir(res_dir)) < len(network):
        sleep(0.2)

which in theory should be correct: when there are as many files as there are nodes in the network, everything should have been written. BUT about once every 5 runs I get an error telling me that one file doesn't exist. That means that for listdir the file is already there, but trying to access it gives an error; how is that possible? (The workaround I'm considering is in the last sketch below.)

Thanks a lot, and sorry for the long mail. The sketches I mentioned follow.
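Sketch 1, for the is_over question: the same hand-off done with multiprocessing.Event instead of a short Value. This is only a rough sketch of what I have in mind; conf, network, events, res_dir and TEST_PAD are the same placeholders as in the simplified code above.

    from multiprocessing import Event, Process
    import subprocess

    def main():
        is_over = Event()            # starts cleared, i.e. "not over yet"
        Process(target=run, args=(conf, is_over)).start()
        Process(target=analyze, args=(is_over, network, events, res_dir)).start()

    def run(conf, is_over):
        sim = subprocess.Popen(TEST_PAD, shell=True,
                               stdout=subprocess.PIPE, stderr=subprocess.PIPE)
        out, err = sim.communicate()   # blocks until the simulation exits
        print "simulation over, can also stop the others"
        is_over.set()                  # tell the analyzer we are done

    def analyze(is_over, network, events, res_dir):
        while not is_over.is_set():
            # poll the log readers here
            pass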
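Sketch 2, the polling loop I have in mind inside analyze(), draining every reader on each pass; handle_lines() is just a made-up name for whatever will actually process the new lines.

    from time import sleep

    def poll_logs(log_readers, is_over):
        # keep tailing all the files; after the simulation flags that it is
        # over, do one final pass so nothing written at the very end is lost
        done = False
        while not done:
            done = bool(is_over.value)   # is_over.is_set() with an Event
            for nodeid, reader in log_readers.items():
                new_lines = reader.get_line()
                if new_lines:
                    handle_lines(nodeid, new_lines)
            if not done:
                sleep(0.2)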
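Sketch 3, for the listdir problem: instead of trusting listdir() alone, only count a file once it can actually be opened, which should also hide the occasional "file doesn't exist" error. wait_for_files() is just a name I made up for this.

    from glob import glob
    from time import sleep

    def wait_for_files(res_dir, expected):
        # a file only counts as "there" once open() succeeds on it
        while True:
            ready = 0
            for path in glob(res_dir + "/*.out"):
                try:
                    open(path).close()
                    ready += 1
                except IOError:
                    pass        # listed, but not really accessible yet
            if ready >= expected:
                return
            sleep(0.2)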