Virgil Stokes <v...@it.uu.se> writes:

> While running a python program I need to save some of the data that is
> being created. I would like to save the data to a file on a disk
> according to a periodical schedule (e.g. every 10
> minutes). Initially, the amount of data is small (< 1 MB) but after
> sometime the amount of data can be >10MB. If a problem occurs during
> data creation, then the user should be able to start over from the
> last successfully saved data.
>
> For my particular application, no other file is being saved and the
> data should always replace (not be appended to) the previous data
> saved. It is important that the data be saved without any obvious
> distraction to the user who is busy creating more data. That is, I
> would like to save the data "in the background".
>
> What is a good method to perform this task using Python 2.7.8 on a
> Win32 platform?

There are several requirements:

- save data asynchronously -- "without any obvious distraction to the
  user"
- save data durably -- avoid corrupting previously saved data or
  writing only partial new data, e.g., in case of a power failure
- do it periodically -- handle drift/overlap gracefully in a
  documented way

A simple way to do asynchronous I/O on Python 2.7.8 on a Win32
platform is to use threads:

    import threading

    t = threading.Thread(target=backup_periodically,
                         kwargs=dict(period=600))
    t.daemon = True  # stop if the program exits
    t.start()

where backup_periodically() backs up the data every period seconds:

    import logging
    import time

    def backup_periodically(period, timer=time.time, sleep=time.sleep):
        start = timer()
        while True:
            try:
                backup()
            except Exception:  # log exceptions and continue
                logging.exception("backup failed")
            # lock with the timer
            sleep(period - (timer() - start) % period)

To avoid drift of the backup times, the sleep is locked to the timer
using the modulo operation: each wake-up lands on a multiple of
*period* seconds after start, however long backup() itself took. If
backup() takes longer than *period* seconds (unlikely for 10MB per 10
minutes), then a step may be skipped.

backup() makes sure that the data is saved and can be restored at any
time:

    def backup():
        with atomic_open('backup', 'w') as file:
            file.write(get_data())

where atomic_open() [1] tries to overcome multiple issues with saving
data reliably:

- write to a temporary file so that the old data is always available
- rename the file when all new data is written, handling cases such as:
  * "antivirus opens old file thus preventing me from replacing it"

Either the operation succeeds and 'backup' contains the new data, or
it fails and 'backup' contains the untouched, ready-to-restore old
data -- nothing in between.

[1]: https://github.com/mitsuhiko/python-atomicfile/blob/master/atomicfile.py

I don't know how ready atomicfile.py is, but you should be aware of the
issues it is trying to solve if you want a reliable backup solution.

--
Akira

--
https://mail.python.org/mailman/listinfo/python-list
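
For illustration, here is a minimal sketch of the "write to a temporary file, then rename" idea behind atomic_open(). It is not the code from atomicfile.py [1]: the `_replace` helper, the `.tmp-` prefix and the error handling are assumptions made for this example, and it does not deal with the antivirus/locked-file case mentioned above. Note that on Python 2.7 there is no os.replace(), and os.rename() on Windows refuses to overwrite an existing file, so on that platform this sketch would fail (leaving the old file untouched) rather than replace it.

```python
import contextlib
import os
import tempfile

# Assumption for this sketch: prefer os.replace() (Python 3.3+), which
# overwrites the destination; fall back to os.rename() on Python 2.7,
# where overwriting an existing file fails on Windows.
_replace = getattr(os, 'replace', os.rename)


@contextlib.contextmanager
def atomic_open(path, mode='w'):
    """Hypothetical sketch: write to a temp file, then rename it over path."""
    dirname = os.path.dirname(os.path.abspath(path))
    # Create the temp file in the same directory so that the final rename
    # stays on the same filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=dirname, prefix='.tmp-')
    try:
        with os.fdopen(fd, mode) as f:
            yield f
            f.flush()
            os.fsync(f.fileno())  # ask the OS to push the data to disk
        _replace(tmp_path, path)  # runs only if everything was written
    except BaseException:
        # Writing or renaming failed: drop the temp file, keep the old data.
        try:
            os.remove(tmp_path)
        except OSError:
            pass
        raise
```

With this sketch, an exception raised while writing leaves the previous 'backup' file as it was, and no partial file takes its place.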