I have a multi-threaded application (a web service) where several threads need data from an external database. That data is quite a lot, but it is almost always the same. Between incoming requests, timestamped records get added to the DB.
So I decided to keep an in-memory cache of the DB records that only gets "topped up" with the most recent records on each request:

    from threading import Lock, Thread


    class MyCache():
        def __init__(self):
            self.cache = None
            self.cache_lock = Lock()

        def _update(self):
            new_records = query_external_database()
            if self.cache is None:
                self.cache = new_records
            else:
                self.cache.extend(new_records)

        def get_data(self):
            with self.cache_lock:
                self._update()

            return self.cache


    my_cache = MyCache()  # module level

This works, but even those "small" queries can sometimes hang for a long time, causing incoming requests to pile up at the "with self.cache_lock" block.

Since it is better to quickly serve the client with slightly outdated data than not at all, I came up with the "impatient" solution below. The idea is that an incoming request triggers an update query in another thread, waits for a short timeout for that thread to finish and then returns either updated or old data.

    class MyCache():
        def __init__(self):
            self.cache = None
            self.thread_lock = Lock()
            self.update_thread = None

        def _update(self):
            new_records = query_external_database()
            if self.cache is None:
                self.cache = new_records
            else:
                self.cache.extend(new_records)

        def get_data(self):
            if self.cache is None:
                timeout = 10  # allow more time to get initial batch of data
            else:
                timeout = 0.5

            with self.thread_lock:
                if self.update_thread is None or not self.update_thread.is_alive():
                    self.update_thread = Thread(target=self._update)
                    self.update_thread.start()
                    self.update_thread.join(timeout)

            return self.cache


    my_cache = MyCache()

My question is: Is this a solid approach? Am I forgetting something? For instance, I believe that I don't need another lock to guard self.cache.extend() because _update() can only ever run in one thread at a time. But maybe I'm overlooking something.
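In case it helps anyone reproduce the behaviour, here is a minimal, self-contained sketch of how the impatient version could be exercised without a real database. FakeDatabase, its delay attribute and ImpatientCache are made-up names for this test only; the stub simply sleeps in place of my real query_external_database(), the timeouts are shortened so the script finishes quickly, and the update thread is marked as a daemon so a hanging query cannot block interpreter exit.

```python
import time
from threading import Lock, Thread


class FakeDatabase:
    """Hypothetical stand-in for the external database (assumption)."""

    def __init__(self):
        self.delay = 0.0  # seconds each query takes

    def query(self):
        time.sleep(self.delay)
        return [time.time()]  # one new "timestamped record"


class ImpatientCache:
    """Same logic as the impatient MyCache, with the query injected."""

    def __init__(self, query, first_timeout=10, timeout=0.5):
        self.query = query
        self.first_timeout = first_timeout
        self.timeout = timeout
        self.cache = None
        self.thread_lock = Lock()
        self.update_thread = None

    def _update(self):
        new_records = self.query()
        if self.cache is None:
            self.cache = new_records
        else:
            self.cache.extend(new_records)

    def get_data(self):
        timeout = self.first_timeout if self.cache is None else self.timeout
        with self.thread_lock:
            if self.update_thread is None or not self.update_thread.is_alive():
                self.update_thread = Thread(target=self._update, daemon=True)
                self.update_thread.start()
                self.update_thread.join(timeout)
        return self.cache


db = FakeDatabase()
cache = ImpatientCache(db.query, timeout=0.1)

first = cache.get_data()      # fast query: the initial batch is loaded
n_first = len(first)

db.delay = 1.0                # simulate a query that hangs for a while
start = time.monotonic()
stale = cache.get_data()      # gives up waiting after the short timeout
elapsed = time.monotonic() - start
n_stale = len(stale)

cache.update_thread.join()    # let the slow query finish in the background
n_later = len(stale)          # the list handed out earlier has grown

print(n_first, n_stale, round(elapsed, 2), n_later)
```

Running this also makes one thing visible that is relevant to my question: get_data() hands out the very same list object that the update thread later extends, so a caller may see it grow while still working with it.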