Hi Maarten, I think you have hit the nail on the head. I do think this is plausible, as I have observed that things get better as the size of the cache increases, which might support your theory.
Could you give a simple example of "put a Deferred for the fetch operation in the cache"? I am really just getting started with Twisted, so a sketch would help a lot; I have put my own rough attempt below the quoted message. Thanks for all the help.

On Thu, Sep 26, 2019 at 11:40 PM Maarten ter Huurne <maar...@treewalker.org> wrote:
> On Friday, 27 September 2019 04:38:46 CEST Waqar Khan wrote:
> > Hi,
> > What's a good way to use a simple dictionary as a cache in twisted
> > framework?
> > Basically, I have this callback chain where I ultimately make a rest
> > call (in non-blocking way using treq) to fetch some data. But before
> > I make the call, I am using a dictionary to see if the value is
> > available or not. But, I have noticed that the event loop gets pretty
> > busy (sometimes, things get stuck and the twisted server stops) as soon
> > as I add this logic.. Which is pretty much
> >
> > @defer.inlineCallbacks
> > def fetch(key):
> >     if key in cache:
> >         return cache[key]
> >     # else call back to treq to fetch value
> >     cache[key] = value
> >     return value
> >
> > This dict can grow to around 50k.. What's a good way to solve this
> > issue?
>
> If it gets stuck, then the cause for that is probably in the part of the
> code you omitted. So it would help to elaborate on how the value is
> fetched exactly.
>
> I can see two other problems with this caching mechanism though:
>
> 1. Items are never removed from the cache, so unless there is a limit to
> the number of different keys that can be used, the cache can grow
> indefinitely. You might want something like an LRU cache rather than a
> plain dictionary.
>
> https://docs.python.org/3/library/functools.html#functools.lru_cache
>
> 2. If a lot of clients are requesting the same thing, you won't see any
> benefits from caching until the first request completes. So you could
> get a pattern like this:
>
> T=0: key A requested, A is not cached, start fetch #1 of A
> T=1: key A requested, A is not cached, start fetch #2 of A
> T=2: key A requested, A is not cached, start fetch #3 of A
> T=3: key A requested, A is not cached, start fetch #4 of A
> T=4: key A requested, A is not cached, start fetch #5 of A
> T=5: fetch #1 of A completes and is added to the cache
> T=6: key A requested, A is cached, return value immediately
>
> In this example, the value for A is fetched 5 times despite the caching
> mechanism. If the fetching takes a long time compared to the rate at
> which requests are coming in, this effect gets worse at a quadratic
> rate: the total time spent fetching is the number of requests that come
> in during the fetching of the first request times the duration of the
> fetch.
>
> To avoid this, you could put a Deferred for the fetch operation in the
> cache or in a separate dictionary and if you get another request for the
> same key before the fetch completes, return that Deferred instead of
> starting another fetch.
>
> Bye,
> Maarten
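Here is my rough sketch of what I think you mean, just to check my understanding. It keeps one dictionary of in-flight fetches and one of finished values, and gives every caller its own Deferred so their callback chains do not interfere with each other. The URL and the _pending/_values names are only placeholders for the real fetch, and I am assuming treq for the HTTP call:

    from twisted.internet import defer
    import treq

    _pending = {}   # key -> list of Deferreds waiting on an in-flight fetch
    _values = {}    # key -> value that has already been fetched

    def fetch(key):
        # Value already fetched: return it on an already-fired Deferred.
        if key in _values:
            return defer.succeed(_values[key])

        waiter = defer.Deferred()

        # A fetch for this key is already in flight: just wait for it.
        if key in _pending:
            _pending[key].append(waiter)
            return waiter

        # First request for this key: remember the waiter, start the fetch.
        _pending[key] = [waiter]

        def on_value(value):
            _values[key] = value
            for w in _pending.pop(key):
                w.callback(value)

        def on_error(failure):
            for w in _pending.pop(key):
                w.errback(failure)

        # Placeholder URL; the real code would make its REST call here.
        d = treq.get("https://example.invalid/{}".format(key))
        d.addCallback(treq.json_content)
        d.addCallbacks(on_value, on_error)
        return waiter

Callers would still do value = yield fetch(key) inside an inlineCallbacks function, but five near-simultaneous requests for the same key would now trigger only one treq call. The _values dict still grows without bound, so per your point 1 it would also need some LRU-style eviction. Does that look roughly right?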
_______________________________________________
Twisted-Python mailing list
Twisted-Python@twistedmatrix.com
https://twistedmatrix.com/cgi-bin/mailman/listinfo/twisted-python