>> Regarding the pros and cons of this approach (in comparison to David's
>> database approach), I wonder what the potential pitfalls/risks of the
>> cache approach are. For example, but not limited to:
>>
>> 1. Is there any consistency issue for the data stored in the web2py cache?
>
> Yes, this could be a problem. The nice thing about using a db with
> transactions is that you can ensure any operations get rolled back if an
> error occurs. In web2py, each request is wrapped in a transaction, so if
> there is an error during the request, any db operations during that request
> are rolled back. The other issue is volatility -- if your server goes
> down, you lose the contents of RAM, but not what's stored in the db (or
> written to a file).
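For concreteness, here's a minimal sketch of that per-request transaction
behavior as I understand it (the 'samples' table, its fields, and the
something_went_wrong() check are all hypothetical names of mine):

    def collect():
        # The insert happens inside the request's transaction.
        db.samples.insert(user_id=request.vars.user_id,
                          payload=request.vars.payload)
        if something_went_wrong():       # hypothetical failure check
            raise RuntimeError('abort')  # uncaught error -> web2py rolls back
        return 'ok'                      # normal return -> web2py commits
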
For each request, as long as it reaches the action without problem, the
subsequent update to the cache should be fairly straightforward. The request
that carries the last piece of data needed to meet the trigger value will
perform the processing as well as the write to the db, but as you said, that
write would be wrapped in a transaction. Regarding volatility, in my case it
doesn't matter that the data haven't been written to the db at the point of
a crash -- if the server crashes, the whole process has to start from
scratch anyway.

>> 3. Is it thread-safe? For instance, if I have two threads A and B (two
>> requests from different users) trying to access the same object (e.g. the
>> 'user_data' dict) stored in the cache at the same time, would that cause
>> any problem? This especially concerns the corner case where A and B carry
>> the very last two pieces of data expected to meet
>> 'some_pre_defined_number'.
>
> That's a good point -- your current design would introduce a potential
> race condition (I was originally thinking each request would add separate
> entries to the cache, not update a single object). Of course, a db could
> have a similar problem if you were just repeatedly updating a single
> record (rather than inserting new records each time).

I've looked deeper into this. The web2py docs for CacheInRam
(http://www.web2py.com/examples/static/epydoc/web2py.gluon.cache.CacheInRam-class.html)
say:

    This is implemented as a global (per process, shared by all threads)
    dictionary. A mutex-lock mechanism avoids conflicts.

Does this mean that while one request thread is accessing and modifying the
contents of the cache (e.g. a dictionary in my case), every other thread is
blocked and has to wait until the current thread is finished with it? If so,
it seems to me that the race condition we fear above should not happen.
Please correct me if I've got this wrong. Thanks.
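To make the pattern concrete, here's roughly what my action does (a sketch
only -- 'user_data', SOME_PRE_DEFINED_NUMBER, and process_and_write_to_db
are placeholder names, and this is the version with the potential race):

    def submit():
        # cache.ram's mutex guards this single cache access, i.e. fetching
        # (or creating) the shared dict; time_expire=None means no expiry.
        user_data = cache.ram('user_data', lambda: {}, time_expire=None)

        # The read-modify-check below happens outside any lock, so two
        # concurrent requests could interleave here -- the corner case
        # with threads A and B.
        user_data[request.vars.user_id] = request.vars.payload
        if len(user_data) >= SOME_PRE_DEFINED_NUMBER:
            process_and_write_to_db(user_data)  # hypothetical helper
            cache.ram('user_data', None)        # f=None clears the entry
        return 'ok'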
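And in case the mutex turns out to guard only individual cache calls rather
than the whole read-modify-check sequence, I suppose the race could be
closed with an explicit lock, along these lines (again just a sketch with
the same hypothetical names; since web2py re-executes controller files on
every request, the lock would have to live in an imported module under
modules/ so that the same object persists across requests):

    import threading

    _user_data_lock = threading.Lock()  # defined in a module, not a controller

    def submit():
        # Holding one lock across the whole read-modify-check sequence
        # makes it atomic, so A and B can't both see the trigger condition.
        with _user_data_lock:
            user_data = cache.ram('user_data', lambda: {}, time_expire=None)
            user_data[request.vars.user_id] = request.vars.payload
            ready = len(user_data) >= SOME_PRE_DEFINED_NUMBER
            if ready:
                snapshot = dict(user_data)    # copy before releasing the lock
                cache.ram('user_data', None)  # reset for the next batch
        if ready:
            process_and_write_to_db(snapshot)  # db write stays in the request's transaction
        return 'ok'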