On Wed, Apr 27, 2016 at 4:44 PM, Niphlod <niph...@gmail.com> wrote:
> On these "philosophical" notes, boring for some, exciting for yours truly,
> there are a few other pointers (just to point out that the concept of
> "caching" isn't really something to discard beforehand)... be aware that on
> "caching" there are probably thousands of posts and pages of books written
> on the matter.
>
> As with everything, it's a process that has "steps". Let's spend 10 seconds
> of silence on the sentence "premature optimization is the root of all
> evil". And another 10 on "There are only two hard things in Computer
> Science: cache invalidation and naming things". Let those sink in.
>
> Ready? Let's go.
>
> Step #0: assessment
>
> Consider an app with a page that shows the records of a table that YOU KNOW
> (as you are the developer) gets refreshed once a day (e.g. the temperature
> recorded for LA for the previous day). Or a page that shows the content of
> a row that never gets updated (e.g. a blog post).
> Given that the single most expensive operation in a traditional webapp is
> the database (just think of web2py requesting the data, the database
> reading it from disk, preparing it, sending it over the wire, web2py
> receiving it), developers should always find a way to spend the least
> possible time on those steps.
> Optimizing queries (and/or database tuning, normalization, etc.). Reducing
> the number of queries needed to render a page. Requesting just the amount
> of data needed (paging). Those are HUUUUGE topics (again, zillions of
> posts, books, years of expertise to master, etc.). But - of course - not
> having to issue a query at all short-circuits all of the above!
> Still at step #0: as users come to your app, every request made to those
> pages triggers the roundtrip to the database, back and forth, always for
> the same data, over and over.
> Granted, 50 reqs/sec certainly won't hurt performance, but once they get
> to 500, it'll become pretty obvious that "a" short-circuit could save LOTS
> of processing power.
> When you face the problem of scaling to serve more concurrent requests,
> you do it either by spawning more processes or by adding servers.
> Adding frontend servers is easy: the data is transactionally consistent as
> long as you have a single database instance. You put a load balancer in
> front of the frontends (it's relatively inexpensive) and go on.
> Scaling databases by adding servers is NEVER easy (again, the interwebs and
> libraries are full of evidence, and a big part of nosql's "shiny" features
> is indeed horizontal scaling, with pros and cons).
>
> Step #1: local caching
>
> Back to your app without cache... wouldn't it be better to avoid calling
> the db for the same data 500 times per second? Sure. Cache it.
>
> Assuming you cache the database results, web2py still needs to compute the
> view, but that step, compared with the short-circuit, is orders of
> magnitude less expensive. (Yes, if you cache views you're sidestepping
> web2py's rendering too, but let's keep as few variables as possible for the
> sake of this discussion.)
> And there you are, at the first iteration of step #1, using 1MB more RAM
> to avoid hitting the database.
> Cache it for just an hour, do the math on the simple example of 50 req/s,
> and you have saved 50*60*60 - 1 = 179,999 roundtrips. You can use the
> savings to do the 179,999 roundtrips you actually NEED in other places of
> your app, keeping the same performance without additional costs.
> Whoa!
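The step #1 pattern - check a local store first, fall back to the real query, remember the result for `time_expire` seconds - can be sketched in plain Python. This toy class only mimics the call shape of web2py's `cache.ram(key, callable, time_expire)`; the class, the key, and the fake query are invented for illustration:

```python
import time

class RamCache:
    """Toy in-process cache mimicking the call shape of web2py's
    cache.ram(key, callable, time_expire). For illustration only."""

    def __init__(self):
        self.storage = {}  # key -> (timestamp, value)

    def __call__(self, key, f, time_expire=300):
        now = time.time()
        item = self.storage.get(key)
        if item is not None and now - item[0] < time_expire:
            return item[1]  # still fresh: no roundtrip
        value = f()  # cache miss (or expired): do the expensive work
        self.storage[key] = (now, value)
        return value

cache = RamCache()
db_roundtrips = []  # counts how often the "database" is actually hit

def expensive_query():
    db_roundtrips.append(1)
    return [("LA", 21.5)]  # pretend result set

# 50 req/s for an hour = 180,000 requests; only the first hits the "db".
for _ in range(180_000):
    rows = cache("la_temps", expensive_query, time_expire=3600)

print(len(db_roundtrips))            # -> 1
print(180_000 - len(db_roundtrips))  # -> 179999 roundtrips saved
```

In actual web2py code you rarely need to write this yourself: passing `cache=(cache.ram, 3600), cacheable=True` to `db(query).select(...)` caches the result set for an hour using the built-in mechanism.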
Step 1 to 300, where is step 2?? Just kidding, nice read, thanks Simone... :)

> Step #300
>
> You start caching here and there, and you use 500MB of RAM. You're using
> cache.ram; everything is super-speedy, no third parties, just web2py
> features.
> Now you need to serve 100 req/s, so you spawn another process... whoopsie
> ... 1GB of RAM. Or another server: 500MB on the first and 500MB on the
> second... 500 are "clearly" wasted, as they are a copy of the "original"
> 500.
> And the second process (or server) still needs to do roundtrips if its
> local cache doesn't contain your already-cached-in-another-place query.
> Also, something else "creeps in"... as your app grows, you start losing
> track of what you cached, when you cached it, and how long it needs to be
> cached... a record fetched on the first server at 8:00AM could be updated
> in the meantime and fetched on the second server (because it isn't in its
> local cache) at 8:02AM... you're effectively serving different versions
> from cache!
>
> Step #301
>
> To sidestep both issues, you use redis or memcached: they sit outside of
> the web2py process and consume only 500MB. For one, two, a zillion
> processes. And they are a single source of truth. And they are as speedy
> as cache.ram (or at least of the same order of magnitude).
>
> On Wednesday, April 27, 2016 at 2:04:07 PM UTC+2, Anthony wrote:
>>
>> On Wednesday, April 27, 2016 at 7:00:53 AM UTC-4, Pierre wrote:
>>>
>>> I'm impressed, Anthony...
>>>
>>> Well, all of these - memcache, redis - seem to require lots of
>>> technicality and probably setting up your own deployment machine. I am
>>> not very enthusiastic about that option, since the internet is full of
>>> endless technical setup/config issues. Given what's being said, I see
>>> only two kinds of page of my app I could cache: general information and
>>> maybe forms (no db().select()), all shared and uniform data.
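For what it's worth, on the "lots of technicality" worry: the step #301 wiring on the web2py side is smaller than it sounds. The following is a sketch of the recipe as I recall it from the web2py book, not verified against every release (module and argument names may differ in older versions), and it assumes a Redis server already running on localhost:6379:

```python
# In a web2py model file, e.g. models/0_cache.py (sketch only; check
# gluon.contrib for the exact names shipped with your web2py version).
from gluon.contrib.redis_utils import RConn
from gluon.contrib.redis_cache import RedisCache

rconn = RConn()  # defaults to localhost:6379
# A shared, out-of-process cache: every worker process (or frontend
# server) now reads from the same single source of truth.
cache.redis = RedisCache(redis_conn=rconn, debug=False)

# From here on, use cache.redis wherever you would have used cache.ram,
# e.g.: rows = db(query).select(cache=(cache.redis, 3600), cacheable=True)
```

Running the Redis server itself is a separate (OS-level) task, but on most platforms it is a single package install with sane defaults.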
>>
>> I'm not sure what you mean by caching forms, but you probably don't want
>> to do that (at least if we're talking about web2py forms, which each
>> include a unique hidden _formkey field to protect against CSRF attacks).
>>
>>> There should be a simple way to achieve such a simple thing whatever the
>>> platform: pythonanywhere, etc.... Is there one?
>>
>> You can just use cache.ram. If running uWSGI with multiple processes, you
>> will have a separate cache for each, but that won't necessarily be a
>> problem (just not as efficient as it could be). You could also try
>> cache.disk and do some testing to see how it impacts performance.
>>
>> More generally, caching is something you do to improve efficiency, which
>> becomes important as you start to have lots of traffic. But if you've got
>> enough traffic that efficiency becomes so important, you should probably
>> be willing (and hopefully able) to put in some extra effort to set up
>> something like Memcached or Redis. Until you hit that point, don't worry
>> about it.
>>
>> Anthony

--
Resources:
- http://web2py.com
- http://web2py.com/book (Documentation)
- http://github.com/web2py/web2py (Source code)
- https://code.google.com/p/web2py/issues/list (Report Issues)
---
You received this message because you are subscribed to the Google Groups
"web2py-users" group.
To unsubscribe from this group and stop receiving emails from it, send an
email to web2py+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.