On Wed, Apr 27, 2016 at 4:44 PM, Niphlod <niph...@gmail.com> wrote:
> On these "philosophical" notes, boring for some, exciting for yours truly,
> there are a few other pointers (just to point out that the concept of
> "caching" isn't really something to discard beforehand)... be aware that on
> "caching" there are probably thousands of posts and pages of books written
> on the matter.
>
> As with everything, it's a process that has "steps". Let's spend 10 seconds
> of silence on the sentence "premature optimization is the root of all
> evil". And another 10 on "There are only two hard things in Computer
> Science: cache invalidation and naming things". Let those sink in.
>
> Ready? Let's go.
>
> Step #0: assessment
>
> Consider an app with a page that shows the records of a table that YOU KNOW
> (as you are the developer) gets refreshed once a day (e.g. the temperature
> recorded for LA for the previous day). Or a page that shows the content of
> a row that never gets updated (e.g. a blog post).
> Given that the single most expensive operation in a traditional webapp is
> the database (just think of web2py requesting the data, the database
> reading it from disk, preparing it, sending it over the wire, web2py
> receiving it), developers should always find a way to spend the least
> possible time on those steps.
> Optimizing queries (and/or database tuning, normalization, etc.). Reducing
> the number of queries needed to render a page. Requesting just the amount
> of data needed (paging). Those are HUUUUGE topics (again, zillions of
> posts, books, years of expertise to master, etc.). But - of course - not
> having to issue a query at all short-circuits all of the above!
> Still at step #0: as users come to your app, every request made to those
> pages triggers the roundtrip to the database, back and forth, always for
> the same data, over and over.
> Granted, 50 reqs/sec certainly won't hurt performance, but once they get
> to 500, it'll become pretty obvious that "a" short-circuit could save LOTS
> of processing power.
> When you face the problem of scaling to serve more concurrent requests,
> you do it either by spawning more processes or by adding servers.
> Adding frontend servers is easy: the data is transactionally consistent as
> long as you have a single database instance. You put a load balancer in
> front of the frontends (it's relatively inexpensive) and go on.
> Scaling databases by adding servers is NEVER easy (again, the interwebs and
> libraries are full of evidence, and a big part of nosql's "shiny" features
> is indeed horizontal scaling, with pros and cons).
>
> Step #1: local caching
>
> Back to your app without cache... wouldn't it be better to avoid calling
> the db for the same data 500 times per second? Sure. Cache it.
>
> Assuming you cache the database results, web2py still needs to compute the
> view, but that step, compared with the short-circuit, is orders of
> magnitude less expensive. (Yes, if you cache views you're sidestepping
> web2py's rendering too, but let's keep as few variables as possible for the
> sake of this discussion.)
> And there you are, at the first iteration of step #1, using 1MB more RAM
> to avoid hitting the database.
> Cache it for just an hour, do the math on the simple example of 50 req/s,
> and you have saved 50*60*60 - 1 = 179,999 roundtrips. You can use the
> savings to do the 179,999 roundtrips you actually NEED in other places of
> your app, keeping the same performance without additional costs.
> Whoa!
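The step #1 pattern - check a local store first, fall back to the real query, remember the result for `time_expire` seconds - can be sketched in plain Python. This toy class only mimics the call shape of web2py's `cache.ram(key, callable, time_expire)`; the class, the key, and the fake query are invented for illustration:

```python
import time

class RamCache:
    """Toy in-process cache mimicking the call shape of web2py's
    cache.ram(key, callable, time_expire). For illustration only."""

    def __init__(self):
        self.storage = {}  # key -> (timestamp, value)

    def __call__(self, key, f, time_expire=300):
        now = time.time()
        item = self.storage.get(key)
        if item is not None and now - item[0] < time_expire:
            return item[1]  # still fresh: no roundtrip
        value = f()  # cache miss (or expired): do the expensive work
        self.storage[key] = (now, value)
        return value

cache = RamCache()
db_roundtrips = []  # counts how often the "database" is actually hit

def expensive_query():
    db_roundtrips.append(1)
    return [("LA", 21.5)]  # pretend result set

# 50 req/s for an hour = 180,000 requests; only the first hits the "db".
for _ in range(180_000):
    rows = cache("la_temps", expensive_query, time_expire=3600)

print(len(db_roundtrips))            # -> 1
print(180_000 - len(db_roundtrips))  # -> 179999 roundtrips saved
```

In actual web2py code you rarely need to write this yourself: passing `cache=(cache.ram, 3600), cacheable=True` to `db(query).select(...)` caches the result set for an hour using the built-in mechanism.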
Step 1 to 300, where is step 2?? Just kidding, nice read, thanks Simone... :)

> Step #300
>
> You start caching here and there, and you use 500MB of RAM. You're using
> cache.ram; everything is super-speedy, no third parties, just web2py
> features.
> Now you need to serve 100 req/s, so you spawn another process... whoopsie
> ... 1GB of RAM. Or another server: 500MB on the first and 500MB on the
> second... 500 are "clearly" wasted, as they are a copy of the "original"
> 500.
> And the second process (or server) still needs to do roundtrips if its
> local cache doesn't contain your already-cached-in-another-place query.
> Also, something else "creeps in"... as your app grows, you start losing
> track of what you cached, when you cached it, and how long it needs to be
> cached... a record fetched on the first server at 8:00AM could be updated
> in the meantime and fetched on the second server (because it isn't in its
> local cache) at 8:02AM... you're effectively serving different versions
> from cache!
>
> Step #301
>
> To sidestep both issues, you use redis or memcached: they sit outside of
> the web2py process and consume only 500MB. For one, two, a zillion
> processes. And they are a single source of truth. And they are as speedy
> as cache.ram (or at least of the same order of magnitude).
>
> On Wednesday, April 27, 2016 at 2:04:07 PM UTC+2, Anthony wrote:
>>
>> On Wednesday, April 27, 2016 at 7:00:53 AM UTC-4, Pierre wrote:
>>>
>>> I'm impressed, Anthony...
>>>
>>> Well, all of these - memcache, redis - seem to require lots of
>>> technicality and probably setting up your own deployment machine. I am
>>> not very enthusiastic about that option, since the internet is full of
>>> endless technical setup/config issues. Given what's being said, I see
>>> only two kinds of page of my app I could cache: general information and
>>> maybe forms (no db().select()), all shared and uniform data.
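For what it's worth, on the "lots of technicality" worry: the step #301 wiring on the web2py side is smaller than it sounds. The following is a sketch of the recipe as I recall it from the web2py book, not verified against every release (module and argument names may differ in older versions), and it assumes a Redis server already running on localhost:6379:

```python
# In a web2py model file, e.g. models/0_cache.py (sketch only; check
# gluon.contrib for the exact names shipped with your web2py version).
from gluon.contrib.redis_utils import RConn
from gluon.contrib.redis_cache import RedisCache

rconn = RConn()  # defaults to localhost:6379
# A shared, out-of-process cache: every worker process (or frontend
# server) now reads from the same single source of truth.
cache.redis = RedisCache(redis_conn=rconn, debug=False)

# From here on, use cache.redis wherever you would have used cache.ram,
# e.g.: rows = db(query).select(cache=(cache.redis, 3600), cacheable=True)
```

Running the Redis server itself is a separate (OS-level) task, but on most platforms it is a single package install with sane defaults.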
>>
>> I'm not sure what you mean by caching forms, but you probably don't want
>> to do that (at least if we're talking about web2py forms, which each
>> include a unique hidden _formkey field to protect against CSRF attacks).
>>
>>> There should be a simple way to achieve such a simple thing whatever the
>>> platform: pythonanywhere, etc.... Is there one?
>>
>> You can just use cache.ram. If running uWSGI with multiple processes, you
>> will have a separate cache for each, but that won't necessarily be a
>> problem (just not as efficient as it could be). You could also try
>> cache.disk and do some testing to see how it impacts performance.
>>
>> More generally, caching is something you do to improve efficiency, which
>> becomes important as you start to have lots of traffic. But if you've got
>> enough traffic that efficiency becomes so important, you should probably
>> be willing (and hopefully able) to put in some extra effort to set up
>> something like Memcached or Redis. Until you hit that point, don't worry
>> about it.
>>
>> Anthony

--
Resources:
- http://web2py.com
- http://web2py.com/book (Documentation)
- http://github.com/web2py/web2py (Source code)
- https://code.google.com/p/web2py/issues/list (Report Issues)
---
You received this message because you are subscribed to the Google Groups
"web2py-users" group.
To unsubscribe from this group and stop receiving emails from it, send an
email to web2py+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.