Re: Reasons for using gen_server to gather statistics with folsom

Russell Brown Mon, 13 Aug 2012 01:26:59 -0700

Hi Sergey,
First, sorry for missing your first post. I just didn't see it.

I'll try and answer your questions.

> 1. Why separate gen_servers (riak_api_stat, riak_core_stat, 
> riak_kv_stat) were used to gather statistics instead of the direct calls to 
> folsom_metrics through some more high-level api?

Time. A high-level API will be provided by riak-core, or folsom, soon. The 
idea is that you declaratively register stats, riak-core starts one or more 
processes for you, and you just use the API to update stats. I started work 
on this structure but didn't finish it in time for 1.2. It is what I am 
working on next, and I'll keep you posted. If you follow the existing model, 
porting to the new API will hopefully be relatively simple. Sorry for not 
getting it solidified sooner.

The reason for the gen_servers, of course, is to cast the calls to folsom 
rather than block on ets during critical Riak operations like writing and 
reading data. There are a number of table ownership/crashing issues in 
folsom, as well as a couple of race conditions. I'll be working with Joe 
Williams of Boundary to resolve these and refactor folsom as part of my 
ongoing stats work for Riak. Watch that repo to keep informed.
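
To illustrate the cast approach, here is a minimal sketch. It is not the 
actual Riak code: the module name my_app_stat, the update/1 function and the 
metric names are made up for the example, and it assumes the folsom 
application is already started.

```erlang
%% my_app_stat is a hypothetical module, not part of Riak.
-module(my_app_stat).
-behaviour(gen_server).

-export([start_link/0, update/1]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2,
         terminate/2, code_change/3]).

start_link() ->
    gen_server:start_link({local, ?MODULE}, ?MODULE, [], []).

%% Fire-and-forget: the caller only sends a message and carries on,
%% so it never waits on folsom or ets.
update(Stat) ->
    gen_server:cast(?MODULE, {update, Stat}).

init([]) ->
    %% Assumed metric names. Return values are deliberately ignored,
    %% because the metrics may already exist if this process restarts.
    folsom_metrics:new_counter({my_app, puts}),
    folsom_metrics:new_histogram({my_app, put_time}),
    {ok, nostate}.

handle_call(_Request, _From, State) ->
    {reply, ok, State}.

%% All folsom calls happen here, in the stat process.
handle_cast({update, {put, Micros}}, State) ->
    folsom_metrics:notify({{my_app, puts}, {inc, 1}}),
    folsom_metrics:notify({{my_app, put_time}, Micros}),
    {noreply, State};
handle_cast(_Msg, State) ->
    {noreply, State}.

handle_info(_Info, State) -> {noreply, State}.
terminate(_Reason, _State) -> ok.
code_change(_OldVsn, State, _Extra) -> {ok, State}.
```

The write path would then call something like my_app_stat:update({put, 1250}) 
and move on. If folsom crashes or is slow, only the stat process is affected.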

> 2. What is the purpose of riak_core_stat_cache and what it is intended 
> to do?

Calculating the histograms for stats is expensive, especially when there are 
a lot of readings. In some cases it can take a few seconds to calculate stats 
for some metrics on a busy node. The cache is there for two reasons:

1. To only have one process calculating stats at a time. If multiple calls to 
   get stats happen at once, one process actually calculates and the rest are 
   parked and notified when the answer comes.
2. To actually cache the results so they're not calculated more often than 
   needed.

There are stats gathered on how long it takes to calculate stats, and the 
idea was for the mean time to calculate stats for an application to be the 
cache TTL. That is work still to be done.
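
The two roles of the cache can be sketched roughly as below. This is an 
illustration only, not riak_core_stat_cache itself: the module name 
stat_cache_sketch, the fixed TTL argument and the CalcFun callback are all 
assumptions made for the example.

```erlang
%% stat_cache_sketch is a hypothetical module, not riak_core_stat_cache.
-module(stat_cache_sketch).
-behaviour(gen_server).

-export([start_link/2, get_stats/0]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2,
         terminate/2, code_change/3]).

-record(state, {calc_fun, ttl_ms, cache = none, busy = false, waiters = []}).

%% CalcFun is a zero-arity fun doing the expensive calculation.
start_link(CalcFun, TTLms) ->
    gen_server:start_link({local, ?MODULE}, ?MODULE, {CalcFun, TTLms}, []).

get_stats() ->
    gen_server:call(?MODULE, get_stats, infinity).

init({CalcFun, TTLms}) ->
    {ok, #state{calc_fun = CalcFun, ttl_ms = TTLms}}.

handle_call(get_stats, From,
            State = #state{cache = Cache, busy = Busy, waiters = Ws}) ->
    case fresh(Cache) of
        {true, Stats} ->
            %% Reason 2: serve the cached result while it is fresh.
            {reply, Stats, State};
        false when Busy ->
            %% Reason 1: a calculation is already running, park the caller.
            {noreply, State#state{waiters = [From | Ws]}};
        false ->
            %% Nobody is calculating: start one worker and park the caller.
            %% The worker is linked, so if it crashes the cache crashes too.
            Server = self(),
            Calc = State#state.calc_fun,
            spawn_link(fun() -> Server ! {calculated, Calc()} end),
            {noreply, State#state{busy = true, waiters = [From | Ws]}}
    end.

handle_cast(_Msg, State) ->
    {noreply, State}.

%% The worker finished: notify every parked caller and cache the result.
handle_info({calculated, Stats}, State = #state{waiters = Ws, ttl_ms = TTL}) ->
    lists:foreach(fun(W) -> gen_server:reply(W, Stats) end, Ws),
    {noreply, State#state{cache = {Stats, now_ms() + TTL},
                          busy = false, waiters = []}};
handle_info(_Info, State) ->
    {noreply, State}.

terminate(_Reason, _State) -> ok.
code_change(_OldVsn, State, _Extra) -> {ok, State}.

fresh({Stats, ExpiresAt}) ->
    case now_ms() < ExpiresAt of
        true -> {true, Stats};
        false -> false
    end;
fresh(none) ->
    false.

now_ms() ->
    erlang:monotonic_time(millisecond).
```

Here the TTL is passed in by hand. Using the measured mean calculation time 
as the TTL would replace that argument.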

But in many ways the cache is there to support backwards compatibility for 
Riak's /stats endpoint and the riak-admin commands. In future I'd rather expose 
the folsom stats directly over REST and CLI so you can request only the stat 
you want and not waste time calculating a load of stats you're not interested 
in. This is the next, next thing I'll be working on. 

> As far as I understand riak_core_stat_cache caches stats using ets, so I’m 
> wondering why statistics that is stored in ets is cached using ets?

As for why stats that are already in ets get cached in ets: folsom stores the 
raw readings in ets, while the cache holds groups of stats that have already 
had the _expensive_ calculations run on them.

> Is it correct that calls to folsom_metrics are done via gen_server to 
> decrease the possibility of losing ets tables that are bound to a concrete 
> process?

Really, calls are done via gen_server so that calls to folsom are cast. 
Originally the code called folsom directly in-process, but benchmarking 
showed this to be slower, and more damaging in the case of an error/crash in 
folsom. I mentioned the ets ownership/crashing issues above; there is an 
example of one here[1]. I'm going to work on refactoring folsom to have a 
more coherent strategy for table ownership.

I hope this helps. If I've missed anything, please ask. The short-term aim 
was to stabilise stats in Riak and fix known issues, and I think I 
accomplished that. Next is to better structure the code so that riak-core 
provides a stats service.

Cheers

Russell

[1] https://github.com/boundary/folsom/issues/30

On 13 Aug 2012, at 09:54, Zhemzhitsky Sergey wrote:

> Hi guys,
>  
> Any updates on these questions?
>  
> I’ve read the following blog entry 
> http://basho.com/blog/technical/2012/07/02/folsom-backed-stats-riak-1-2/ 
> and still haven’t found the answers.
>  
> As far as I understand riak_core_stat_cache caches stats using ets, so I’m 
> wondering why statistics that is stored in ets is cached using ets?
> Is it correct that calls to folsom_metrics are done via gen_server to 
> decrease the possibility of losing ets tables that are bound to a concrete 
> process?
>  
>  
> Best Regards,
> Sergey
>  
> From: riak-users-boun...@lists.basho.com 
> [mailto:riak-users-boun...@lists.basho.com] On Behalf Of Zhemzhitsky Sergey
> Sent: Friday, August 10, 2012 6:33 PM
> To: riak-users@lists.basho.com
> Subject: Reasons for using gen_server to gather statistics with folsom
>  
> Hi riak gurus,
>  
> Recently riak 1.2 has been released that uses folsom library to gather 
> statistics.
>  
> I’d like to use the same library (folsom) in my application so could you 
> answer the following questions:
>  
> 1. Why separate gen_servers (riak_api_stat, riak_core_stat, 
> riak_kv_stat) were used to gather statistics instead of the direct calls to 
> folsom_metrics through some more high-level api?
> 2. What is the purpose of riak_core_stat_cache and what it is intended 
> to do?
>  
>  
> Best Regards,
> Sergey
>  
> _______________________________________________
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
