Hi Luke.

I really appreciate your efforts to reproduce the problem. I think the
configs are right. I have also been running a lot of tests and, with
1 server/node, the memory bucket works flawlessly, as in your test. The
Riak cluster where we have the problem uses a multi_backend with 1 memory
backend, 2 bitcask backends and 2 leveldb backends. In our production code
I changed only the connection parameter of the memory backend so that it
points to a new "cluster" with just 1 node, with the same Riak config but
with only 1 memory backend under the multi configuration and, as I said,
all is fine: the problem vanished. I deduce that the problem appears only
with more than 1 node and under a lot of requests.

In my tests with the production cluster that has the problem (4 nodes), I
finally realized that the TTL is working but, randomly and suddenly, keys
that were already deleted reappear, and keys still within their TTL
disappear :-? (Maybe something related to some internal ETS table?) This
is the moment when I can obtain keys that have already expired.

In summary:

- With the 4-node cluster (config below): all OK for a while, and then
suddenly we lose approximately the last 20 seconds of keys and OLD keys
appear in the list returned by:
curl -X GET http://localhost:8098/buckets/ttl_stg/keys?keys=true

buckets.default.last_write_wins = true
bitcask.io_mode = erlang
multi_backend.ttl_stg.storage_backend = memory
multi_backend.ttl_stg.memory_backend.ttl = 90s
multi_backend.ttl_stg.memory_backend.max_memory_per_vnode = 25MB
anti_entropy = passive
ring_size = 256

- With 1 node: All OK

buckets.default.n_val = 1
buckets.default.last_write_wins = true
buckets.default.r = 1
buckets.default.w = 1
multi_backend.ttl_stg.storage_backend = memory
multi_backend.ttl_stg.memory_backend.ttl = 90s
multi_backend.ttl_stg.memory_backend.max_memory_per_vnode = 250MB
ring_size = 16
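
One way to pin down the moment when old keys come back would be to compare
consecutive key listings. This is only a minimal sketch: it assumes the
listings were collected by polling the list-keys URL above at a fixed
interval and that each key is written only once. The diff_snapshots helper
is hypothetical (not part of Riak). Normal TTL expirations also show up
under "vanished"; the suspicious case is "reappeared".

```python
def diff_snapshots(snapshots):
    """Compare consecutive key listings.

    snapshots: list of (timestamp_seconds, iterable_of_keys), in time order.
    Returns a list of (timestamp, vanished_keys, reappeared_keys) for every
    snapshot where something changed relative to the previous one.
    """
    events = []
    gone = set()      # keys that were listed earlier and later went missing
    previous = None
    for ts, keys in snapshots:
        keys = set(keys)
        if previous is not None:
            vanished = previous - keys      # listed last time, missing now
            reappeared = keys & gone        # missing earlier, listed again
            if vanished or reappeared:
                events.append((ts, sorted(vanished), sorted(reappeared)))
            gone |= vanished
            gone -= reappeared
        previous = keys
    return events


# Hypothetical listings taken 10 seconds apart (not real cluster data).
snaps = [(0, {"a", "b"}), (10, {"b"}), (20, {"a", "b", "c"})]
events = diff_snapshots(snaps)
```

With a 90s TTL and write-once keys, any entry in the "reappeared" column
would correspond to the symptom described for the 4-node cluster.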



Another note: with this 1 node (32GB RAM) and only the memory backend
activated, I have noticed that memory consumption grows without control:


# riak-admin  status|grep memory
memory_total : 17323130960
memory_processes : 235043016
memory_processes_used : 233078456
memory_system : 17088087944
memory_atom : 561761
memory_atom_used : 561127
memory_binary : 6737787976
memory_code : 14370908
memory_ets : 10295224544

# riak-admin diag -d debug
[debug] Local RPC: os:getpid([]) [5000]
[debug] Running shell command: ps -o pmem,rss -p 17521
[debug] Shell command output:
%MEM   RSS
60.5 19863800

Wow, 18.9GB when max_memory_per_vnode = 250MB. That is far from the
expected value, 250MB * 16 vnodes = 4000MB. Is that correct?
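
For reference, the arithmetic behind those numbers can be written out as
below. This is just a sketch using the values pasted above. It assumes that
ps reports RSS in units of 1024 bytes and that "MB" in the config means
1024*1024 bytes; whether max_memory_per_vnode is meant to cap all of the
backend's tables is something I do not know.

```python
MIB = 1024 * 1024

max_memory_per_vnode_mb = 250     # from the 1-node config above
ring_size = 16                    # all 16 vnodes live on the single node
rss_kb = 19863800                 # RSS column of the ps output, in KiB
memory_ets_bytes = 10295224544    # memory_ets from riak-admin status

expected_cap_mb = max_memory_per_vnode_mb * ring_size   # 250 * 16
rss_gib = rss_kb / MIB                                  # KiB -> GiB
ets_mib = memory_ets_bytes / MIB                        # bytes -> MiB

print(f"expected cap : {expected_cap_mb} MB")
print(f"process RSS  : {rss_gib:.1f} GiB")
print(f"ETS memory   : {ets_mib:.0f} MiB "
      f"({ets_mib / expected_cap_mb:.2f}x the expected cap)")
```

So ETS alone is well above the 4000MB that the setting would suggest, and
the rest of the RSS is mostly binaries (memory_binary).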

This is the riak-admin vnode-status output for 1 vnode; the other 15 show
similar data:

VNode: 1370157784997721485815954530671515330927436759040
Backend: riak_kv_multi_backend
Status:
[{<<"ttl_stg">>,
  [{mod,riak_kv_memory_backend},
   {data_table_status,[{compressed,false},
                       {memory,1156673},
                       {owner,<8343.9466.104>},
                       {heir,none},
                       {name,riak_kv_1370157784997721485815954530671515330927436759040},
                       {size,29656},
                       {node,'riak@xxxxxxxx'},
                       {named_table,false},
                       {type,ordered_set},
                       {keypos,1},
                       {protection,protected}]},
   {index_table_status,[{compressed,false},
                        {memory,89},
                        {owner,<8343.9466.104>},
                        {heir,none},
                        {name,riak_kv_1370157784997721485815954530671515330927436759040_i},
                        {size,0},
                        {node,'riak@xxxxxxxxx'},
                        {named_table,false},
                        {type,ordered_set},
                        {keypos,1},
                        {protection,protected}]},
   {time_table_status,[{compressed,false},
                       {memory,75968936},
                       {owner,<8343.9466.104>},
                       {heir,none},
                       {name,riak_kv_1370157784997721485815954530671515330927436759040_t},
                       {size,2813661},
                       {node,'riak@xxxxxxxxx'},
                       {named_table,false},
                       {type,ordered_set},
                       {keypos,1},
                       {protection,protected}]}]}]
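
To relate this status to the memory_ets figure, the per-table numbers can be
converted to bytes. This is a rough sketch under two assumptions that I have
not verified against the backend code: that the "memory" values come from
ets:info and are therefore in machine words, and that the VM is 64-bit
(8 bytes per word). It also assumes all 16 vnodes look like this one.

```python
WORD_BYTES = 8   # assumption: 64-bit Erlang VM, ETS memory reported in words
VNODES = 16      # ring_size of the 1-node test cluster

# "memory" and "size" values from the vnode-status output above
data_words, data_entries = 1156673, 29656
index_words = 89
time_words, time_entries = 75968936, 2813661

memory_ets_bytes = 10295224544   # memory_ets from riak-admin status

per_vnode_bytes = (data_words + index_words + time_words) * WORD_BYTES
all_vnodes_bytes = per_vnode_bytes * VNODES
time_share = time_words / (data_words + index_words + time_words)

print(f"per vnode          : {per_vnode_bytes / 2**20:.0f} MiB")
print(f"all 16 vnodes      : {all_vnodes_bytes / 2**30:.2f} GiB")
print(f"share of memory_ets: {all_vnodes_bytes / memory_ets_bytes:.0%}")
print(f"time table share   : {time_share:.1%}")
print(f"time/data entries  : {time_entries / data_entries:.0f}x")
```

If those assumptions hold, almost all of the ETS memory sits in the time
table, which has far more entries than the data table. That suggests, but
does not prove, that the time table is not being trimmed when keys expire.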

Thanks!

2014-10-13 22:30 GMT+02:00 Luke Bakken <lbak...@basho.com>:

> Hi Lucas,
>
> I've tried reproducing this using a local Riak 2.0.1 node, however TTL
> is working as expected.
>
> Here is the configuration I have in /etc/riak/riak.conf:
>
> storage_backend = multi
> multi_backend.default = bc_default
>
> multi_backend.ttl_stg.storage_backend = memory
> multi_backend.ttl_stg.memory_backend.ttl = 90s
> multi_backend.ttl_stg.memory_backend.max_memory_per_vnode = 4MB
>
> multi_backend.bc_default.storage_backend = bitcask
> multi_backend.bc_default.bitcask.data_root = /var/lib/riak/bc_default
> multi_backend.bc_default.bitcask.io_mode = erlang
>
> This translates to the following in
> /var/lib/riak/generated.configs/app.2014.10.13.13.13.29.config:
>
> {multi_backend_default,<<"bc_default">>},
> {multi_backend,
>     [{<<"ttl_stg">>,riak_kv_memory_backend,[{ttl,90},{max_memory,4}]},
>     {<<"bc_default">>,riak_kv_bitcask_backend,
>     [{io_mode,erlang},
>         {expiry_grace_time,0},
>         {small_file_threshold,10485760},
>         {dead_bytes_threshold,134217728},
>         {frag_threshold,40},
>         {dead_bytes_merge_trigger,536870912},
>         {frag_merge_trigger,60},
>         {max_file_size,2147483648},
>         {open_timeout,4},
>         {data_root,"/var/lib/riak/bc_default"},
>         {sync_strategy,none},
>         {merge_window,always},
>         {max_fold_age,-1},
>         {max_fold_puts,0},
>         {expiry_secs,-1},
>         {require_hint_crc,true}]}]}]},
>
> I set the bucket properties to use the ttl_stg backend:
>
> root@UBUNTU-12-1:~# cat ttl_stg-props.json
> {"props":{"name":"ttl_stg","backend":"ttl_stg"}}
>
> root@UBUNTU-12-1:~# curl -XPUT -H'Content-type: application/json' localhost:8098/buckets/ttl_stg/props --data-ascii @ttl_stg-props.json
>
> root@UBUNTU-12-1:~# curl -XGET localhost:8098/buckets/ttl_stg/props
>
> {"props":{"allow_mult":false,"backend":"ttl_stg","basic_quorum":false,"big_vclock":50,"chash_keyfun":{"mod":"riak_core_util","fun":"chash_std_keyfun"},"dvv_enabled":false,"dw":"quorum","last_write_wins":false,"linkfun":{"mod":"riak_kv_wm_link_walker","fun":"mapreduce_linkfun"},"n_val":3,"name":"ttl_stg","notfound_ok":true,"old_vclock":86400,"postcommit":[],"pr":0,"precommit":[],"pw":0,"r":"quorum","rw":"quorum","small_vclock":50,"w":"quorum","young_vclock":20}}
>
>
> And used the following statement to PUT test data:
>
> curl -XPUT localhost:8098/buckets/ttl_stg/keys/1 -d "TEST $(date)"
>
> After 90 seconds, this is the response I get from Riak:
>
> root@UBUNTU-12-1:~# curl -XGET localhost:8098/buckets/ttl_stg/keys/1
> not found
>
> I would carefully check all of the app.config / riak.conf files in
> your cluster, the output of "riak config effective" and the bucket
> properties for those buckets you expect to be using the memory backend
> with TTL. I also recommend using the localhost:8098/buckets/ endpoint
> instead of the deprecated riak/ endpoint.
>
> Please let me know if you have additional questions.
> --
> Luke Bakken
> Engineer / CSE
> lbak...@basho.com
>
>
> On Fri, Oct 3, 2014 at 11:32 AM, Lucas Grijander
> <lucasgrinjande...@gmail.com> wrote:
> > Hello,
> >
> > I have a memory backend in production with Riak 2.0.1, 4 servers and 256
> > vnodes. The servers have the same date and time.
> >
> > I have seen odd behavior with the TTL.
> >
> > This is the config:
> >
> >            {<<"ttl_stg">>,riak_kv_memory_backend,
> >             [{ttl,90},{max_memory,25}]},
> >
> > For example, see this GET response in one of the riak servers:
> >
> > < HTTP/1.1 200 OK
> > < X-Riak-Vclock: a85hYGBgzGDKBVIc4otdfgR/7bfIYEpkzGNlKI1efJYvCwA=
> > < Vary: Accept-Encoding
> > * Server MochiWeb/1.1 WebMachine/1.10.5 (jokes are better explained) is not blacklisted
> > < Server: MochiWeb/1.1 WebMachine/1.10.5 (jokes are better explained)
> > < Link: </riak/ttl_stg>; rel="up"
> > < Last-Modified: Fri, 03 Oct 2014 17:40:05 GMT
> > < ETag: "3c8bGoifWcOCSVn0otD5nI"
> > < Date: Fri, 03 Oct 2014 17:47:50 GMT
> > < Content-Type: application/json
> > < Content-Length: 17
> >
> > If the TTL is 90 seconds, why doesn't the GET return "not found" when the
> > difference between "Last-Modified" and "Date" (of the curl request) is
> > greater than the TTL?
> >
> > Thanks in advance!
> >
> >
> > _______________________________________________
> > riak-users mailing list
> > riak-users@lists.basho.com
> > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
> >
>