LevelDB parameter planning - max_open_files
Can someone please help me understand the formula for open_file_memory on this page: http://docs.basho.com/riak/latest/ops/advanced/backends/leveldb/#Parameter-Planning

1. It is missing some brackets; the correct formula appears to be:

   OPEN_FILE_MEMORY = (max_open_files - 10) * (184 + (average_sst_filesize / 2048) * (8 + ((key_size + value_size) / 2048 + 1) * 0.6))

2. How do I estimate average_sst_filesize?

3. Does the result estimate the memory used by a single open file in any particular vnode, or by a single vnode with max_open_files files open? Since max_open_files is a per-vnode parameter, how do I estimate the maximum memory used by LevelDB if all vnodes have all max_open_files open: is it result * ring_size or result * ring_size * max_open_files?

Thanks!

-- Oleksiy

___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
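Written as code, the corrected formula is easy to sanity-check with concrete numbers. A minimal sketch; the function name and the sample inputs below are mine, not from the Basho docs, and the result is bytes for one vnode under the formula as corrected above:

```javascript
// Hedged sketch: the corrected per-vnode open-file memory estimate from the
// post above, as a function. Names and sample inputs are mine, not Basho's.
function openFileMemory(maxOpenFiles, avgSstFileSize, keySize, valueSize) {
    return (maxOpenFiles - 10) *
        (184 + (avgSstFileSize / 2048) *
            (8 + ((keySize + valueSize) / 2048 + 1) * 0.6));
}

// Example: 315 open files, 30 MB average .sst file, 40-byte keys, 1 KB values
var perVnode = openFileMemory(315, 30 * 1024 * 1024, 40, 1024);
console.log((perVnode / (1024 * 1024)).toFixed(1) + ' MB'); // "39.9 MB" here
```

If the per-vnode interpretation is right, a node's worst case would be roughly this value times the number of vnodes hosted on that node (ring_size divided by the node count), but the thread leaves that question open.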
Bitcask: Hintfile is invalid
Hi! I'm getting quite a lot of errors like this:

2014-04-16 16:45:46.838 [error] <0.2110.0> Hintfile './data/fs_chunks/1370157784997721485815954530671515330927436759040/3.bitcask.hint' invalid

This is running riak_2.0.0_pre20. What can be the reason, and does it mean my data is corrupted?
Re: Bitcask: Hintfile is invalid
Thank you!

On 16 April 2014 17:09, Brian Sparrow wrote:
> Hello Oleksiy,
>
> Hintfiles can become invalid for a variety of reasons, but they are not a required component for normal Riak operation. Hintfiles are used to hasten bitcask startup and to optimize some fold operations. Hintfiles are created during bitcask merge, so after the associated data files (e.g. 3.bitcask.data) are merged these errors will go away.
>
> This log message should be a notice or warning level rather than error, as it is inappropriately alarming. I've filed an issue here [0].
>
> -Brian
>
> [0] https://github.com/basho/bitcask/issues/164

-- Oleksiy Krivoshey
ykGetIndex: Encountered unknown message tag
Hi! I'm trying to update Yokozuna code (JavaScript, protocol buffers) that worked with pre11 to work with pre20, and I'm getting the following response when issuing RpbYokozunaIndexGetReq:

error: undefined
reply: { index: [ [Error: Encountered unknown message tag] ] }

The first problem is that an error is returned in place of 'index'; the second is the error itself. What does it mean?

-- Oleksiy
Re: ykGetIndex: Encountered unknown message tag
The same error happens when using the '_yz_default' schema with the index.

-- Oleksiy Krivoshey
Re: ykGetIndex: Encountered unknown message tag
Never mind, this was an error in the underlying JavaScript protobuf implementation.

-- Oleksiy Krivoshey
failed get after successful put
Hi guys,

can someone please suggest what could cause a 'get' immediately following a successful 'put' to fail?

I'm running a fully connected, healthy, 5-node Riak 2.0-beta1 cluster. I'm using the multi-backend feature, so the order of operations is:

1. 'SetBucket' for a new bucket with a backend name. Wait for a successful reply.
2. 'Put' an object to the bucket. Wait for a successful reply (return_head: true, so I get the clock, vtag, etc. back).
3. 'Get' the object from step 2. Returns an empty response (missing object).

If I repeat the 'Get' a few moments later, I retrieve the object successfully.

CAP options are all defaults (n: 3, etc.). This happens with both custom bitcask and custom leveldb backends. There are no errors in the Riak logs, but a lot of the following:

2014-04-30 21:41:05.656 [info] <0.95.0>@riak_core_sysmon_handler:handle_event:92 monitor large_heap <0.10689.74> [{initial_call,{erlang,apply,2}},{almost_current_function,{eval_bits,expr_grp,4}},{message_queue_len,0}] [{old_heap_block_size,0},{heap_block_size,79467343},{mbuf_size,0},{stack_size,394},{old_heap_size,0},{heap_size,28858704}]

-- Oleksiy
Re: failed get after successful put
The objects are really small (50-500 bytes).

It doesn't happen if the bucket was already created. It also doesn't happen if I don't call SetBucket at all (so using the default backend and options). And it seems it doesn't happen if I call SetBucket but don't set the `backend` property.

On 1 May 2014 00:51, Evan Vigil-McClanahan wrote:
> Does this continue if the bucket hasn't been created recently? Does it matter how large the object is? Is it particularly large in this case?

-- Oleksiy Krivoshey
Re: failed get after successful put
Yes, it really looks like the above scenario. It started happening when I had no more than 20 custom buckets. Is that too many? What's the suggested limit? Will there be other issues when using a lot of buckets, or is it only during the first gossip period?

On 1 May 2014 01:12, Evan Vigil-McClanahan wrote:
> I suspect that, given the large_heap messages, you're seeing the known issue where, when a custom bucket is created, moving that metadata around the ring takes increasingly long. Creating a large number of custom buckets isn't recommended at the moment.
>
> The process likely looks like this:
>
> 1. You do the set.
> 2. Bucket metadata starts being gossiped around the ring.
> 3. You do the put. It succeeds, but on some nodes the metadata hasn't been committed, so the object is put into the default backend.
> 4. Gossip finishes.
> 5. You do the get. It fails because the bucket's switch to the new backend has made 2 replicas unreachable.
> 6. Read repair happens, repopulating the 2 missing replicas.
> 7. You re-get and it works.

-- Oleksiy Krivoshey
Re: failed get after successful put
Thanks! Using bucket types solved the problem.

On 1 May 2014 01:44, Evan Vigil-McClanahan wrote:
> The 2.0 solution to this is to create a bucket type (http://docs.basho.com/riak/2.0.0beta1/dev/advanced/bucket-types/). Once the type has been created pointing at that particular backend, any buckets of that type created later will inherit that setting, so you should be able to create as many of them as you like.

-- Oleksiy Krivoshey
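Before bucket types were available, the usual client-side mitigation for the race Evan describes was to retry the read briefly so read repair can repopulate the replicas. A minimal sketch, assuming a hypothetical client object with a get(bucket, key, callback) method; all names here are illustrative:

```javascript
// Hedged sketch: retry a read a few times with a short delay, to ride out
// the window where bucket metadata is still being gossiped. `client` is a
// hypothetical Riak client with get(bucket, key, callback(err, obj)).
function getWithRetry(client, bucket, key, attempts, delayMs, callback) {
    client.get(bucket, key, function (err, obj) {
        if (!err && obj) { return callback(null, obj); }          // found it
        if (attempts <= 1) { return callback(err || new Error('not found')); }
        setTimeout(function () {                                  // wait, retry
            getWithRetry(client, bucket, key, attempts - 1, delayMs, callback);
        }, delayMs);
    });
}
```

This only masks the symptom during the gossip window; the bucket-type fix Evan gives above removes the race itself.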
Map-reduce + bucket type
Hi! Can someone please suggest how to run a map/reduce job over all keys in a bucket with a custom bucket type?

-- Oleksiy Krivoshey
Re: Map-reduce + bucket type
Thank you!

On 3 May 2014 17:25, Brian Roach wrote:
> riak-users@lists.basho.com

-- Oleksiy Krivoshey
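Brian's quoted reply was not preserved in this archive, but the riak_pipe_fitting thread later in this digest shows the input shape that worked: pass the MapReduce inputs as a two-element [type, bucket] array rather than a plain bucket name. A hedged sketch of such a request, with illustrative type, bucket, module, and function names:

```javascript
// Hedged sketch: a MapReduce request over every key in a bucket that lives
// under a custom bucket type. The two-element inputs array names the bucket
// type first, then the bucket; module/function names are illustrative.
var request = {
    inputs: ['my_type', 'my_bucket'],   // [bucket_type, bucket]
    query: [{
        map: {
            module: 'my_module',        // Erlang module deployed on the nodes
            function: 'map_keys',       // illustrative map fun of arity 3
            language: 'erlang',
            keep: true
        }
    }]
};
```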
Losing data with bucket types, custom backend and active anti-entropy
Hi,

I have a quite rare problem of lost data in Riak 2.0 beta1. I can hardly reproduce it, but when it happens it looks like this order of operations (all operations use bucket types):

1. Write some data (KEY1) - ok
2. Read that data (KEY1) - ok
3. A message appears in the Riak console.log:
   2014-05-21 08:15:03.328 [info] <0.15793.157>@riak_kv_exchange_fsm:key_exchange:256 Repaired 1 keys during active anti-entropy exchange of {1450083655789255239155218544960687058564870569984,3} between {0,'riak@10.0.1.1'} and {1450083655789255239155218544960687058564870569984,'riak@10.0.1.3'}
4. Read data (KEY1) - ok
5. Write new data (KEY1) - ok
6. Read data (KEY1) - no such key

All of this happens within 10-20 seconds. Can someone give any hint on this?

Running riak2.0.0_beta1 on Ubuntu 14.04.

Bucket type:

{"props":{"backend":"fs_chunks","allow_mult":"false"}}

fs_chunks backend:

{<<"fs_chunks">>, riak_kv_bitcask_backend, [
    {data_root, "/var/lib/riak/fs_chunks"}
]}

Thanks!

-- Oleksiy Krivoshey
Re: Losing data with bucket types, custom backend and active anti-entropy
I think it's a different issue and might be my own misunderstanding. The actual order of operations is (all on the same key):

1. write
2. read
3. delete
4. write
5. read - failed

So it might be a tombstone problem. However, I always do a 'get' with 'deletedvclock: true' before a 'put' or 'delete' and provide a vclock.

-- Oleksiy Krivoshey
Re: Losing data with bucket types, custom backend and active anti-entropy
I think I've found the rare case where I don't get the deleted vclock before a 'put' in my code. Sorry for bothering everyone :)

-- Oleksiy Krivoshey
Re: Losing data with bucket types, custom backend and active anti-entropy
Thanks! The problem indeed was a missing vclock (obtained with deletedvclock: true) in the write operation immediately following the delete.

On 21 May 2014 17:23, Matthew Dawson wrote:
> I've seen a similar failure case when I was playing around with Riak. I found the solution is to do a read before step 5 with r=n (r=3 in the default case), and then do the write with the returned vclock, if any is returned. Try that and see if it helps.
> --
> Matthew

-- Oleksiy Krivoshey
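The discipline Matthew describes, fetching the causal context (including tombstone vclocks) and handing it to the next write, can be illustrated against an in-memory stand-in. The mock below only imitates the vclock bookkeeping and is not a Riak client; the class, its methods, and the numeric "vclock" are simplifications of mine:

```javascript
// Hedged sketch: why a put after a delete must carry the tombstone's vclock.
// MockStore imitates causal-context bookkeeping only; it is not Riak.
class MockStore {
    constructor() { this.data = {}; this.clock = {}; }
    // like a get with deletedvclock: true, this returns the vclock even
    // when the value is a tombstone (null)
    get(key) {
        return {
            value: key in this.data ? this.data[key] : null,
            vclock: key in this.clock ? this.clock[key] : null
        };
    }
    put(key, value, vclock) {
        // a write that does not carry the current context is treated as stale
        if (this.clock[key] !== undefined && vclock !== this.clock[key]) return false;
        this.clock[key] = (this.clock[key] || 0) + 1;
        this.data[key] = value;
        return true;
    }
    del(key, vclock) {
        if (vclock !== this.clock[key]) return false;
        this.clock[key] += 1;           // the tombstone advances the clock
        this.data[key] = null;
        return true;
    }
}
```

A write that skips the get-after-delete is rejected as stale, which mirrors the lost-write symptom in the thread; a write that carries the tombstone's vclock lands.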
riak_pipe_fitting error
Hi! Once every few days I get the following errors in my Riak cluster:

2014-10-25 03:01:23.731 [error] <0.221.0> Supervisor riak_pipe_fitting_sup had child undefined started with riak_pipe_fitting:start_link() at <0.22692.2455> exit with reason noproc in context shutdown_error
2014-10-25 05:00:09.896 [error] <0.221.0> Supervisor riak_pipe_fitting_sup had child undefined started with riak_pipe_fitting:start_link() at <0.27111.2457> exit with reason noproc in context shutdown_error

The client application exits with a connection timeout error when hitting this. Please suggest what it means and how to fix it.

Thanks!

-- Oleksiy
Re: riak_pipe_fitting error
Thanks! The map/reduce function being executed is map_get_old_processed_files/3:

https://gist.github.com/oleksiyk/abdb48ebd4554d3a40e1

The request is (in JavaScript):

var request = {
    inputs: ['fs_files', accountId + '.processed.files'],
    query: [{
        map: {
            module: 'riakfs_storage_stats',
            function: 'map_get_old_processed_files',
            language: 'erlang',
            keep: true,
            arg: from.toDate().toISOString()
        }
    }]
};

The bucket is of a custom type, but nothing special; it just sets a LevelDB backend with default options.

On 26 October 2014 02:28, Christopher Meiklejohn wrote:
> Hi,
>
> Can you please send the map-reduce job you tried to run and I would be happy to debug it.
>
> Chris

-- Oleksiy Krivoshey
Re: riak_pipe_fitting error
It is riak_2.0.0-1 running as a 5-node cluster on Ubuntu 14.04.

On 26 October 2014 10:02, Christopher Meiklejohn wrote:
> Hi Oleksiy,
>
> What version of Riak are you running?
>
> - Chris

-- Oleksiy Krivoshey
Riak errors after node left the cluster
Hi,

I'm running a 5-node cluster (Riak 2.0.0) and I had to replace hardware on one of the servers. So I did a 'cluster leave', waited until the node exited, and checked the ring status and member status; all was ok, with no pending changes. Then, about 5 minutes later, every client connection to any of the 4 remaining nodes started to fail with:

[Error: {error,mailbox_overload}]

I restarted one node after another and the error went away. However, I was still experiencing connectivity issues (timeouts), and the Riak error log is full of various errors even after I joined the 5th node back. The errors look like:

Failed to merge {["/var/lib/riak/bitcask_expire_1d/685078892498860742907977265335757665463718379520/1.bitcask.data"]

gen_fsm <0.818.0> in state active terminated with reason: bad record state in riak_kv_vnode:set_vnode_forwarding/2 line 991

@riak_pipe_vnode:new_worker:826 Pipe worker startup failed:

msg,7,[{file,"gen_fsm.erl"},{line,505}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,239}]}]

2014-11-04 16:07:57.124 [error] <0.11128.0>@riak_core_handoff_sender:start_fold:279 hinted_handoff transfer of riak_kv_vnode from 'riak@10.0.1.1' 353957427791078050502454920423474793822921162752 to 'riak@10.0.1.5' 353957427791078050502454920423474793822921162752 failed because of error:undef [{riak_core_format,human_size_fmt,["~.2f",588],[]},{riak_core_handoff_sender,start_fold,5,[{file,"src/riak_core_handoff_sender.erl"},{line,246}]}]

The full error log file is available here: https://www.dropbox.com/s/3b8x3nqyego7lw3/error.log?dl=0

There was no significant load on Riak, so I would like to understand what caused so many errors.

-- Oleksiy
Re: Riak errors after node left the cluster
There were also errors during the initial handoff; here is the full console.log for that day: https://www.dropbox.com/s/o7zop181pvpxoa5/console.log?dl=0

I actually replaced two nodes that day, a few hours apart. The first one went smoothly, as it should. The second one resulted in the situation above.

-- Oleksiy Krivoshey
Re: Riak errors after node left the cluster
Just got a new problem with Riak. Recently a hard drive failed on one of the Riak nodes, so I had to shut it down. I'm running 4 nodes now, and every 10 minutes all of them start to fail with 'Error: {error,mailbox_overload}' until restarted.

Can anyone from Basho please suggest a solution or fix for this? My whole cluster is unusable with just 1 node failed.

-- Oleksiy Krivoshey
Re: Riak errors after node left the cluster
I have actually described two recent situations here. In the first one, there were no failed nodes or servers. I did a clean 'leave' on one node and waited till it exited. I upgraded its hardware and started the server again (with a clean Riak setup), then did a 'join'. I repeated this a few hours later on a second server, again with a clean 'leave'. That's when I started experiencing mailbox_overload errors. In the second situation, which happened 2 days later, a hard drive failed on one server so I had to do 'down' and 'force-remove' for that node. Everything was ok while the server was marked as down, but when I brought the new server back these errors started again. This time it lasted for 5 hours in total, making my Riak cluster and application unavailable. I did all sorts of monitoring when I was able to (while trying to keep the application live) and it seems that it was just AAE exchanging (repairing) keys by the thousands. If I disabled AAE by making it passive I was able to make my application work (with some limitations on read repair). As soon as I switched AAE back to active, I was getting thousands of mailbox_overload errors again. I tried to configure the mailbox tier throttle with no luck. I thought I was running on quite good hardware (64GB RAM, RAID10, hex-core Intel Xeon and a private gigabit network just for Riak). On 7 November 2014 06:06, Scott Lystig Fritchie wrote: > Sargun Dhillon wrote: > > sd> Can you run: [...] > > Hi, Sargun and Oleksiy. Those commands and a lot more are run as part > of the suite of info-gathering done by the "riak-debug" utility. I > recommend using it instead of managing a hodge-podge of separate > commands. > > The output from "riak-admin cluster-info" is also exceptionally helpful, > especially because it contains even more diagnostic information, > especially about Erlang process mailbox contents. I recommend running > it during overload conditions to see what's going on internally. 
> > Also, "riak-admin top -sort msg_q" can give a real-time view of Erlang > mailbox sizes, sorted by mailbox size. > > -Scott > -- Oleksiy Krivoshey ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
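For reference, the AAE throttle mentioned above is tuned in riak.conf as mailbox-size tiers; a sketch with purely illustrative values (the anti_entropy.throttle.* key scheme is assumed here; bigger mailbox sizes map to longer delays, backing AAE off as vnodes fall behind):

```
## riak.conf -- illustrative tier values, not a recommendation
anti_entropy.throttle.tier1.mailbox_size = 0
anti_entropy.throttle.tier1.delay = 5ms
anti_entropy.throttle.tier2.mailbox_size = 50
anti_entropy.throttle.tier2.delay = 50ms
anti_entropy.throttle.tier3.mailbox_size = 100
anti_entropy.throttle.tier3.delay = 500ms
```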
bitcask io mode
Hi! Can anyone please explain in more detail what kind of negative impact the 'nif' Bitcask IO mode has, and what worst-case scenarios can increase latencies or cause IO collapse? http://docs.basho.com/riak/latest/ops/advanced/backends/bitcask/#Configuring-Bitcask "In general, the nif IO mode provides higher throughput for certain workloads, but it has the potential to negatively impact the Erlang VM, leading to higher worst-case latencies and possible throughput collapse." Thanks! -- Oleksiy ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
Re: bitcask io mode
"The io_mode setting specifies which code module Bitcask should use for file access. The available settings are: erlang (default) — Writes are made via Erlang's built-in file API nif — Writes are made via direct calls to the POSIX C API" These sentences mention 'writes' only, does it affect other file IO (reads, stats)? On 10 November 2014 20:02, Oleksiy Krivoshey wrote: > Hi! > > Can anyone please explain in more details what kind of negative impact has > the 'nif' bitcask IO mode and what worst-case scenarios can increase > latencies or cause IO collapses? > > > http://docs.basho.com/riak/latest/ops/advanced/backends/bitcask/#Configuring-Bitcask > > "In general, the nif IO mode provies higher throughput for certain > workloads, but it has the potential to negatively impact the Erlang VM, > leading to higher worst-case latencies and possible throughput collapse." > > Thanks! > > -- > Oleksiy > -- Oleksiy Krivoshey ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
Re: bitcask io mode
Thanks! What kind of negative impact can the NIF mode cause on Bitcask? On 14 November 2014 08:39, Scott Lystig Fritchie wrote: > Oops, sorry, I overlooked your question. The erlang/nif I/O setting > affect all Bitcask file I/O, see > https://github.com/basho/bitcask/blob/develop/src/bitcask_io.erl > > -Scott > -- Oleksiy Krivoshey ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
Yokozuna setup with many buckets
Hi, Can anyone please suggest what would be the best Yokozuna setup (in terms of indexing/search performance) if I have many buckets of the same bucket-type: 1. having one Yokozuna index associated with the bucket-type (i.e. with all buckets) 2. having a separate Yokozuna index created for and associated with each bucket In my setup I will always have to search within a single (specified) bucket, so that search results from other buckets do not mix together. Thanks! -- Oleksiy ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
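If a single shared index per bucket-type is used, searches can still be scoped to one bucket with a Solr filter query on the `_yz_rb` field (the bucket-name field in Yokozuna's default schema). A hedged Python sketch of building such a request; the host, port, and index name are placeholders, not values from this thread:

```python
from urllib.parse import urlencode

def bucket_scoped_search(index, bucket, user_query):
    # _yz_rb holds the bucket name in Yokozuna's default schema; putting
    # it in fq filters results without affecting relevance scoring.
    params = urlencode({"q": user_query, "fq": "_yz_rb:" + bucket, "wt": "json"})
    # Placeholder host/port -- substitute your Riak HTTP listener.
    return "http://localhost:8098/search/query/%s?%s" % (index, params)

url = bucket_scoped_search("files_index", "mybucket", "name_s:foo*")
```

Using `fq` rather than folding the bucket into `q` keeps the bucket restriction out of scoring and lets Solr cache the filter across queries.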
Yokozuna error during indexing
Hi, I have enabled Yokozuna on existing Riak 2.0 buckets and, while it is still indexing everything, I've already received about 50 errors like this: emulator Error in process <0.26807.79> on node 'riak@10.0.1.1' with exit value: {{badmatch,false},[{base64,decode_binary,2,[{file,"base64.erl"},{line,211}]},{yz_solr,to_pair,1,[{file,"src/yz_solr.erl"},{line,414}]},{yz_solr,'-get_pairs/1-lc$^0/1-0-',1,[{file,"src/yz_solr.erl"},{line,411}]},{yz_solr,'-get_pairs/1-lc$^0/1-0-'... Can someone please explain what this error means? -- Oleksiy ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
Re: Yokozuna error during indexing
Yes, I'm using a custom schema and a custom bucket type. There are many (over 500) buckets of this type. Your command returned the following tuple: (riak@10.0.1.1)1> redbug:start("yz_solr:to_pair -> return"). {1919,1} redbug done, timeout - 0 How do I get the bucket/key/base64 string from this? On 21 November 2014 21:41, Ryan Zezeski wrote: > redbug:start("yz_solr:to_pair -> return"). -- Oleksiy Krivoshey ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
Re: Yokozuna error during indexing
Thanks, I'll try to catch it. Still, what kind of base64 string can it be? I don't have anything base64 encoded in my data; it's pure JSON objects stored with content_type: 'application/json' On 21 November 2014 21:53, Ryan Zezeski wrote: > redbug:start("yz_solr:to_pair -> return", [{time, 60}]). -- Oleksiy Krivoshey ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
Yokozuna - building entropy trees
Hi, How long does it usually take to build entropy trees? The manual says they are built one per hour, but mine seem to have been stuck for 9 hours:

Index                                               Built (ago)
11417981541647679048466287755595961091061972992     --
57089907708238395242331438777979805455309864960     --
102761833874829111436196589800363649819557756928    --
148433760041419827630061740822747494183805648896    --
194105686208010543823926891845131338548053540864    10.4 hr
239777612374601260017792042867515182912301432832    --
285449538541191976211657193889899027276549324800    --
650824947873917705762578402068969782190532460544    --
696496874040508421956443553091353626554780352512    --
742168800207099138150308704113737470919028244480    --
787840726373689854344173855136121315283276136448    --
833512652540280570538039006158505159647524028416    --
879184578706871286731904157180889004011771920384    --
924856504873462002925769308203272848376019812352    --
970528431040052719119634459225656692740267704320    --
1016200357206643435313499610248040537104515596288   --
1061872283373234151507364761270424381468763488256   9.4 hr
1107544209539824867701229912292808225833011380224   --
1153216135706415583895095063315192070197259272192   --
119061873006300088960214337575914561507164160       12.1 hr
1244559988039597016282825365359959758925755056128   --
1290231914206187732476690516382343603290002948096   --
1335903840372778448670555667404727447654250840064   11.4 hr
1381575766539369164864420818427111292018498732032   --
1427247692705959881058285969449495136382746624000   --

Thanks! -- Oleksiy ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
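The one-build-per-hour behaviour corresponds to the AAE tree build limit, which is configurable in riak.conf; a sketch assuming the Riak 2.0 key names (the values shown are the commonly cited defaults, not a recommendation):

```
## riak.conf -- assumed keys; allow at most one tree build per hour per node
anti_entropy.tree.build_limit.number = 1
anti_entropy.tree.build_limit.per_timespan = 1h
```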
Re: Yokozuna error during indexing
970528431040052719119634459225656692740267704320 0 0 0 1016200357206643435313499610248040537104515596288 0 0 0 1061872283373234151507364761270424381468763488256 0 0 0 1107544209539824867701229912292808225833011380224 0 0 0 1153216135706415583895095063315192070197259272192 0 0 0 119061873006300088960214337575914561507164160 0 0 0 1244559988039597016282825365359959758925755056128 0 0 0 1290231914206187732476690516382343603290002948096 0 0 0 1335903840372778448670555667404727447654250840064 0 0 0 1381575766539369164864420818427111292018498732032 0 0 0 1427247692705959881058285969449495136382746624000 0 14 722 On 21 November 2014 22:11, Ryan Zezeski wrote: > > Oleksiy Krivoshey writes: > > > > > Still, what kind of base64 string can it be? I don't have anything base64 > > encoded in my data, its a pure JSON objects stored with content_type: > > 'application/json' > > The _yz_* fields (which need to be part of your schema and defined > exactly as defined in the default schema) are generated as part of > indexing. The entropy data field (_yz_ed) uses a base64 of the object > hash so that hashtrees may be rebuilt for the purpose of Active > Anti-Entropy (AAE). My guess is somehow this value if getting truncated > or corrupted along the way. > > https://github.com/basho/yokozuna/blob/develop/priv/default_schema.xml#L111 > > This code is only executed when rebuilding AAE trees. What is the output > from the following? > > riak-admin search aae-status > > -Z > -- Oleksiy Krivoshey ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
Re: Yokozuna error during indexing
I have really basic Erlang knowledge (just a few m/r tasks for Riak). But I definitely would like to try. Please give me some information so that I can start. Btw, I also have the same problem on my local application setup (on a laptop) with a single Riak node. On 21 November 2014 22:30, Ryan Zezeski wrote: > > Oleksiy Krivoshey writes: > > > > > Entropy Trees > > > > Index Built (ago) > > > --- > > 11417981541647679048466287755595961091061972992-- > > 57089907708238395242331438777979805455309864960-- > > 102761833874829111436196589800363649819557756928 -- > > 148433760041419827630061740822747494183805648896 -- > > 194105686208010543823926891845131338548053540864 10.4 hr > > 239777612374601260017792042867515182912301432832 -- > > 285449538541191976211657193889899027276549324800 -- > > 650824947873917705762578402068969782190532460544 -- > > 696496874040508421956443553091353626554780352512 -- > > 742168800207099138150308704113737470919028244480 -- > > 787840726373689854344173855136121315283276136448 -- > > 833512652540280570538039006158505159647524028416 -- > > 879184578706871286731904157180889004011771920384 -- > > 924856504873462002925769308203272848376019812352 -- > > 970528431040052719119634459225656692740267704320 -- > > 1016200357206643435313499610248040537104515596288 -- > > 1061872283373234151507364761270424381468763488256 9.4 hr > > 1107544209539824867701229912292808225833011380224 -- > > 1153216135706415583895095063315192070197259272192 -- > > 119061873006300088960214337575914561507164160 12.1 hr > > 1244559988039597016282825365359959758925755056128 -- > > 1290231914206187732476690516382343603290002948096 -- > > 1335903840372778448670555667404727447654250840064 11.4 hr > > 1381575766539369164864420818427111292018498732032 -- > > 1427247692705959881058285969449495136382746624000 -- > > > > So it seems many of these trees are not building because of this issue. 
> The system will keep trying to build but it will fail every time because > of the bad base64 string. Trying to catch this with redbug will prove > difficult too because it automatically shuts itself off after X events. > That can be changed but then you have to dig through a mountain of > output. Not a fun way to do things. > > How comfortable are you with Erlang/Riak? Enough to write a bit of code > and hot-load it into your cluster? > > -Z > -- Oleksiy Krivoshey ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
Re: Yokozuna error during indexing
Got a few results: I don't see anything wrong with the first record, but the second record mentions the key '/.Trash/MT03', which is not correct; the correct key that exists in that bucket is '/.Trash/MT03 348 plat frames' (riak@127.0.0.1)1> redbug:start("yz_solr:to_pair -> return", [{time, 60}]). {3243,1} 00:09:05 <0.5058.0>(dead) {yz_solr,to_pair, [{struct, [{<<"vsn">>,<<"1">>}, {<<"riak_bucket_type">>,<<"fs_files">>}, {<<"riak_bucket_name">>, <<"0fnn2pklvjm5bz37xmlkaxl8k0v0wmrc.files">>}, {<<"riak_key">>,<<"/aaa/bbb">>}, {<<"base64_hash">>,<<"g2IC6J4E">>}]}]} 00:09:05 <0.5058.0>(dead) yz_solr:to_pair/1 -> {{{<<"fs_files">>, <<"0fnn2pklvjm5bz37xmlkaxl8k0v0wmrc.files">>}, <<"/aaa/bbb">>}, <<131,98,2,232,158,4>>} 00:09:05 <0.5058.0>(dead) {yz_solr,to_pair, [{struct, [{<<"vsn">>,<<"1">>}, {<<"riak_bucket_type">>,<<"fs_files">>}, {<<"riak_bucket_name">>, <<"abtiuxzzzcyr6y6cat3fgvfuaff9veot.files">>}, {<<"riak_key">>,<<"/.Trash/MT03">>}, {<<"base64_hash">>,<<"348">>}]}]} 00:09:05 <0.5058.0>(dead) yz_solr:to_pair/1 -> {error,{badmatch,false}} On 21 November 2014 22:34, Oleksiy Krivoshey wrote: > I have really basic Erlang knowledge (just a few m/r tasks for Riak). But > I definitely would like to try. Give me some information to so that I can > start please. Btw, I also have the same problem on my local application > setup (on a laptop) with a single Riak node. 
> > On 21 November 2014 22:30, Ryan Zezeski wrote: > >> >> Oleksiy Krivoshey writes: >> >> > >> > Entropy Trees >> > >> > Index Built (ago) >> > >> --- >> > 11417981541647679048466287755595961091061972992-- >> > 57089907708238395242331438777979805455309864960-- >> > 102761833874829111436196589800363649819557756928 -- >> > 148433760041419827630061740822747494183805648896 -- >> > 194105686208010543823926891845131338548053540864 10.4 hr >> > 239777612374601260017792042867515182912301432832 -- >> > 285449538541191976211657193889899027276549324800 -- >> > 650824947873917705762578402068969782190532460544 -- >> > 696496874040508421956443553091353626554780352512 -- >> > 742168800207099138150308704113737470919028244480 -- >> > 787840726373689854344173855136121315283276136448 -- >> > 833512652540280570538039006158505159647524028416 -- >> > 879184578706871286731904157180889004011771920384 -- >> > 924856504873462002925769308203272848376019812352 -- >> > 970528431040052719119634459225656692740267704320 -- >> > 1016200357206643435313499610248040537104515596288 -- >> > 1061872283373234151507364761270424381468763488256 9.4 hr >> > 1107544209539824867701229912292808225833011380224 -- >> > 1153216135706415583895095063315192070197259272192 -- >> > 119061873006300088960214337575914561507164160 12.1 hr >> > 1244559988039597016282825365359959758925755056128 -- >> > 1290231914206187732476690516382343603290002948096 -- >> > 1335903840372778448670555667404727447654250840064 11.4 hr >> > 1381575766539369164864420818427111292018498732032 -- >> > 1427247692705959881058285969449495136382746624000 -- >> > >> >> So it seems many of these trees are not building because of this issue. >> The system will keep trying to build but it will fail every time because >> of the bad base64 string. Trying to catch this with redbug will prove >> difficult too because it automatically shuts itself off after X events. >> That can be changed but then you have to dig through a mountain of >> output. 
Not a fun way to do things. >> >> How comfortable are you with Erlang/Riak? Enough to write a bit of code >> and hot-load it into your cluster? >> >> -Z >> > > > > -- > Oleksiy Krivoshey > -- Oleksiy Krivoshey ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
Re: Yokozuna error during indexing
The bucket backend is eleveldb. Bucket props are: '{"props":{"backend":"fs_files","allow_mult":"false","r":1,"notfound_ok":"false","basic_quorum":"true","search_index":"files_index"}}' fs_files backend: {<<"fs_files">>, riak_kv_eleveldb_backend, [ % {total_leveldb_mem, 8589934592}, {data_root, "./fs_files"} ]}, That's the content of that key (/.Trash/MT03 348 plat frames): { value: { ctime: '2014-11-21T17:40:07.947Z', mtime: '2014-11-21T17:40:07.947Z', isDirectory: true }, content_type: 'application/json', vtag: '6gqpDUb5r2GWyUQO6JFVaB', last_mod: 1416591607, last_mod_usecs: 948073, indexes: [ { key: 'abtiuxzzzcyr6y6cat3fgvfuaff9veot.files_directory_bin', value: '/.Trash' } ] } There was definitely never any key '/.Trash/MT03'. On 22 November 2014 00:14, Oleksiy Krivoshey wrote: > '/.Trash/MT03 348 plat frame -- Oleksiy Krivoshey ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
Re: Yokozuna error during indexing
There are other keys that start with the same prefix, e.g.: /.Trash/MT03 348 plat frames/MT03 348 plat 001.jpg /.Trash/MT03 348 plat frames/MT03 348 plat 002.jpg /.Trash/MT03 348 plat frames/MT03 348 plat 003.jpg On 22 November 2014 00:20, Oleksiy Krivoshey wrote: > Bucket backend is eleveldb. > > Bucket props > are: > '{"props":{"backend":"fs_files","allow_mult":"false","r":1,"notfound_ok":"false","basic_quorum":"true","search_index":"files_index"}}' > > fs_files backend: > > {<<"fs_files">>, riak_kv_eleveldb_backend, [ > % {total_leveldb_mem, 8589934592}, > {data_root, "./fs_files"} > ]}, > > Thats the content of that key (/.Trash/MT03 348 plat frames): > > { value: >{ ctime: '2014-11-21T17:40:07.947Z', > mtime: '2014-11-21T17:40:07.947Z', > isDirectory: true }, > content_type: 'application/json', > vtag: '6gqpDUb5r2GWyUQO6JFVaB', > last_mod: 1416591607, > last_mod_usecs: 948073, > indexes: >[ { key: 'abtiuxzzzcyr6y6cat3fgvfuaff9veot.files_directory_bin', >value: '/.Trash' } ] } > > There definitely never were any key as '/.Trash/MT03'. > > On 22 November 2014 00:14, Oleksiy Krivoshey wrote: > >> '/.Trash/MT03 348 plat frame > > > > -- > Oleksiy Krivoshey > -- Oleksiy Krivoshey ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
Re: Yokozuna error during indexing
Hi Ryan, Thanks! Unfortunately I can't do this on my live application. Is this going to be fixed? Should I submit an issue to the Yokozuna project on GitHub? On 22 November 2014 02:02, Ryan Zezeski wrote: > > Oleksiy Krivoshey writes: > > > Got few results: > > > > I don't see anything wrong with the first record, but the second record > > mentions the key '/.Trash/MT03' which is not correct, the correct key > that > > exists in that bucket is > > > > '/.Trash/MT03 348 plat frames' > > > > You have found a bug in Yokozuna. > > https://github.com/basho/yokozuna/blob/develop/src/yz_doc.erl#L230 > > > https://github.com/basho/yokozuna/blob/develop/java_src/com/basho/yokozuna/handler/EntropyData.java#L139 > > It foolishly assumes there is no space character used in the type, > bucket, or key names. As a workaround I think you'll have to make sure > your application converts all spaces to some other character (like > underscore) before storing in Riak. > > -Z > -- Oleksiy Krivoshey ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
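A simplified Python model of the failure mode Ryan points at (an illustration of the space-splitting assumption, not the actual Erlang/Java code): the entropy data for each object is serialized with space-separated fields, so a key that itself contains spaces shifts every later field, and the "hash" field ends up holding a fragment of the key.

```python
import base64
import binascii

def to_pair_naive(entropy_line):
    # Models the assumption in yz_solr:to_pair / EntropyData:
    # "vsn bucket_type bucket key base64_hash", separated by single spaces.
    parts = entropy_line.split(" ")
    bucket_type, bucket, key, b64_hash = parts[1], parts[2], parts[3], parts[4]
    return (bucket_type, bucket, key), b64_hash

# A key containing spaces (the trailing hash here is a made-up value):
pair, b64_hash = to_pair_naive(
    "1 fs_files somebucket.files /.Trash/MT03 348 plat frames aGFzaA==")
# The key is truncated to '/.Trash/MT03' and the "hash" becomes '348',
# mirroring the redbug trace; '348' then fails base64 decoding, which
# matches the {badmatch,false} seen in the logs.
try:
    base64.b64decode(b64_hash)
    decoded_ok = True
except binascii.Error:
    decoded_ok = False
```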
Re: Yokozuna setup with many buckets
Hi Eric, I think I will be running with 10-30 thousand buckets, with some of them almost empty (<10 keys) but some with hundreds of thousands of keys. I'm pretty sure I can stay below the hundreds-of-millions-of-docs-per-node limit. Does that mean I can simply grow my Riak cluster to split the index? On 22 November 2014 at 15:46, Eric Redmond wrote: > Oleksiy, > > Indexes have some overhead of their own, but they also have a reasonable > limit on doc count (hundreds of millions per node). To answer you question > requires a bit more knowledge of your use-case. One index can be more > efficient, as long as you're not creating hundreds of thousands of buckets. > On the other hand, you don't want to create hundreds of thousands of > indexes. Could you us give a bit more information of your expected numbers? > > Thanks, > Eric > > > On Nov 19, 2014, at 11:55 AM, Oleksiy Krivoshey > wrote: > > > Hi, > > > > Can anyone please suggest what will be the best setup of Yokozuna (I > mean indexing/search performance) if I have many buckets of the same > bucket-type: > > > > 1. having 1 yokozuna index associated with a bucket-type (e.g. with all > buckets) > > 2. having separate yokozuna index created and associated with each bucket > > > > In my setup I will always have to search within a single (specified) > bucket only so that search results from other buckets do not mix together. > > > > Thanks! > > > > -- > > Oleksiy > > _______ > > riak-users mailing list > > riak-users@lists.basho.com > > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com > > -- Oleksiy Krivoshey ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
Re: Yokozuna error during indexing
Thanks, Unfortunately I have a few million keys in a live application and it's almost impossible to update them all without stopping the service. And it can't be as simple as a whitespace<->underscore replacement (what if there is a real underscore in the name?). On 22 November 2014 at 16:15, Eric Redmond wrote: > Go ahead and submit an issue. We'll take a look at it, but it could be > months before a fix ends up in a release. I'd take Ryan's suggestion for > now and replace spaces with another char like underscore (_). > > Eric > > > On Nov 21, 2014, at 11:20 PM, Oleksiy Krivoshey > wrote: > > Bucket backend is eleveldb. > > Bucket props > are: > '{"props":{"backend":"fs_files","allow_mult":"false","r":1,"notfound_ok":"false","basic_quorum":"true","search_index":"files_index"}}' > > fs_files backend: > > {<<"fs_files">>, riak_kv_eleveldb_backend, [ > % {total_leveldb_mem, 8589934592}, > {data_root, "./fs_files"} > ]}, > > Thats the content of that key (/.Trash/MT03 348 plat frames): > > { value: >{ ctime: '2014-11-21T17:40:07.947Z', > mtime: '2014-11-21T17:40:07.947Z', > isDirectory: true }, > content_type: 'application/json', > vtag: '6gqpDUb5r2GWyUQO6JFVaB', > last_mod: 1416591607, > last_mod_usecs: 948073, > indexes: >[ { key: 'abtiuxzzzcyr6y6cat3fgvfuaff9veot.files_directory_bin', >value: '/.Trash' } ] } > > There definitely never were any key as '/.Trash/MT03'. > > On 22 November 2014 00:14, Oleksiy Krivoshey wrote: > >> '/.Trash/MT03 348 plat frame > > > > -- > Oleksiy Krivoshey > ___ > riak-users mailing list > riak-users@lists.basho.com > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com > > > -- Oleksiy Krivoshey ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
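One way around the ambiguity raised here is a reversible encoding rather than a plain space-to-underscore swap. A hedged sketch using percent-encoding (an application-level convention, not anything Riak provides; it only helps for keys written after the convention is adopted):

```python
from urllib.parse import quote, unquote

def encode_key(key):
    # Percent-encode spaces (and '%' itself) but leave '/' readable;
    # decoding is unambiguous even if keys legitimately contain '_'.
    return quote(key, safe="/")

def decode_key(encoded):
    return unquote(encoded)

original = "/.Trash/MT03 348 plat frames"
encoded = encode_key(original)   # '/.Trash/MT03%20348%20plat%20frames'
assert " " not in encoded
assert decode_key(encoded) == original
```

Because the mapping is injective, no legitimate key can collide with an encoded one, which is exactly the problem with a lossy underscore substitution.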
Yokozuna - inconsistent number of documents found for the same query
Hi, I get an inconsistent number of documents returned for the same search query when index keys are repaired by search AAE. The prerequisites are: 1. Create a bucket, insert some keys (10 keys - KeysA) 2. Create a Yokozuna index, associate it with the bucket 3. Add or update some new keys in the bucket (10 keys - KeysB) 4. Wait for search AAE to build and exchange the trees Now when I issue a search query I will always get all 10 KeysB but a random number of KeysA; for example, the same query repeated 5 times may return: 10 KeysB + 2 KeysA 10 KeysB + 0 KeysA 10 KeysB + 7 KeysA 10 KeysB + 1 KeysA 10 KeysB + 10 KeysA Why is this happening? -- Oleksiy Krivoshey ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
Re: Yokozuna - inconsistent number of documents found for the same query
Thank you! P.S. That GitHub issue mentions a 5-node cluster; however, I can replicate it on a single-node Riak setup. On 1 December 2014 at 20:26, Eric Redmond wrote: > This is a known issue, and we're still working on a fix. > > https://github.com/basho/yokozuna/issues/426 > > Eric > > > On Nov 29, 2014, at 9:26 AM, Oleksiy Krivoshey wrote: > > Hi, > > I get inconsistent number of documents returned for the same search query > when index keys are repaired by search AAE. The prerequisites are: > > 1. Create a bucket, insert some keys (10 keys - KeysA) > 2. Create Yokozuna Index, associate it with the bucket > 3. Add or update some new keys in the bucket (10 keys - KeysB) > 4. Wait for Search AAE to build and exchange the trees > > Now when I issue a search query I will always get all 10 KeysB but a > random amount of KeysA, for example the same query repeated 5 times may > return: > > 10 KeysB + 2 KeysA > 10 KeysB + 0 KeysA > 10 KeysB + 7 KeysA > 10 KeysB + 1 KeysA > 10 KeysB + 10 KeysA > > Why is this happening? > > -- > Oleksiy Krivoshey > ___ > riak-users mailing list > riak-users@lists.basho.com > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com > > > -- Oleksiy Krivoshey ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
Re: Yokozuna - inconsistent number of documents found for the same query
Hi Ryan, There are no errors; 'search aae-status' shows that all trees have already been created and synced. When I first noticed this issue I was running a dev setup (single node) and had about 500 keys in total. I waited over 7 days and was still experiencing the issue. During those 7 days AAE continued to repair keys according to the logs, but 'search aae-status' showed that all trees were already synced (and re-synced). I then created a test Riak setup (again single node) with just 20 keys. I replicated the problem, waited for 'aae-status' to show all trees synced (about 2 days) and wrote about the problem here. During these tests I _sometimes_ received the full set of documents as stated in my original email - for me that was an indication that all keys had already been repaired. Now, 6 days after that, I can no longer replicate it - I always get the full set of documents. The last message from AAE key_exchange was logged yesterday (5 days after the test setup was created). So it seems that for 20 keys it finally started working after 5-6 days. On 2 December 2014 at 06:57, Ryan Zezeski wrote: > > Eric Redmond writes: > > > This is a known issue, and we're still working on a fix. > > > > https://github.com/basho/yokozuna/issues/426 > > > > I don't see how this issue is related to Oleksiy's problem. There is no > mention of removing or adding nodes. I think the key part of Oleksiy's > report is the association of an index _after_ data had already been > written. That data is sometimes missing. These two issues could be > related but I don't see anything in that GitHub report to indicate why. > > > > > On Nov 29, 2014, at 9:26 AM, Oleksiy Krivoshey > wrote: > >>> > >> 1. Create a bucket, insert some keys (10 keys - KeysA) > >> 2. Create Yokozuna Index, associate it with the bucket > >> 3. Add or update some new keys in the bucket (10 keys - KeysB) > >> 4. 
Wait for Search AAE to build and exchange the trees > >> > >> Now when I issue a search query I will always get all 10 KeysB but a > random amount of KeysA, for example the same query repeated 5 times may > return: > >> > >> 10 KeysB + 2 KeysA > >> 10 KeysB + 0 KeysA > >> 10 KeysB + 7 KeysA > >> 10 KeysB + 1 KeysA > >> 10 KeysB + 10 KeysA > >> > > Are there any errors in the logs? Does the count go up if you wait > longer? What does `riak-admin search aae-status` show? > > -Z > -- Oleksiy Krivoshey ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
confused by node_gets vs vnode_gets stats
Hi, Can someone please explain the strange behaviour I'm seeing with node_gets vs vnode_gets stats in these histograms (screenshot): https://www.dropbox.com/s/yrd7wg5q4ipfvk0/Screenshot%202014-12-06%2014.16.13.png?dl=0 According to: http://docs.basho.com/riak/latest/ops/running/nodes/inspecting/ node_gets - Number of GETs coordinated by this node, including GETs to non-local vnodes in the last minute vnode_gets - Number of GET operations coordinated by local vnodes on this node in the last minute My questions: 1. How can vnode_gets be larger than node_gets by 50x on a 5-node cluster? If I understand the docs correctly, it shouldn't be larger than the sum of node_gets across all nodes. 2. I can't explain the periodic spikes in vnode_gets every hour. There are no such spikes in node_gets at the same times and there is no periodic load caused by my application. What can be the reason? AAE? Thanks! -- Oleksiy Krivoshey ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
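A back-of-the-envelope model of how the two counters can relate (the n_val and background-read figures below are assumptions for illustration, not measured values): each client GET coordinated by a node counts once toward node_gets but fans out to n_val replica vnodes, so cluster-wide vnode_gets from client traffic is roughly n_val times the node_gets total. Vnode-level reads that have no coordinating GET, such as AAE tree rebuilds and repairs, inflate vnode_gets only, which is one way vnode_gets can dwarf the node_gets sum.

```python
def expected_vnode_gets(total_node_gets, n_val=3, background_reads=0):
    # Client GETs fan out to n_val vnodes; background work (e.g. AAE
    # repairs) reads objects at the vnode level with no node_gets entry.
    return total_node_gets * n_val + background_reads

client_only = expected_vnode_gets(1000)                        # 3000
with_aae = expected_vnode_gets(1000, background_reads=47000)   # 50000
```

Under this sketch, an hourly AAE pass would show up exactly as described in the question: a vnode_gets spike with no matching node_gets increase.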
Re: confused by node_gets vs vnode_gets stats
I guess those spikes were caused by AAE, because eventually things returned to normal: https://www.dropbox.com/s/zjemsub9astjhiw/Screenshot%202014-12-07%2017.36.45.png?dl=0 On 6 December 2014 at 14:27, Oleksiy Krivoshey wrote: > Hi, > > Can someone please explain the strange behaviour I'm experiencing with > node_gets stats vs vnode_gets on this histograms (screenshot): > > > https://www.dropbox.com/s/yrd7wg5q4ipfvk0/Screenshot%202014-12-06%2014.16.13.png?dl=0 > > According to: > http://docs.basho.com/riak/latest/ops/running/nodes/inspecting/ > > node_gets - Number of GETs coordinated by this node, including GETs to > non-local vnodes in the last minute > > vnode_gets - Number of GET operations coordinated by local vnodes on this > node in the last minute > > My questions which I don't understand: > > 1. How can vnode_gets be larger than node_gets by 50x on a 5 node cluster? > If I understand the doc correctly, it can't be larger than sum of node_gets > on all nodes? > > 2. I can't explain those periodic splashes in vnode_gets each hour. There > are no such splashes on node_gets at the same periods and there is no > periodic load caused by my application. What can be the reason? AAE? > > Thanks! > > -- > Oleksiy Krivoshey > -- Oleksiy Krivoshey ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
Yokozuna inconsistent search results
Hi! Riak 2.1.3 Having a stable data set (no documents deleted in months), I'm receiving inconsistent search results from Yokozuna. For example, the first query can return num_found: 3000 (correct), while the same query repeated seconds later can return 2998, or 2995, then 3000 again. Similar inconsistency happens when fetching data in pages (using the start/rows options): sometimes I get the same document twice (in different pages), sometimes some documents are missing completely. There are no errors or warnings in the Yokozuna logs. What should I look for in order to debug the problem? Thanks! ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
Re: Yokozuna inconsistent search results
Yes, AAE is enabled: anti_entropy = active anti_entropy.use_background_manager = on handoff.use_background_manager = on anti_entropy.throttle.tier1.mailbox_size = 0 anti_entropy.throttle.tier1.delay = 5ms anti_entropy.throttle.tier2.mailbox_size = 50 anti_entropy.throttle.tier2.delay = 50ms anti_entropy.throttle.tier3.mailbox_size = 100 anti_entropy.throttle.tier3.delay = 500ms anti_entropy.throttle.tier4.mailbox_size = 200 anti_entropy.throttle.tier4.delay = 2000ms anti_entropy.throttle.tier5.mailbox_size = 500 anti_entropy.throttle.tier5.delay = 5000ms However, the output of "riak-admin search aae-status" looks like this: http://oleksiy.sirv.com/misc/search-aae.png On Fri, 26 Feb 2016 at 17:13 Fred Dushin wrote: > I would check the coverage plans that are being used for the different > queries, which you can usually see in the headers of the resulting > document. When you run a search query through Yokozuna, it will use a > coverage plan from riak core to find a minimal set of nodes (and > partitions) to query to get a set of results, and the coverage plan may > change every few seconds. You might be hitting nodes that have > inconsistencies or are in need of repair. Do you have AAE enabled? > > -Fred > > > On Feb 26, 2016, at 8:36 AM, Oleksiy Krivoshey > wrote: > > > > Hi! > > > > Riak 2.1.3 > > > > Having a stable data set (no documents deleted in months) I'm receiving > inconsistent search results with Yokozuna. For example first query can > return num_found: 3000 (correct), the same query repeated in next seconds > can return 2998, or 2995, then 3000 again. Similar inconsistency happens > when trying to receive data in pages (using start/rows options): sometimes > I get the same document twice (in different pages), sometimes some > documents are missing completely. > > > > There are no errors or warning in Yokozuna logs. What should I look for > in order to debug the problem? > > > > Thanks! 
> > ___ > > riak-users mailing list > > riak-users@lists.basho.com > > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com > > > ___ > riak-users mailing list > riak-users@lists.basho.com > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com > ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
Re: Yokozuna inconsistent search results
Regarding the coverage plan, is there any way to get it with the protocol buffers API? The RpbSearchQueryResp message doesn't seem to contain anything but docs: message RpbSearchQueryResp { repeated RpbSearchDoc docs = 1; // Result documents optional float max_score = 2; // Maximum score optional uint32 num_found = 3; // Number of results } On Fri, 26 Feb 2016 at 17:51 Oleksiy Krivoshey wrote: > Yes, AAE is enabled: > > anti_entropy = active > > anti_entropy.use_background_manager = on > handoff.use_background_manager = on > > anti_entropy.throttle.tier1.mailbox_size = 0 > anti_entropy.throttle.tier1.delay = 5ms > > anti_entropy.throttle.tier2.mailbox_size = 50 > anti_entropy.throttle.tier2.delay = 50ms > > anti_entropy.throttle.tier3.mailbox_size = 100 > anti_entropy.throttle.tier3.delay = 500ms > > anti_entropy.throttle.tier4.mailbox_size = 200 > anti_entropy.throttle.tier4.delay = 2000ms > > anti_entropy.throttle.tier5.mailbox_size = 500 > anti_entropy.throttle.tier5.delay = 5000ms > > However the output of "riak-admin search aae-status" looks like this: > http://oleksiy.sirv.com/misc/search-aae.png > > > On Fri, 26 Feb 2016 at 17:13 Fred Dushin wrote: > >> I would check the coverage plans that are being used for the different >> queries, which you can usually see in the headers of the resulting >> document. When you run a search query through >> Yokozuna, it will use a >> coverage plan from riak core to find a minimal set of nodes (and >> partitions) to query to get a set of results, and the coverage plan may >> change every few seconds. You might be hitting nodes that have >> inconsistencies or are in need of repair. Do you have AAE enabled? >> >> -Fred >> >> > On Feb 26, 2016, at 8:36 AM, Oleksiy Krivoshey >> wrote: >> > >> > Hi! >> > >> > Riak 2.1.3 >> > >> > Having a stable data set (no documents deleted in months) I'm receiving >> inconsistent search results with Yokozuna. 
For example first query can >> return num_found: 3000 (correct), the same query repeated in next seconds >> can return 2998, or 2995, then 3000 again. Similar inconsistency happens >> when trying to receive data in pages (using start/rows options): sometimes >> I get the same document twice (in different pages), sometimes some >> documents are missing completely. >> > >> > There are no errors or warning in Yokozuna logs. What should I look for >> in order to debug the problem? >> > >> > Thanks! >> > ___ >> > riak-users mailing list >> > riak-users@lists.basho.com >> > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com >> >> >> ___ >> riak-users mailing list >> riak-users@lists.basho.com >> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com >> > ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
Riak server doesn't start after crash
Riak 2.1.1 The following error is logged multiple times and then Riak shuts down: scan_key_files: error function_clause @ [{riak_kv_bitcask_backend,key_transform_to_1,[<<>>],[{file,"src/riak_kv_bitcask_backend.erl"},{line,99}]},{bitcask,'-scan_key_files/5-fun-0-',7,[{file,"src/bitcask.erl"},{line,1188}]},{bitcask_fileops,fold_keys_int_loop,5,[{file,"src/bitcask_fileops.erl"},{line,595}]},{bitcask_fileops,fold_file_loop,8,[{file,"src/bitcask_fileops.erl"},{line,720}]},{bitcask_fileops,fold_keys_loop,4,[{file,"src/bitcask_fileops.erl"},{line,575}]},{bitcask,scan_key_files,5,[{file,"src/bitcask.erl"},{line,1196}]},{bitcask,init_keydir_scan_key_files,4,[{file,"src/bitcask.erl"},{line,1289}]},{bitcask,init_keydir,4,[{file,"src/bitcask.erl"},{line,1241}]}] I guess it's some kind of data corruption. How do I bring the node back? Thanks! ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
Using $bucket index for listing keys
I have a bucket with ~200 keys in it and I wanted to iterate them with the help of the $bucket index and a 2i request, but I'm seeing recursive behaviour. For example, I send the following 2i request: { bucket: 'BUCKET_NAME', type: 'BUCKET_TYPE', index: '$bucket', key: 'BUCKET_NAME', qtype: 0, max_results: 10, continuation: '' } I receive 10 keys and continuation '', I then repeat the request with continuation '' and at this point I can receive a reply with continuation '' or '' or even '' and it goes into never-ending recursion. I'm running this on a 5-node 2.1.3 cluster. What am I doing wrong? Or is this not supported at all? Thanks! ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
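One way to keep a client from recursing forever on a misbehaving continuation is to remember the continuations already seen and abort on a repeat. This is a hedged sketch: `query_2i` stands in for the real 2i client call (PB or HTTP) and is stubbed with an in-memory backend here:

```python
# Hedged sketch of a paginated 2i key listing with a guard against the
# looping behaviour described above: abort if the server returns a
# continuation we have already seen. `query_2i` stands in for the real
# client call (e.g. the PB 2i request shown above) and is stubbed below.

def list_all_keys(query_2i, bucket, page_size=10):
    keys, seen, continuation = [], set(), None
    while True:
        page, continuation = query_2i(bucket, page_size, continuation)
        keys.extend(page)
        if not continuation:          # normal termination: last page
            return keys
        if continuation in seen:      # defensive: server is looping
            raise RuntimeError("2i returned a repeated continuation: %r"
                               % continuation)
        seen.add(continuation)

# In-memory stub holding 25 keys, paged 10 at a time:
DATA = ["key-%02d" % i for i in range(25)]

def fake_query(bucket, n, cont):
    start = int(cont or 0)
    nxt = str(start + n) if start + n < len(DATA) else None
    return DATA[start:start + n], nxt

print(len(list_all_keys(fake_query, "test")))  # 25
```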
Re: Riak server doesn't start after crash
I've updated Riak to 2.1.3 and it started successfully. On 28 February 2016 at 14:47, Oleksiy Krivoshey wrote: > Riak 2.1.1 > > The following error is logged multiple times and then Riak shuts down: > > scan_key_files: error function_clause @ > [{riak_kv_bitcask_backend,key_transform_to_1,[<<>>],[{file,"src/riak_kv_bitcask_backend.erl"},{line,99}]},{bitcask,'-scan_key_files/5-fun-0-',7,[{file,"src/bitcask.erl"},{line,1188}]},{bitcask_fileops,fold_keys_int_loop,5,[{file,"src/bitcask_fileops.erl"},{line,595}]},{bitcask_fileops,fold_file_loop,8,[{file,"src/bitcask_fileops.erl"},{line,720}]},{bitcask_fileops,fold_keys_loop,4,[{file,"src/bitcask_fileops.erl"},{line,575}]},{bitcask,scan_key_files,5,[{file,"src/bitcask.erl"},{line,1196}]},{bitcask,init_keydir_scan_key_files,4,[{file,"src/bitcask.erl"},{line,1289}]},{bitcask,init_keydir,4,[{file,"src/bitcask.erl"},{line,1241}]}] > > I guess its some kind of data corruption. > > How do I bring the node back? > > Thanks! > ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
Re: Yokozuna inconsistent search results
Hi Magnus, You are right, there was a Solr indexing issue: 2016-03-01 09:00:17,640 [ERROR] @SolrException.java:109 org.apache.solr.common.SolrException: Invalid Date String:'Invalid date' However, I'm struggling to find the object that causes this: the error message doesn't contain the object id and the bucket is really huge. Can you suggest a way to find the object that causes the Solr exception? Thanks! On 29 February 2016 at 10:28, Magnus Kessler wrote: > >> Hi Oleksiy, >> >> there are two partitions on the node that haven't seen their AAE tree >> rebuilt in a long time. The reason for this is not clear at the moment, >> although we have seen this happening when a partition contains data that >> for some reason cannot be indexed with the configured Solr schema. >> >> Please run on the 'riak attach' console: >> >> riak_core_util:rpc_every_member_ann(yz_entropy_mgr, expire_trees, [], 5000). >> >> Afterwards exit the console with "Ctrl-G q". >> >> The AAE trees should start to be rebuilt shortly after. With default >> settings, on a cluster with ring size 64 the whole process should finish in >> about 1 day and any missing but indexable data should appear in all >> assigned Solr instances. During this time you may still observe >> inconsistent search results due to the way the coverage query is performed. >> Keep an eye open for any errors from one of the yz_ modules in the logs >> during this time. >> >> Please let us know if the Search AAE trees can be rebuilt successfully >> and if this solved your issue. >> >> Kind Regards, >> >> Magnus >> >> >> ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
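Since the Solr error doesn't name the offending key, one workaround is to walk the bucket and validate the date field client-side with the format Solr expects. This is a sketch under stated assumptions: `fetch` and the field name `created_dt` are hypothetical stand-ins for the real client call and schema field, stubbed below:

```python
# Locate documents that Solr would reject with "Invalid Date String":
# validate the date field client-side against Solr's canonical ISO-8601
# form (YYYY-MM-DDThh:mm:ssZ). `fetch` and "created_dt" are hypothetical
# stand-ins for the real client call and schema field.
from datetime import datetime

def is_valid_solr_date(value):
    try:
        datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")
        return True
    except (ValueError, TypeError):
        return False

def find_bad_dates(keys, fetch, field="created_dt"):
    return [k for k in keys
            if not is_valid_solr_date(fetch(k).get(field))]

# Stubbed store with one broken document:
store = {
    "a": {"created_dt": "2016-03-01T09:00:17Z"},
    "b": {"created_dt": "Invalid date"},
}
print(find_bad_dates(store, store.get))  # ['b']
```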
Re: Yokozuna inconsistent search results
I've found the keys in riak console.log within AAE errors. Thanks! On 5 March 2016 at 22:53, Oleksiy Krivoshey wrote: > 2016-03-01 09:00:17,640 [ERROR] > @SolrException.java:109 > org.apache.solr.common.SolrException: Invalid Date String:'Invalid date' > > However I'm struggling to find the object that causes this, the error > message doesn't contain the object id and the bucket is really huge. Can > you suggest the way to find the object that causes Solr exception? > > > ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
Re: Using $bucket index for listing keys
Anyone? On 4 March 2016 at 19:11, Oleksiy Krivoshey wrote: > I have a bucket with ~200 keys in it and I wanted to iterate them with the > help of $bucket index and 2i request, however I'm facing the recursive > behaviour, for example I send the following 2i request: > > { > bucket: 'BUCKET_NAME', > type: 'BUCKET_TYPE', > index: '$bucket', > key: 'BUCKET_NAME', > qtype: 0, > max_results: 10, > continuation: '' > } > > I receive 10 keys and continuation '', I then repeat the request with > continuation '' and at this point I can receive a reply with > continuation '' or '' or even '' and its going in never ending > recursion. > > I'm running this on a 5 node 2.1.3 cluster. > > What I'm doing wrong? Or is this not supported at all? > > Thanks! > ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
Re: Yokozuna inconsistent search results
So even when I fixed the 3 documents which caused AAE errors, restarted AAE with riak_core_util:rpc_every_member_ann(yz_entropy_mgr, expire_trees, [], 5000). and waited 5 days (I now see all AAE trees rebuilt in the last 5 days and no AAE or Solr errors), I still get inconsistent num_found. For a bucket with 30,000 keys, each new search request can return a num_found that differs by over 5,000. What else can I do to get a consistent index, or at least not a 15% difference? I even tried walking through all the bucket keys and modifying them in the hope that all Yokozuna instances in the cluster would pick them up, but no luck. Thanks! ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
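For reference, the inconsistency can be quantified by running the same query repeatedly and tallying num_found: a healthy index should yield a single value across attempts. In this sketch `search` is a stand-in for the real PB/HTTP query call and is stubbed:

```python
# Probe for the inconsistency described above: run the same search
# repeatedly and record num_found from each response. A stable index
# should give exactly one distinct value. `search` stands in for the
# real query call (PB or HTTP) and is stubbed here.
from collections import Counter

def probe_num_found(search, query, attempts=10):
    # e.g. Counter({30000: 7, 24871: 3}) would signal trouble
    return Counter(search(query) for _ in range(attempts))

# Stub that flips between two coverage plans, one of them incomplete:
results = iter([30000, 30000, 24871, 30000])

def fake_search(q):
    return next(results)

print(probe_num_found(fake_search, "*:*", attempts=4))
```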
Re: Yokozuna inconsistent search results
Here are two consecutive requests, one returns 30118 keys, another 37134 0 6 _yz_pn:92 OR _yz_pn:83 OR _yz_pn:71 OR _yz_pn:59 OR _yz_pn:50 OR _yz_pn:38 OR _yz_pn:17 OR _yz_pn:5 _yz_pn:122 OR _yz_pn:110 OR _yz_pn:98 OR _yz_pn:86 OR _yz_pn:74 OR _yz_pn:62 OR _yz_pn:26 OR _yz_pn:14 OR _yz_pn:2 10.0.1.1:8093/internal_solr/chunks_index,10.0.1.2:8093/internal_solr/chunks_index,10.0.1.3:8093/internal_solr/chunks_index,10.0.1.4:8093/internal_solr/chunks_index,10.0.1.5:8093/internal_solr/chunks_index _yz_rb:0dmid2ilpyrfiuaqtvnc482f1esdchb5.chunks (_yz_pn:124 AND (_yz_fpn:124 OR _yz_fpn:123)) OR _yz_pn:116 OR _yz_pn:104 OR _yz_pn:80 OR _yz_pn:68 OR _yz_pn:56 OR _yz_pn:44 OR _yz_pn:32 OR _yz_pn:20 OR _yz_pn:8 _yz_pn:113 OR _yz_pn:101 OR _yz_pn:89 OR _yz_pn:77 OR _yz_pn:65 OR _yz_pn:53 OR _yz_pn:41 OR _yz_pn:29 _yz_pn:127 OR _yz_pn:119 OR _yz_pn:107 OR _yz_pn:95 OR _yz_pn:47 OR _yz_pn:35 OR _yz_pn:23 OR _yz_pn:11 0 -- 0 10 _yz_pn:100 OR _yz_pn:88 OR _yz_pn:79 OR _yz_pn:67 OR _yz_pn:46 OR _yz_pn:34 OR _yz_pn:25 OR _yz_pn:13 OR _yz_pn:1 (_yz_pn:126 AND (_yz_fpn:126 OR _yz_fpn:125)) OR _yz_pn:118 OR _yz_pn:106 OR _yz_pn:94 OR _yz_pn:82 OR _yz_pn:70 OR _yz_pn:58 OR _yz_pn:22 OR _yz_pn:10 10.0.1.1:8093/internal_solr/chunks_index,10.0.1.2:8093/internal_solr/chunks_index,10.0.1.3:8093/internal_solr/chunks_index,10.0.1.4:8093/internal_solr/chunks_index,10.0.1.5:8093/internal_solr/chunks_index _yz_rb:0dmid2ilpyrfiuaqtvnc482f1esdchb5.chunks _yz_pn:124 OR _yz_pn:112 OR _yz_pn:76 OR _yz_pn:64 OR _yz_pn:52 OR _yz_pn:40 OR _yz_pn:28 OR _yz_pn:16 OR _yz_pn:4 _yz_pn:121 OR _yz_pn:109 OR _yz_pn:97 OR _yz_pn:85 OR _yz_pn:73 OR _yz_pn:61 OR _yz_pn:49 OR _yz_pn:37 _yz_pn:115 OR _yz_pn:103 OR _yz_pn:91 OR _yz_pn:55 OR _yz_pn:43 OR _yz_pn:31 OR _yz_pn:19 OR _yz_pn:7 0 On 11 March 2016 at 12:05, Oleksiy Krivoshey wrote: > So even when I fixed 3 documents which caused AAE errors, > restarted AAE with riak_core_util:rpc_every_member_ann(yz_entropy_mgr, > expire_trees, [], 5000). 
> waited 5 days (now I see all AAE trees rebuilt in last 5 days and no AAE > or Solr errors), I still get inconsistent num_found. > > For a bucket with 30,000 keys each new search request can result in > difference in num_found for over 5,000. > > What else can I do to get consistent index, or at least not a 15% > difference. > > I even tried to walk through all the bucket keys and modifying them in a > hope that all Yokozuna instances in a cluster will pick them up, but no > luck. > > Thanks! > > ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
Re: Using $bucket index for listing keys
Here is an example of the HTTP 2i query returning the same continuation as provided in query params: curl ' http://127.0.0.1:8098/types/fs_chunks/buckets/0r0e5wahrhsgpolk9stbnrqmp77fjjye.chunks/index/$bucket/0r0e5wahrhsgpolk9stbnrqmp77fjjye.chunks?max_results=10&continuation=g20ja1AzdzJwOXpYcVoyb0F4NDhTMVNnRUpBbGJ0ZkhVdkk6MjU= ' {"keys":["4rpG2PwRTs3YqasGGYrhACBvZqTg7mQW:0","4rpG2PwRTs3YqasGGYrhACBvZqTg7mQW:2","FSEky50kr2TLkBuo1JKv6sphINYwnJfV:1","F3KcwtjG9VAtM5u8vbwBuCjuGBrPTnfq:0","RToMNlsnVKvXcawQK6BGnCAKx58pC9xX:1","UMiHx4qDR5pHWT9OgLAu1KMlFeEKbISm:0","F3KcwtjG9VAtM5u8vbwBuCjuGBrPTnfq:2","YQlRWkJPFYiLlAwhvgqOysJC3ycmQ9OA:0","kP3w2p9zXqZ2oAx48S1SgEJAlbtfHUvI:15","kP3w2p9zXqZ2oAx48S1SgEJAlbtfHUvI:25"],"continuation":"g20ja1AzdzJwOXpYcVoyb0F4NDhTMVNnRUpBbGJ0ZkhVdkk6MjU="} How is that possible? On 11 March 2016 at 12:06, Oleksiy Krivoshey wrote: > Anyone? > > On 4 March 2016 at 19:11, Oleksiy Krivoshey wrote: > >> I have a bucket with ~200 keys in it and I wanted to iterate them with >> the help of $bucket index and 2i request, however I'm facing the recursive >> behaviour, for example I send the following 2i request: >> >> { >> bucket: 'BUCKET_NAME', >> type: 'BUCKET_TYPE', >> index: '$bucket', >> key: 'BUCKET_NAME', >> qtype: 0, >> max_results: 10, >> continuation: '' >> } >> >> I receive 10 keys and continuation '', I then repeat the request with >> continuation '' and at this point I can receive a reply with >> continuation '' or '' or even '' and its going in never ending >> recursion. >> >> I'm running this on a 5 node 2.1.3 cluster. >> >> What I'm doing wrong? Or is this not supported at all? >> >> Thanks! >> > > ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
Re: Using $bucket index for listing keys
I'm actually using PB interface, but I can replicate the problem with HTTP as in my previous email. Request with '&continuation=' returns the result set with the same continuation . On 11 March 2016 at 14:55, Magnus Kessler wrote: > Hi Oleksiy, > > How are you performing your 2i-based key listing? Querying with pagination > as shown in the documentation[0] should work. > > As an example here is the HTTP invocation: > > curl " > https://localhost:8098/types/default/buckets/test/index/\$bucket/_?max_results=10&continuation=g20CNTM= > " > > Once the end of the key list is reached, the server returns an empty keys > list and no further continuation value. > > Please let me know if this works for you. > > Kind Regards, > > Magnus > > > [0]: http://docs.basho.com/riak/latest/dev/using/2i/#Querying > > On 11 March 2016 at 10:06, Oleksiy Krivoshey wrote: > >> Anyone? >> >> On 4 March 2016 at 19:11, Oleksiy Krivoshey wrote: >> >>> I have a bucket with ~200 keys in it and I wanted to iterate them with >>> the help of $bucket index and 2i request, however I'm facing the recursive >>> behaviour, for example I send the following 2i request: >>> >>> { >>> bucket: 'BUCKET_NAME', >>> type: 'BUCKET_TYPE', >>> index: '$bucket', >>> key: 'BUCKET_NAME', >>> qtype: 0, >>> max_results: 10, >>> continuation: '' >>> } >>> >>> I receive 10 keys and continuation '', I then repeat the request >>> with continuation '' and at this point I can receive a reply with >>> continuation '' or '' or even '' and its going in never ending >>> recursion. >>> >>> I'm running this on a 5 node 2.1.3 cluster. >>> >>> What I'm doing wrong? Or is this not supported at all? >>> >>> Thanks! 
>>> >> >> >> ___ >> riak-users mailing list >> riak-users@lists.basho.com >> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com >> >> > > > -- > Magnus Kessler > Client Services Engineer > Basho Technologies Limited > > Registered Office - 8 Lincoln’s Inn Fields London WC2A 3BP Reg 07970431 > ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
Re: Using $bucket index for listing keys
Here it is without the `value` part of request: curl ' http://127.0.0.1:8098/types/fs_chunks/buckets/0r0e5wahrhsgpolk9stbnrqmp77fjjye.chunks/index/$bucket/_?max_results=10&continuation=g20ja1AzdzJwOXpYcVoyb0F4NDhTMVNnRUpBbGJ0ZkhVdkk6MjU= ' {"keys":["4rpG2PwRTs3YqasGGYrhACBvZqTg7mQW:0","4rpG2PwRTs3YqasGGYrhACBvZqTg7mQW:2","FSEky50kr2TLkBuo1JKv6sphINYwnJfV:1","F3KcwtjG9VAtM5u8vbwBuCjuGBrPTnfq:0","RToMNlsnVKvXcawQK6BGnCAKx58pC9xX:1","UMiHx4qDR5pHWT9OgLAu1KMlFeEKbISm:0","F3KcwtjG9VAtM5u8vbwBuCjuGBrPTnfq:2","YQlRWkJPFYiLlAwhvgqOysJC3ycmQ9OA:0","kP3w2p9zXqZ2oAx48S1SgEJAlbtfHUvI:15","kP3w2p9zXqZ2oAx48S1SgEJAlbtfHUvI:25"],"continuation":"g20ja1AzdzJwOXpYcVoyb0F4NDhTMVNnRUpBbGJ0ZkhVdkk6MjU="} On 11 March 2016 at 14:58, Oleksiy Krivoshey wrote: > I'm actually using PB interface, but I can replicate the problem with HTTP > as in my previous email. Request with '&continuation=' returns the > result set with the same continuation . > > On 11 March 2016 at 14:55, Magnus Kessler wrote: > >> Hi Oleksiy, >> >> How are you performing your 2i-based key listing? Querying with >> pagination as shown in the documentation[0] should work. >> >> As an example here is the HTTP invocation: >> >> curl " >> https://localhost:8098/types/default/buckets/test/index/\$bucket/_?max_results=10&continuation=g20CNTM= >> " >> >> Once the end of the key list is reached, the server returns an empty keys >> list and no further continuation value. >> >> Please let me know if this works for you. >> >> Kind Regards, >> >> Magnus >> >> >> [0]: http://docs.basho.com/riak/latest/dev/using/2i/#Querying >> >> On 11 March 2016 at 10:06, Oleksiy Krivoshey wrote: >> >>> Anyone? 
>>> >>> On 4 March 2016 at 19:11, Oleksiy Krivoshey wrote: >>> >>>> I have a bucket with ~200 keys in it and I wanted to iterate them with >>>> the help of $bucket index and 2i request, however I'm facing the recursive >>>> behaviour, for example I send the following 2i request: >>>> >>>> { >>>> bucket: 'BUCKET_NAME', >>>> type: 'BUCKET_TYPE', >>>> index: '$bucket', >>>> key: 'BUCKET_NAME', >>>> qtype: 0, >>>> max_results: 10, >>>> continuation: '' >>>> } >>>> >>>> I receive 10 keys and continuation '', I then repeat the request >>>> with continuation '' and at this point I can receive a reply with >>>> continuation '' or '' or even '' and its going in never ending >>>> recursion. >>>> >>>> I'm running this on a 5 node 2.1.3 cluster. >>>> >>>> What I'm doing wrong? Or is this not supported at all? >>>> >>>> Thanks! >>>> >>> >>> >>> ___ >>> riak-users mailing list >>> riak-users@lists.basho.com >>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com >>> >>> >> >> >> -- >> Magnus Kessler >> Client Services Engineer >> Basho Technologies Limited >> >> Registered Office - 8 Lincoln’s Inn Fields London WC2A 3BP Reg 07970431 >> > > ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
Re: Using $bucket index for listing keys
Unfortunately there are just 200 keys in that bucket. So with larger max_results I just get all the keys without continuation. I'll try to replicate this with a bigger bucket. On Fri, Mar 11, 2016 at 15:21 Russell Brown wrote: > That seems very wrong. Can you do me a favour and try with a larger > max_results. I remember a bug with small results set, I thought it was > fixed, I’m looking into the past issues, but can you try “max_results=1000” > or something, and let me know what you see? > > On 11 Mar 2016, at 13:03, Oleksiy Krivoshey wrote: > > > Here it is without the `value` part of request: > > > > curl ' > http://127.0.0.1:8098/types/fs_chunks/buckets/0r0e5wahrhsgpolk9stbnrqmp77fjjye.chunks/index/$bucket/_?max_results=10&continuation=g20ja1AzdzJwOXpYcVoyb0F4NDhTMVNnRUpBbGJ0ZkhVdkk6MjU= > ' > > > > > {"keys":["4rpG2PwRTs3YqasGGYrhACBvZqTg7mQW:0","4rpG2PwRTs3YqasGGYrhACBvZqTg7mQW:2","FSEky50kr2TLkBuo1JKv6sphINYwnJfV:1","F3KcwtjG9VAtM5u8vbwBuCjuGBrPTnfq:0","RToMNlsnVKvXcawQK6BGnCAKx58pC9xX:1","UMiHx4qDR5pHWT9OgLAu1KMlFeEKbISm:0","F3KcwtjG9VAtM5u8vbwBuCjuGBrPTnfq:2","YQlRWkJPFYiLlAwhvgqOysJC3ycmQ9OA:0","kP3w2p9zXqZ2oAx48S1SgEJAlbtfHUvI:15","kP3w2p9zXqZ2oAx48S1SgEJAlbtfHUvI:25"],"continuation":"g20ja1AzdzJwOXpYcVoyb0F4NDhTMVNnRUpBbGJ0ZkhVdkk6MjU="} > > > > On 11 March 2016 at 14:58, Oleksiy Krivoshey wrote: > > I'm actually using PB interface, but I can replicate the problem with > HTTP as in my previous email. Request with '&continuation=' returns the > result set with the same continuation . > > > > On 11 March 2016 at 14:55, Magnus Kessler wrote: > > Hi Oleksiy, > > > > How are you performing your 2i-based key listing? Querying with > pagination as shown in the documentation[0] should work. 
> > > > As an example here is the HTTP invocation: > > > > curl " > https://localhost:8098/types/default/buckets/test/index/\$bucket/_?max_results=10&continuation=g20CNTM= > " > > > > Once the end of the key list is reached, the server returns an empty > keys list and no further continuation value. > > > > Please let me know if this works for you. > > > > Kind Regards, > > > > Magnus > > > > > > [0]: http://docs.basho.com/riak/latest/dev/using/2i/#Querying > > > > On 11 March 2016 at 10:06, Oleksiy Krivoshey wrote: > > Anyone? > > > > On 4 March 2016 at 19:11, Oleksiy Krivoshey wrote: > > I have a bucket with ~200 keys in it and I wanted to iterate them with > the help of $bucket index and 2i request, however I'm facing the recursive > behaviour, for example I send the following 2i request: > > > > { > > bucket: 'BUCKET_NAME', > > type: 'BUCKET_TYPE', > > index: '$bucket', > > key: 'BUCKET_NAME', > > qtype: 0, > > max_results: 10, > > continuation: '' > > } > > > > I receive 10 keys and continuation '', I then repeat the request > with continuation '' and at this point I can receive a reply with > continuation '' or '' or even '' and its going in never ending > recursion. > > > > I'm running this on a 5 node 2.1.3 cluster. > > > > What I'm doing wrong? Or is this not supported at all? > > > > Thanks! > > > > > > ___ > > riak-users mailing list > > riak-users@lists.basho.com > > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com > > > > > > > > > > -- > > Magnus Kessler > > Client Services Engineer > > Basho Technologies Limited > > > > Registered Office - 8 Lincoln’s Inn Fields London WC2A 3BP Reg 07970431 > > > > > > ___ > > riak-users mailing list > > riak-users@lists.basho.com > > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com > > ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
Re: Using $bucket index for listing keys
I originally got the recursive behaviour with other, larger buckets, but I had no logging enabled; when I enabled debugging, this was the first bucket to reproduce the problem. I have a lot of buckets of the same type; some have many thousands of keys, some are small. My task is to iterate the keys of all buckets (once only), either with 2i or with Yokozuna. On Fri, Mar 11, 2016 at 15:32 Russell Brown wrote: > Not the answer, but why pagination for 200 keys? Why the cost of doing the > query 20 times vs once? > > On 11 Mar 2016, at 13:28, Oleksiy Krivoshey wrote: > > > Unfortunately there are just 200 keys in that bucket. So with larger > max_results I just get all the keys without continuation. I'll try to > replicate this with a bigger bucket. > > On Fri, Mar 11, 2016 at 15:21 Russell Brown > wrote: > > That seems very wrong. Can you do me a favour and try with a larger > max_results. I remember a bug with small results set, I thought it was > fixed, I’m looking into the past issues, but can you try “max_results=1000” > or something, and let me know what you see? 
> > > > On 11 Mar 2016, at 13:03, Oleksiy Krivoshey wrote: > > > > > Here it is without the `value` part of request: > > > > > > curl ' > http://127.0.0.1:8098/types/fs_chunks/buckets/0r0e5wahrhsgpolk9stbnrqmp77fjjye.chunks/index/$bucket/_?max_results=10&continuation=g20ja1AzdzJwOXpYcVoyb0F4NDhTMVNnRUpBbGJ0ZkhVdkk6MjU= > ' > > > > > > > {"keys":["4rpG2PwRTs3YqasGGYrhACBvZqTg7mQW:0","4rpG2PwRTs3YqasGGYrhACBvZqTg7mQW:2","FSEky50kr2TLkBuo1JKv6sphINYwnJfV:1","F3KcwtjG9VAtM5u8vbwBuCjuGBrPTnfq:0","RToMNlsnVKvXcawQK6BGnCAKx58pC9xX:1","UMiHx4qDR5pHWT9OgLAu1KMlFeEKbISm:0","F3KcwtjG9VAtM5u8vbwBuCjuGBrPTnfq:2","YQlRWkJPFYiLlAwhvgqOysJC3ycmQ9OA:0","kP3w2p9zXqZ2oAx48S1SgEJAlbtfHUvI:15","kP3w2p9zXqZ2oAx48S1SgEJAlbtfHUvI:25"],"continuation":"g20ja1AzdzJwOXpYcVoyb0F4NDhTMVNnRUpBbGJ0ZkhVdkk6MjU="} > > > > > > On 11 March 2016 at 14:58, Oleksiy Krivoshey > wrote: > > > I'm actually using PB interface, but I can replicate the problem with > HTTP as in my previous email. Request with '&continuation=' returns the > result set with the same continuation . > > > > > > On 11 March 2016 at 14:55, Magnus Kessler wrote: > > > Hi Oleksiy, > > > > > > How are you performing your 2i-based key listing? Querying with > pagination as shown in the documentation[0] should work. > > > > > > As an example here is the HTTP invocation: > > > > > > curl " > https://localhost:8098/types/default/buckets/test/index/\$bucket/_?max_results=10&continuation=g20CNTM= > " > > > > > > Once the end of the key list is reached, the server returns an empty > keys list and no further continuation value. > > > > > > Please let me know if this works for you. > > > > > > Kind Regards, > > > > > > Magnus > > > > > > > > > [0]: http://docs.basho.com/riak/latest/dev/using/2i/#Querying > > > > > > On 11 March 2016 at 10:06, Oleksiy Krivoshey > wrote: > > > Anyone? 
> > > > > > On 4 March 2016 at 19:11, Oleksiy Krivoshey > wrote: > > > I have a bucket with ~200 keys in it and I wanted to iterate them with > the help of $bucket index and 2i request, however I'm facing the recursive > behaviour, for example I send the following 2i request: > > > > > > { > > > bucket: 'BUCKET_NAME', > > > type: 'BUCKET_TYPE', > > > index: '$bucket', > > > key: 'BUCKET_NAME', > > > qtype: 0, > > > max_results: 10, > > > continuation: '' > > > } > > > > > > I receive 10 keys and continuation '', I then repeat the request > with continuation '' and at this point I can receive a reply with > continuation '' or '' or even '' and its going in never ending > recursion. > > > > > > I'm running this on a 5 node 2.1.3 cluster. > > > > > > What I'm doing wrong? Or is this not supported at all? > > > > > > Thanks! > > > > > > > > > ___ > > > riak-users mailing list > > > riak-users@lists.basho.com > > > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com > > > > > > > > > > > > > > > -- > > > Magnus Kessler > > > Client Services Engineer > > > Basho Technologies Limited > > > > > > Registered Office - 8 Lincoln’s Inn Fields London WC2A 3BP Reg 07970431 > > > > > > > > > ___ > > > riak-users mailing list > > > riak-users@lists.basho.com > > > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com > > > > ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
Re: Yokozuna inconsistent search results
Hi Fred, This is a production environment but I can delete the index. However this index covers ~3500 buckets and there are probably 10,000,000 keys. The index was created after the buckets. The schema for the index is just the basic required fields (_yz_*) and nothing else. Yes, I'm willing to resolve this. When you say to delete chunks_index, do you mean the simple RpbYokozunaIndexDeleteReq or is something else required? Thanks! On 11 March 2016 at 17:08, Fred Dushin wrote: > Hi Oleksiy, > > This is definitely pointing to an issue either in the coverage plan (which > determines the distributed query you are seeing) or in the data you have in > Solr. I am wondering if it is possible that you have some data in Solr > that is causing the rebuild of the YZ AAE tree to incorrectly represent > what is actually stored in Solr. > > What you did was to manually expire the YZ (Riak Search) AAE trees, which > caused them to rebuild from the entropy data stored in Solr. Another thing > we could try (if you are willing) would be to delete the 'chunks_index' > data in Solr (as well as the Yokozuna AAE data), and then let AAE repair > the missing data. What Riak will essentially do is compare the KV hash > trees with the YZ hash trees (which will be empty), find what is missing > in Solr, and add it to Solr as a result. This would effectively result in > re-indexing all of your data, but we are only talking about ~30k entries > (times 3, presumably, if your n_val is 3), so that shouldn't take too much > time, I wouldn't think. There is even some configuration you can use to > accelerate this process, if necessary. > > Is that something you would be willing to try? It would result in down > time on query. Is this production data or a test environment? 
> > -Fred > > On Mar 11, 2016, at 7:38 AM, Oleksiy Krivoshey wrote: > > Here are two consequent requests, one returns 30118 keys, another 37134 > > > > > 0 > 6 > > _yz_pn:92 OR _yz_pn:83 OR _yz_pn:71 OR > _yz_pn:59 OR _yz_pn:50 OR _yz_pn:38 OR _yz_pn:17 OR _yz_pn:5 > _yz_pn:122 OR _yz_pn:110 OR _yz_pn:98 OR > _yz_pn:86 OR _yz_pn:74 OR _yz_pn:62 OR _yz_pn:26 OR _yz_pn:14 OR > _yz_pn:2 > > 10.0.1.1:8093/internal_solr/chunks_index,10.0.1.2:8093/internal_solr/chunks_index,10.0.1.3:8093/internal_solr/chunks_index,10.0.1.4:8093/internal_solr/chunks_index,10.0.1.5:8093/internal_solr/chunks_index > > _yz_rb:0dmid2ilpyrfiuaqtvnc482f1esdchb5.chunks > (_yz_pn:124 AND (_yz_fpn:124 OR > _yz_fpn:123)) OR _yz_pn:116 OR _yz_pn:104 OR _yz_pn:80 OR _yz_pn:68 OR > _yz_pn:56 OR _yz_pn:44 OR _yz_pn:32 OR _yz_pn:20 OR _yz_pn:8 > _yz_pn:113 OR _yz_pn:101 OR _yz_pn:89 OR > _yz_pn:77 OR _yz_pn:65 OR _yz_pn:53 OR _yz_pn:41 OR _yz_pn:29 > _yz_pn:127 OR _yz_pn:119 OR _yz_pn:107 OR > _yz_pn:95 OR _yz_pn:47 OR _yz_pn:35 OR _yz_pn:23 OR _yz_pn:11 > 0 > > >start="0"> > > > -- > > > > > > 0 > 10 > > _yz_pn:100 OR _yz_pn:88 OR _yz_pn:79 OR > _yz_pn:67 OR _yz_pn:46 OR _yz_pn:34 OR _yz_pn:25 OR _yz_pn:13 OR > _yz_pn:1 > (_yz_pn:126 AND (_yz_fpn:126 OR > _yz_fpn:125)) OR _yz_pn:118 OR _yz_pn:106 OR _yz_pn:94 OR _yz_pn:82 OR > _yz_pn:70 OR _yz_pn:58 OR _yz_pn:22 OR _yz_pn:10 > > 10.0.1.1:8093/internal_solr/chunks_index,10.0.1.2:8093/internal_solr/chunks_index,10.0.1.3:8093/internal_solr/chunks_index,10.0.1.4:8093/internal_solr/chunks_index,10.0.1.5:8093/internal_solr/chunks_index > > _yz_rb:0dmid2ilpyrfiuaqtvnc482f1esdchb5.chunks > _yz_pn:124 OR _yz_pn:112 OR _yz_pn:76 OR > _yz_pn:64 OR _yz_pn:52 OR _yz_pn:40 OR _yz_pn:28 OR _yz_pn:16 OR > _yz_pn:4 > _yz_pn:121 OR _yz_pn:109 OR _yz_pn:97 OR > _yz_pn:85 OR _yz_pn:73 OR _yz_pn:61 OR _yz_pn:49 OR _yz_pn:37 > _yz_pn:115 OR _yz_pn:103 OR _yz_pn:91 OR > _yz_pn:55 OR _yz_pn:43 OR _yz_pn:31 OR _yz_pn:19 OR _yz_pn:7 > 0 > > >start="0"> > > > On 11 March 2016 at 
12:05, Oleksiy Krivoshey wrote: > >> So even when I fixed the 3 documents which caused AAE errors, >> restarted AAE with riak_core_util:rpc_every_member_ann(yz_entropy_mgr, >> expire_trees, [], 5000). >> and waited 5 days (I now see all AAE trees rebuilt in the last 5 days and no AAE >> or Solr errors), I still get inconsistent num_found. >> >> For a bucket with 30,000 keys each new search request can result in a >> difference in num_found of over 5,000. >> >> What else can I do to get a consistent index, or at least not a 15% >> difference? >> >> I even tried to walk through all the bucket keys and modify them in a >> hope that all
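Fred's suggested repair path quoted above — compare the KV hash trees against the (now empty) Yokozuna trees and re-index whatever differs — can be sketched at the set level. This is a simplified illustration, not Riak's actual per-partition tree exchange; `kv_hashes` and `yz_hashes` are stand-ins for the two entropy snapshots:

```python
def aae_repair_plan(kv_hashes, yz_hashes):
    """Decide what AAE must repair after comparing two entropy snapshots.

    kv_hashes / yz_hashes: dicts mapping object key -> content hash.
    Keys missing or stale on the Solr side get (re-)indexed; keys present
    only on the Solr side get deleted there.
    """
    to_index = {k for k, h in kv_hashes.items() if yz_hashes.get(k) != h}
    to_delete = set(yz_hashes) - set(kv_hashes)
    return to_index, to_delete


# With empty YZ trees (e.g. after deleting chunks_index data), everything
# known to KV is scheduled for re-indexing, nothing for deletion:
kv = {"chunk:0": 0xA1, "chunk:1": 0xB2, "chunk:2": 0xC3}
to_index, to_delete = aae_repair_plan(kv, {})
```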
Re: Using $bucket index for listing keys
Hi Magnus, The bucket type has the following properties: '{"props":{"backend":"fs_chunks","allow_mult":"false","r":1,"notfound_ok":"false","basic_quorum":"false"}}' The fs_chunks backend is configured as part of riak_kv_multi_backend: {<<"fs_chunks">>, riak_kv_bitcask_backend, [ {data_root, "/var/lib/riak/fs_chunks"} ]}, Objects stored are chunks of binary data (ContentType=binary/octet-stream) with a maximum size of 256 KB. Each object also has a single key/value pair in user metadata (usermeta). I'm not able to replicate this on my local single-node setup; it only happens on a production 5-node cluster. On 11 March 2016 at 15:53, Magnus Kessler wrote: > Hi Oleksiy, > > Could you please share the bucket or bucket-type properties for that small > bucket? If you open an issue on github, please add the properties there, > too. > > Many Thanks, > > On 11 March 2016 at 13:46, Oleksiy Krivoshey wrote: > >> I got the recursive behavior with other, larger buckets but I had no >> logging, so when I enabled debugging this was the first bucket to replicate >> the problem. I have a lot of buckets of the same type, some have many >> thousands of keys, some are small. My task is to iterate the keys (once only) >> of all buckets. Either with 2i or with Yokozuna. >> On Fri, Mar 11, 2016 at 15:32 Russell Brown wrote: >> >>> Not the answer, but why pagination for 200 keys? Why the cost of doing >>> the query 20 times vs once? >>> >>> On 11 Mar 2016, at 13:28, Oleksiy Krivoshey wrote: >>> >>> > Unfortunately there are just 200 keys in that bucket. So with a larger >>> max_results I just get all the keys without a continuation. I'll try to >>> replicate this with a bigger bucket. >>> > On Fri, Mar 11, 2016 at 15:21 Russell Brown >>> wrote: >>> > That seems very wrong. Can you do me a favour and try with a larger >>> max_results. 
I remember a bug with small results set, I thought it was >>> fixed, I’m looking into the past issues, but can you try “max_results=1000” >>> or something, and let me know what you see? >>> > >>> > On 11 Mar 2016, at 13:03, Oleksiy Krivoshey >>> wrote: >>> > >>> > > Here it is without the `value` part of request: >>> > > >>> > > curl ' >>> http://127.0.0.1:8098/types/fs_chunks/buckets/0r0e5wahrhsgpolk9stbnrqmp77fjjye.chunks/index/$bucket/_?max_results=10&continuation=g20ja1AzdzJwOXpYcVoyb0F4NDhTMVNnRUpBbGJ0ZkhVdkk6MjU= >>> ' >>> > > >>> > > >>> {"keys":["4rpG2PwRTs3YqasGGYrhACBvZqTg7mQW:0","4rpG2PwRTs3YqasGGYrhACBvZqTg7mQW:2","FSEky50kr2TLkBuo1JKv6sphINYwnJfV:1","F3KcwtjG9VAtM5u8vbwBuCjuGBrPTnfq:0","RToMNlsnVKvXcawQK6BGnCAKx58pC9xX:1","UMiHx4qDR5pHWT9OgLAu1KMlFeEKbISm:0","F3KcwtjG9VAtM5u8vbwBuCjuGBrPTnfq:2","YQlRWkJPFYiLlAwhvgqOysJC3ycmQ9OA:0","kP3w2p9zXqZ2oAx48S1SgEJAlbtfHUvI:15","kP3w2p9zXqZ2oAx48S1SgEJAlbtfHUvI:25"],"continuation":"g20ja1AzdzJwOXpYcVoyb0F4NDhTMVNnRUpBbGJ0ZkhVdkk6MjU="} >>> > > >>> > > On 11 March 2016 at 14:58, Oleksiy Krivoshey >>> wrote: >>> > > I'm actually using PB interface, but I can replicate the problem >>> with HTTP as in my previous email. Request with '&continuation=' >>> returns the result set with the same continuation . >>> > > >>> > > On 11 March 2016 at 14:55, Magnus Kessler >>> wrote: >>> > > Hi Oleksiy, >>> > > >>> > > How are you performing your 2i-based key listing? Querying with >>> pagination as shown in the documentation[0] should work. >>> > > >>> > > As an example here is the HTTP invocation: >>> > > >>> > > curl " >>> https://localhost:8098/types/default/buckets/test/index/\$bucket/_?max_results=10&continuation=g20CNTM= >>> " >>> > > >>> > > Once the end of the key list is reached, the server returns an empty >>> keys list and no further continuation value. >>> > > >>> > > Please let me know if this works for you. >>> > > >>> > > Kind Regards, >>> > > >>> > > Magnus >>> > > >>
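The paginated key-listing loop discussed in this thread can be sketched as follows — a minimal, self-contained simulation rather than a live Riak client (`fetch_page` stands in for a paginated `$bucket` 2i request, with the continuation modelled as a stringified offset instead of Riak's opaque token), including a guard against the repeated-continuation behaviour reported above:

```python
def fetch_page(store, max_results, continuation):
    """Stand-in for one paginated $bucket 2i request against `store`,
    a sorted list of keys. Returns (keys, next_continuation_or_None)."""
    start = int(continuation) if continuation else 0
    page = store[start:start + max_results]
    next_start = start + max_results
    cont = str(next_start) if next_start < len(store) else None
    return page, cont


def list_all_keys(store, max_results=10):
    """Drain a paginated index, stopping on an empty page or a missing
    continuation, and failing loudly if the server loops."""
    keys, cont, seen = [], None, set()
    while True:
        page, cont = fetch_page(store, max_results, cont)
        keys.extend(page)
        if not cont or not page:
            break  # normal termination: no continuation / empty page
        if cont in seen:  # defensive: the recursion reported in this thread
            raise RuntimeError("server returned a repeated continuation: %r" % cont)
        seen.add(cont)
    return keys
```

Listing a 200-key bucket with max_results=10 performs 20 requests and terminates; a server that hands back an already-seen continuation trips the guard instead of looping forever.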
Re: Using $bucket index for listing keys
I know that 2i requires leveldb, but I'm not using my custom index, I'm using $bucket, I thought $bucket index is some kind of special internal index? On 11 March 2016 at 18:28, Russell Brown wrote: > You can’t…can you? I mean, 2i requires level. It requires ordered keys. > Which explains your problem, but you should have failed a lot earlier. > > On 11 Mar 2016, at 16:26, Oleksiy Krivoshey wrote: > > > Hi Magnus, > > > > The bucket type has the following properties: > > > > > '{"props":{"backend":"fs_chunks","allow_mult":"false","r":1,"notfound_ok":"false","basic_quorum":"false"}}' > > > > fs_chunks backend is configured as part of riak_kv_multi_backend: > > > > {<<"fs_chunks">>, riak_kv_bitcask_backend, [ > > {data_root, "/var/lib/riak/fs_chunks"} > > ]}, > > > > Objects stored are chunks of binary data > (ContentType=binary/octet-stream) with a maximum size of 256Kb. Each object > also has a single key/value pair in user metadata (usermeta). > > > > I'm not able to replicate this on my local single node setup, it only > happens on a production 5 node cluster. > > > > > > On 11 March 2016 at 15:53, Magnus Kessler wrote: > > Hi Oleksiy, > > > > Could you please share the bucket or bucket-type properties for that > small bucket? If you open an issue on github, please add the properties > there, too. > > > > Many Thanks, > > > > On 11 March 2016 at 13:46, Oleksiy Krivoshey wrote: > > I got the recursive behavior with other, larger buckets but I had no > logging so when I enabled debugging this was the first bucket to replicate > the problem. I have a lot of buckets of the same type, some have many > thousands keys some are small. My task is to iterate the keys (once only) > of all buckets. Either with 2i or with Yokozuna. > > On Fri, Mar 11, 2016 at 15:32 Russell Brown > wrote: > > Not the answer, by why pagination for 200 keys? Why the cost of doing > the query 20 times vs once? 
> > > > On 11 Mar 2016, at 13:28, Oleksiy Krivoshey wrote: > > > > > Unfortunately there are just 200 keys in that bucket. So with larger > max_results I just get all the keys without continuation. I'll try to > replicate this with a bigger bucket. > > > On Fri, Mar 11, 2016 at 15:21 Russell Brown > wrote: > > > That seems very wrong. Can you do me a favour and try with a larger > max_results. I remember a bug with small results set, I thought it was > fixed, I’m looking into the past issues, but can you try “max_results=1000” > or something, and let me know what you see? > > > > > > On 11 Mar 2016, at 13:03, Oleksiy Krivoshey > wrote: > > > > > > > Here it is without the `value` part of request: > > > > > > > > curl ' > http://127.0.0.1:8098/types/fs_chunks/buckets/0r0e5wahrhsgpolk9stbnrqmp77fjjye.chunks/index/$bucket/_?max_results=10&continuation=g20ja1AzdzJwOXpYcVoyb0F4NDhTMVNnRUpBbGJ0ZkhVdkk6MjU= > ' > > > > > > > > > {"keys":["4rpG2PwRTs3YqasGGYrhACBvZqTg7mQW:0","4rpG2PwRTs3YqasGGYrhACBvZqTg7mQW:2","FSEky50kr2TLkBuo1JKv6sphINYwnJfV:1","F3KcwtjG9VAtM5u8vbwBuCjuGBrPTnfq:0","RToMNlsnVKvXcawQK6BGnCAKx58pC9xX:1","UMiHx4qDR5pHWT9OgLAu1KMlFeEKbISm:0","F3KcwtjG9VAtM5u8vbwBuCjuGBrPTnfq:2","YQlRWkJPFYiLlAwhvgqOysJC3ycmQ9OA:0","kP3w2p9zXqZ2oAx48S1SgEJAlbtfHUvI:15","kP3w2p9zXqZ2oAx48S1SgEJAlbtfHUvI:25"],"continuation":"g20ja1AzdzJwOXpYcVoyb0F4NDhTMVNnRUpBbGJ0ZkhVdkk6MjU="} > > > > > > > > On 11 March 2016 at 14:58, Oleksiy Krivoshey > wrote: > > > > I'm actually using PB interface, but I can replicate the problem > with HTTP as in my previous email. Request with '&continuation=' > returns the result set with the same continuation . > > > > > > > > On 11 March 2016 at 14:55, Magnus Kessler > wrote: > > > > Hi Oleksiy, > > > > > > > > How are you performing your 2i-based key listing? Querying with > pagination as shown in the documentation[0] should work. 
> > > > > > > > As an example here is the HTTP invocation: > > > > > > > > curl " > https://localhost:8098/types/default/buckets/test/index/\$bucket/_?max_results=10&continuation=g20CNTM= > " > > > > > > > > Once the end of the key li
Re: Using $bucket index for listing keys
The problem is not a problem then :) Thanks! ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
Re: Using $bucket index for listing keys
Hi Magnus, Yes, thank you. Though I'm stuck a bit. When I experienced recursive behaviour with the paginated $bucket query I tried to create a Yokozuna Solr index with an empty schema (just _yz_* fields). Unfortunately querying Yokozuna gives me an inconsistent number of results, as described in another email of mine in this mailing list. Hopefully it will be solved as well :) On 11 March 2016 at 18:44, Magnus Kessler wrote: > Hi Oleksiy, > > As Russell pointed out, 2i queries, including $bucket queries, are only > supported when the backend supports ordered keys. This is currently not the > case with bitcask. > > It appears, though, that you have discovered a bug where the multi-backend > module accepts the query despite the fact that the actually configured > backend for the bucket(-type) cannot support the query. I'll make the > engineering team aware of this. > > At this point in time I can only recommend that you do not use the $bucket > query with your configuration and use an alternative, such as Solr-based > search instead. > > Kind Regards, > > Magnus > > -- > Magnus Kessler > Client Services Engineer > Basho Technologies Limited > > Registered Office - 8 Lincoln’s Inn Fields London WC2A 3BP Reg 07970431 > ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
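The backend requirement Magnus describes can be illustrated: a 2i/$bucket query is a resumable range scan over sorted keys, which an ordered backend like leveldb supports directly, while a hash-oriented store like bitcask has no key order to resume from. A minimal sketch of such a resumable scan (plain Python, not Riak internals; here the continuation is simply the last key of the previous page):

```python
import bisect

def range_scan(sorted_keys, limit, start_after=None):
    """One page of an ordered scan; resumes in O(log n) via binary search."""
    i = bisect.bisect_right(sorted_keys, start_after) if start_after else 0
    page = sorted_keys[i:i + limit]
    continuation = page[-1] if i + limit < len(sorted_keys) else None
    return page, continuation

keys = sorted("key%02d" % n for n in range(25))
page, cont = range_scan(keys, 10)        # first page: key00..key09
page, cont = range_scan(keys, 10, cont)  # resumes exactly after key09
```

A hash table offers no such resumable position, which is why a bitcask-backed bucket cannot honestly serve a paginated $bucket query.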
Re: Using $bucket index for listing keys
Because otherwise I have no way to walk all the keys in some of my largest buckets: the paginated query is not supported, and RpbListKeysReq fails with a Riak 'timeout' error. All hopes for Yokozuna then! On 11 March 2016 at 18:49, Oleksiy Krivoshey wrote: > Hi Magnus, > > Yes, thank you. Though I'm stuck a bit. When I experienced recursive > behaviour with the paginated $bucket query I tried to create a Yokozuna Solr > index with an empty schema (just _yz_* fields). Unfortunately querying > Yokozuna gives me an inconsistent number of results, as described in another > email of mine in this mailing list. Hopefully it will be solved as well :) > > On 11 March 2016 at 18:44, Magnus Kessler wrote: > >> Hi Oleksiy, >> >> As Russell pointed out, 2i queries, including $bucket queries, are only >> supported when the backend supports ordered keys. This is currently not the >> case with bitcask. >> >> It appears, though, that you have discovered a bug where the >> multi-backend module accepts the query despite the fact that the actually >> configured backend for the bucket(-type) cannot support the query. I'll >> make the engineering team aware of this. >> >> At this point in time I can only recommend that you do not use the >> $bucket query with your configuration and use an alternative, such as >> Solr-based search instead. >> >> Kind Regards, >> >> Magnus >> >> -- >> Magnus Kessler >> Client Services Engineer >> Basho Technologies Limited >> >> Registered Office - 8 Lincoln’s Inn Fields London WC2A 3BP Reg 07970431 >> > > ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
Re: Yokozuna inconsistent search results
I would like to continue, as this seems to me like a serious problem: on a bucket with 700,000 keys the difference in num_found can be up to 200,000! And that's a search index that doesn't index, analyse or store ANY of the document fields; the schema has only the required _yz_* fields and nothing else. I have tried deleting the search index (with a PBC call) and tried expiring AAE trees. Nothing helps. I can't get consistent search results from Yokozuna. Please help. On 11 March 2016 at 18:18, Oleksiy Krivoshey wrote: > Hi Fred, > > This is a production environment but I can delete the index. However this > index covers ~3500 buckets and there are probably 10,000,000 keys. > > The index was created after the buckets. The schema for the index is just > the basic required fields (_yz_*) and nothing else. > > Yes, I'm willing to resolve this. When you say to delete chunks_index, do > you mean the simple RpbYokozunaIndexDeleteReq or something else is required? > > Thanks! > > > > > On 11 March 2016 at 17:08, Fred Dushin wrote: > >> Hi Oleksiy, >> >> This is definitely pointing to an issue either in the coverage plan >> (which determines the distributed query you are seeing) or in the data you >> have in Solr. I am wondering if it is possible that you have some data in >> Solr that is causing the rebuild of the YZ AAE tree to incorrectly >> represent what is actually stored in Solr. >> >> What you did was to manually expire the YZ (Riak Search) AAE trees, which >> caused them to rebuild from the entropy data stored in Solr. Another thing >> we could try (if you are willing) would be to delete the 'chunks_index' >> data in Solr (as well as the Yokozuna AAE data), and then let AAE repair >> the missing data. What Riak will essentially do is compare the KV hash >> trees with the YZ hash trees (which will be empty), determine the data that is missing >> in Solr, and add it to Solr as a result. 
This would effectively result in >> re-indexing all of your data, but we are only talking about ~30k entries >> (times 3, presumably, if your n_val is 3), so that shouldn't take too much >> time, I wouldn't think. There is even some configuration you can use to >> accelerate this process, if necessary. >> >> Is that something you would be willing to try? It would result in down >> time on query. Is this production data or a test environment? >> >> -Fred >> >> On Mar 11, 2016, at 7:38 AM, Oleksiy Krivoshey >> wrote: >> >> Here are two consequent requests, one returns 30118 keys, another 37134 >> >> >> >> >> 0 >> 6 >> >> _yz_pn:92 OR _yz_pn:83 OR _yz_pn:71 OR >> _yz_pn:59 OR _yz_pn:50 OR _yz_pn:38 OR _yz_pn:17 OR _yz_pn:5 >> _yz_pn:122 OR _yz_pn:110 OR _yz_pn:98 OR >> _yz_pn:86 OR _yz_pn:74 OR _yz_pn:62 OR _yz_pn:26 OR _yz_pn:14 OR >> _yz_pn:2 >> >> 10.0.1.1:8093/internal_solr/chunks_index,10.0.1.2:8093/internal_solr/chunks_index,10.0.1.3:8093/internal_solr/chunks_index,10.0.1.4:8093/internal_solr/chunks_index,10.0.1.5:8093/internal_solr/chunks_index >> >> _yz_rb:0dmid2ilpyrfiuaqtvnc482f1esdchb5.chunks >> (_yz_pn:124 AND (_yz_fpn:124 OR >> _yz_fpn:123)) OR _yz_pn:116 OR _yz_pn:104 OR _yz_pn:80 OR _yz_pn:68 OR >> _yz_pn:56 OR _yz_pn:44 OR _yz_pn:32 OR _yz_pn:20 OR _yz_pn:8 >> _yz_pn:113 OR _yz_pn:101 OR _yz_pn:89 OR >> _yz_pn:77 OR _yz_pn:65 OR _yz_pn:53 OR _yz_pn:41 OR _yz_pn:29 >> _yz_pn:127 OR _yz_pn:119 OR _yz_pn:107 >> OR _yz_pn:95 OR _yz_pn:47 OR _yz_pn:35 OR _yz_pn:23 OR _yz_pn:11 >> 0 >> >> >> > start="0"> >> >> >> -- >> >> >> >> >> >> 0 >> 10 >> >> _yz_pn:100 OR _yz_pn:88 OR _yz_pn:79 OR >> _yz_pn:67 OR _yz_pn:46 OR _yz_pn:34 OR _yz_pn:25 OR _yz_pn:13 OR >> _yz_pn:1 >> (_yz_pn:126 AND (_yz_fpn:126 OR >> _yz_fpn:125)) OR _yz_pn:118 OR _yz_pn:106 OR _yz_pn:94 OR _yz_pn:82 OR >> _yz_pn:70 OR _yz_pn:58 OR _yz_pn:22 OR _yz_pn:10 >> >> 
10.0.1.1:8093/internal_solr/chunks_index,10.0.1.2:8093/internal_solr/chunks_index,10.0.1.3:8093/internal_solr/chunks_index,10.0.1.4:8093/internal_solr/chunks_index,10.0.1.5:8093/internal_solr/chunks_index >> >> _yz_rb:0dmid2ilpyrfiuaqtvnc482f1esdchb5.chunks >> _yz_pn:124 OR _yz_pn:112 OR _yz_pn:76 OR >> _yz_pn:64 OR _yz_pn:52 OR _yz_pn:40 OR _yz_pn:28 OR _yz_pn:16 OR >> _yz_pn:4 >> _yz_pn:121 OR _yz_pn:109 OR _yz_pn:97 OR >> _yz_pn:85 OR _yz_pn:73 OR _yz_pn:61 OR _yz_pn:49 O
Re: Yokozuna inconsistent search results
This is how things are looking after two weeks:

- there are no Solr indexing issues for a long period (2 weeks)
- there are no Yokozuna errors at all for 2 weeks
- there is an index with an empty schema (just the _yz_* fields); objects stored in the bucket(s) are binary and so are not analysed by Yokozuna
- the same Yokozuna query, repeated, gives a different num_found; typically the difference between the real number of keys in a bucket and num_found is about 25%
- the number of keys repaired by AAE (according to logs) is about 1-2 every few hours (the number of keys "missing" in the index is close to 1,000,000)

Should I now try to delete the index and Yokozuna AAE data and wait another 2 weeks? If yes, how should I delete the index and AAE data? Will RpbYokozunaIndexDeleteReq be enough? On 18 March 2016 at 18:54, Oleksiy Krivoshey wrote: > Hi Magnus, > > As of today I had no Yokozuna messages (in solr.log) for 5 days. In the Riak > log I can see some small amount of keys repaired by AAE: > > @riak_kv_exchange_fsm:key_exchange:263 Repaired 1 keys during active > anti-entropy exchange of {148433760041419827630061740822747494183805648896,3} > between {159851741583067506678528028578343455274867621888,'riak@10.0.1.4'} > and {171269723124715185726994316333939416365929594880,'riak@10.0.1.5'} > > However the amount of such repaired keys is not larger than 1-2 keys per hour, > while the amount of keys missing in the search index (for different buckets of the > same fs_chunks type) is close to 1,000,000. > > Thanks! > > On 16 March 2016 at 21:49, Oleksiy Krivoshey wrote: > >> Hi Magnus, >> >> I don't see any Solr indexing errors anymore. As well as no timeout >> errors (I've throttled down the application that was querying Yokozuna). >> >> I've attached `search aae-status` results from all nodes. 
>> >> Few days ago I saw the following error (few of them) from Yokozuna Solr >> (not sure if it got to the riak-debug output): >> >> @SolrException.java:120 IO error while trying to get >> the size of the Directory:java.io.FileNotFoundException: _2hky_1.del at >> org.apache.lucene.store.FSDirectory.fileLength(FSDirectory.java:261) at >> org.apache.lucene.store.NRTCachingDirectory.fileLength(NRTCachingDirectory.java:178) >> at org.apache.solr.core.DirectoryFactory.sizeOf(DirectoryFactory.java:209) >> at >> org.apache.solr.core.DirectoryFactory.sizeOfDirectory(DirectoryFactory.java:195) >> at >> org.apache.solr.handler.admin.CoreAdminHandler.getIndexSize(CoreAdminHandler.java:1129) >> at >> org.apache.solr.handler.admin.CoreAdminHandler.getCoreStatus(CoreAdminHandler.java:1105) >> at >> org.apache.solr.handler.admin.CoreAdminHandler.handleStatusAction(CoreAdminHandler.java:705) >> at >> org.apache.solr.handler.admin.CoreAdminHandler.handleRequestBody(CoreAdminHandler.java:167) >> at >> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135) >> at >> org.apache.solr.servlet.SolrDispatchFilter.handleAdminRequest(SolrDispatchFilter.java:732) >> at >> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:268) >> at >> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:217) >> >> >> Should I delete Yokozuna AAE trees? >> >> Thanks! >> >> >> On 16 March 2016 at 17:51, Magnus Kessler wrote: >> >>> Hi Oleksiy, >>> >>> Whether or not 'brod' has anything to do with the failure to eventually >>> reach consistency in the Solr indexes remains to be seen. I just had never >>> come across this application in a Riak context and wanted to establish why >>> there appear to be frequent errors in the Riak logs associated with it. As >>> the brod application is not in use, could you temporarily remove it from >>> the erlang module path while you troubleshoot Solr indexing? 
>>> >>> We recommend having a catch-all field in all Solr schemas, as this >>> prevents indexing failures. However, you are correct that arbitrary mime >>> types are not analysed unless they have a Yokozuna extractor associated >>> with them. By default the yz_noop_extractor is used. >>> >>> We do not have an easy way to query for missing keys that should be >>> indexed. The necessary code is obviously part of Yokozuna's AAE repair >>> mechanism, but there is currently no tool that exposes this functionality >>> to the end user. This is why one re-indexing strategy is based on deleting >>> the Yokozuna AAE trees, which causes a rebuild of these trees and checks if >>> the appropriate entries
Re: Yokozuna inconsistent search results
Hi Magnus, Thanks! I guess I will go with index deletion because I've already tried expiring the trees before. Do I need to delete AAE data somehow or removing the index is enough? On 24 March 2016 at 13:28, Magnus Kessler wrote: > Hi Oleksiy, > > As a first step, I suggest to simply expire the Yokozuna AAE trees again > if the output of `riak-admin search aae-status` still suggests that no > recent exchanges have taken place. To do this, run `riak attach` on one > node and then > > riak_core_util:rpc_every_member_ann(yz_entropy_mgr, expire_trees, [], 5000). > > > Exit from the riak console with `Ctrl+G q`. > > Depending on your settings and amount of data the full index should be > rebuilt within the next 2.5 days (for a cluster with ring size 128 and > default settings). You can monitor the progress with `riak-admin search > aae-status` and also in the logs, which should have messages along the > lines of > > 2016-03-24 10:28:25.372 [info] > <0.4647.6477>@yz_exchange_fsm:key_exchange:179 Repaired 83055 keys during > active anti-entropy exchange of partition > 1210306043414653979137426502093171875652569137152 for preflist > {1164634117248063262943561351070788031288321245184,3} > > > Re-indexing can put additional strain on the cluster and may cause > elevated latency on a cluster already under heavy load. Please monitor the > response times while the cluster is re-indexing data. > > If the cluster load allows it, you can force more rapid re-indexing by > changing a few parameters. Again at the `riak attach` console, run > > riak_core_util:rpc_every_member_ann(application, set_env, [yokozuna, > anti_entropy_build_limit, {4, 6}], 5000). > riak_core_util:rpc_every_member_ann(application, set_env, [yokozuna, > anti_entropy_concurrency, 5], 5000). > > This will allow up to 4 trees per node to be built/exchanged per hour, > with up to 5 concurrent exchanges throughout the cluster. 
To return to > the default settings, use > > riak_core_util:rpc_every_member_ann(application, set_env, [yokozuna, > anti_entropy_build_limit, {1, 36}], 5000). > riak_core_util:rpc_every_member_ann(application, set_env, [yokozuna, > anti_entropy_concurrency, 2], 5000). > > > If the cluster still doesn't make any progress with automatically > re-indexing data, the next steps are pretty much what you already > suggested: to drop the existing index and re-index from scratch. I'm > assuming that losing the indexes temporarily is acceptable to you at this > point. > > Using any client API that supports RpbYokozunaIndexDeleteReq, you can > drop the index from all Solr instances, losing any data stored there > immediately. Next, you'll have to re-create the index. I have tried this > with the python API, where I deleted the index and re-created it with the > same already uploaded schema: > > from riak import RiakClient > > c = RiakClient() > c.delete_search_index('my_index') > c.create_search_index('my_index', 'my_schema') > > Note that simply deleting the index does not remove its existing > association with any bucket or bucket type. Any PUT operations on these > buckets will lead to indexing failures being logged until the index has > been recreated. However, this also means that no separate operation in > `riak-admin` is required to associate the newly recreated index with the > buckets again. > > After recreating the index, expire the trees as explained previously. > > Let us know if this solves your issue. 
> > Kind Regards, > > Magnus > > > On 24 March 2016 at 08:44, Oleksiy Krivoshey wrote: > >> This is how things are looking after two weeks: >> >> - there are no solr indexing issues for a long period (2 weeks) >> - there are no yokozuna errors at all for 2 weeks >> - there is an index with all empty schema, just _yz_* fields, objects >> stored in a bucket(s) are binary and so are not analysed by yokozuna >> - same yokozuna query repeated gives different number for num_found, >> typically the difference between real number of keys in a bucket and >> num_found is about 25% >> - number of keys repaired by AAE (according to logs) is about 1-2 per few >> hours (number of keys "missing" in index is close to 1,000,000) >> >> Should I now try to delete the index and yokozuna AAE data and wait >> another 2 weeks? If yes - how should I delete the index and AAE data? >> Will RpbYokozunaIndexDeleteReq be enough? >> >> >> > -- > Magnus Kessler > Client Services Engineer > Basho Technologies Limited > > Registered Office - 8 Lincoln’s Inn Fields London WC2A 3BP Reg 07970431 > ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
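To gauge re-indexing progress from the logs, one can total the "Repaired N keys" lines that Magnus mentions. A throwaway sketch, assuming the log lines match the format of the samples quoted in this thread:

```python
import re

# Matches both yz_exchange_fsm and riak_kv_exchange_fsm repair messages.
REPAIR_RE = re.compile(r"Repaired (\d+) keys during active anti-entropy exchange")

def total_repaired(log_lines):
    """Sum AAE repair counts out of console.log-style lines."""
    return sum(int(m.group(1))
               for line in log_lines
               for m in REPAIR_RE.finditer(line))

logs = [
    "2016-03-24 10:28:25.372 [info] <0.4647.6477>@yz_exchange_fsm:key_exchange:179 "
    "Repaired 83055 keys during active anti-entropy exchange of partition ...",
    "2016-03-24 11:02:01.004 [info] unrelated line",
]
assert total_repaired(logs) == 83055
```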
Re: Yokozuna inconsistent search results
OK! On 24 March 2016 at 21:11, Magnus Kessler wrote: > Hi Oleksiy, > > On 24 March 2016 at 14:55, Oleksiy Krivoshey wrote: > >> Hi Magnus, >> >> Thanks! I guess I will go with index deletion because I've already tried >> expiring the trees before. >> >> Do I need to delete AAE data somehow or removing the index is enough? >> > > If you expire the AAE trees with the commands I posted earlier, there > should be no need to remove the AAE data directories manually. > > I hope this works for you. Please monitor the tree rebuild and exchanges > with `riak-admin search aae-status` for the next few days. In particular > the exchanges should be ongoing on a continuous basis once all trees have > been rebuilt. If they don't, please let me know. At that point you should > also gather `riak-debug` output from all nodes before it gets rotated out > after 5 days by default. > > Kind Regards, > > Magnus > > >> >> On 24 March 2016 at 13:28, Magnus Kessler wrote: >> >>> Hi Oleksiy, >>> >>> As a first step, I suggest to simply expire the Yokozuna AAE trees again >>> if the output of `riak-admin search aae-status` still suggests that no >>> recent exchanges have taken place. To do this, run `riak attach` on one >>> node and then >>> >>> riak_core_util:rpc_every_member_ann(yz_entropy_mgr, expire_trees, [], 5000). >>> >>> >>> Exit from the riak console with `Ctrl+G q`. >>> >>> Depending on your settings and amount of data the full index should be >>> rebuilt within the next 2.5 days (for a cluster with ring size 128 and >>> default settings). 
You can monitor the progress with `riak-admin search >>> aae-status` and also in the logs, which should have messages along the >>> lines of >>> >>> 2016-03-24 10:28:25.372 [info] >>> <0.4647.6477>@yz_exchange_fsm:key_exchange:179 Repaired 83055 keys during >>> active anti-entropy exchange of partition >>> 1210306043414653979137426502093171875652569137152 for preflist >>> {1164634117248063262943561351070788031288321245184,3} >>> >>> >>> Re-indexing can put additional strain on the cluster and may cause >>> elevated latency on a cluster already under heavy load. Please monitor the >>> response times while the cluster is re-indexing data. >>> >>> If the cluster load allows it, you can force more rapid re-indexing by >>> changing a few parameters. Again at the `riak attach` console, run >>> >>> riak_core_util:rpc_every_member_ann(application, set_env, [yokozuna, >>> anti_entropy_build_limit, {4, 6}], 5000). >>> riak_core_util:rpc_every_member_ann(application, set_env, [yokozuna, >>> anti_entropy_concurrency, 5], 5000). >>> >>> This will allow up to 4 trees per node to be built/exchanged per hour, >>> with up to 5 concurrent exchanges throughout the cluster. To return back to >>> the default settings, use >>> >>> riak_core_util:rpc_every_member_ann(application, set_env, [yokozuna, >>> anti_entropy_build_limit, {1, 36}], 5000). >>> riak_core_util:rpc_every_member_ann(application, set_env, [yokozuna, >>> anti_entropy_concurrency, 2], 5000). >>> >>> >>> If the cluster still doesn't make any progress with automatically >>> re-indexing data, the next steps are pretty much what you already >>> suggested, to drop the existing index and re-index from scratch. I'm >>> assuming that losing the indexes temporarily is acceptable to you at this >>> point. >>> >>> Using any client API that supports RpbYokozunaIndexDeleteReq, you can >>> drop the index from all Solr instances, losing any data stored there >>> immediately. Next, you'll have to re-create the index. 
I have tried this >>> with the python API, where I deleted the index and re-created it with the >>> same already uploaded schema: >>> >>> from riak import RiakClient >>> >>> c = RiakClient() >>> c.delete_search_index('my_index') >>> c.create_search_index('my_index', 'my_schema') >>> >>> Note that simply deleting the index does not remove it's existing >>> association with any bucket or bucket type. Any PUT operations on these >>> buckets will lead to indexing failures being logged until the index has >>> been recreated. However, this also means that no separate operation in >>> `riak-admin` is required to associate the newly re
Re: Yokozuna inconsistent search results
One interesting moment happened when I tried removing the index:

- this index was associated with a bucket type, called fs_chunks
- so I first called RpbSetBucketTypeReq to set search_index: _dont_index_
- I then tried to remove the index with RpbYokozunaIndexDeleteReq, which failed with "index is in use" and a list of all buckets of the fs_chunks type
- for some reason all these buckets had their own search_index property set to that same index

How can this happen if I definitely never set the search_index property per bucket? On 24 March 2016 at 22:41, Oleksiy Krivoshey wrote: > OK! > > On 24 March 2016 at 21:11, Magnus Kessler wrote: > >> Hi Oleksiy, >> >> On 24 March 2016 at 14:55, Oleksiy Krivoshey wrote: >> >>> Hi Magnus, >>> >>> Thanks! I guess I will go with index deletion because I've already tried >>> expiring the trees before. >>> >>> Do I need to delete AAE data somehow or removing the index is enough? >>> >> >> If you expire the AAE trees with the commands I posted earlier, there >> should be no need to remove the AAE data directories manually. >> >> I hope this works for you. Please monitor the tree rebuild and exchanges >> with `riak-admin search aae-status` for the next few days. In particular >> the exchanges should be ongoing on a continuous basis once all trees have >> been rebuilt. If they don't, please let me know. At that point you should >> also gather `riak-debug` output from all nodes before it gets rotated out >> after 5 days by default. >> >> Kind Regards, >> >> Magnus >> >> >>> >>> On 24 March 2016 at 13:28, Magnus Kessler wrote: >>> >>>> Hi Oleksiy, >>>> >>>> As a first step, I suggest to simply expire the Yokozuna AAE trees >>>> again if the output of `riak-admin search aae-status` still suggests that >>>> no recent exchanges have taken place. To do this, run `riak attach` on one >>>> node and then >>>> >>>> riak_core_util:rpc_every_member_ann(yz_entropy_mgr, expire_trees, [], >>>> 5000). 
>>>> >>>> >>>> Exit from the riak console with `Ctrl+G q`. >>>> >>>> Depending on your settings and amount of data the full index should be >>>> rebuilt within the next 2.5 days (for a cluster with ring size 128 and >>>> default settings). You can monitor the progress with `riak-admin search >>>> aae-status` and also in the logs, which should have messages along the >>>> lines of >>>> >>>> 2016-03-24 10:28:25.372 [info] >>>> <0.4647.6477>@yz_exchange_fsm:key_exchange:179 Repaired 83055 keys during >>>> active anti-entropy exchange of partition >>>> 1210306043414653979137426502093171875652569137152 for preflist >>>> {1164634117248063262943561351070788031288321245184,3} >>>> >>>> >>>> Re-indexing can put additional strain on the cluster and may cause >>>> elevated latency on a cluster already under heavy load. Please monitor the >>>> response times while the cluster is re-indexing data. >>>> >>>> If the cluster load allows it, you can force more rapid re-indexing by >>>> changing a few parameters. Again at the `riak attach` console, run >>>> >>>> riak_core_util:rpc_every_member_ann(application, set_env, [yokozuna, >>>> anti_entropy_build_limit, {4, 6}], 5000). >>>> riak_core_util:rpc_every_member_ann(application, set_env, [yokozuna, >>>> anti_entropy_concurrency, 5], 5000). >>>> >>>> This will allow up to 4 trees per node to be built/exchanged per hour, >>>> with up to 5 concurrent exchanges throughout the cluster. To return back to >>>> the default settings, use >>>> >>>> riak_core_util:rpc_every_member_ann(application, set_env, [yokozuna, >>>> anti_entropy_build_limit, {1, 36}], 5000). >>>> riak_core_util:rpc_every_member_ann(application, set_env, [yokozuna, >>>> anti_entropy_concurrency, 2], 5000). >>>> >>>> >>>> If the cluster still doesn't make any progress with automatically >>>> re-indexing data, the next steps are pretty much what you already >>>> suggested, to drop the existing index and re-index from scratch. 
I'm >>>> assuming that losing the indexes temporarily is acceptable to you at this >>>> point. >>>> >>>> Using any client API that supports RpbYokozu
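The "index is in use" failure described at the top of this message comes down to each bucket carrying its own search_index property instead of inheriting it from the bucket type, so repointing only the type is not enough. A small pure-Python sketch of that check; the helper name, the property dicts, and the index/bucket names are illustrative and not part of any Riak client API.

```python
def buckets_pinning_index(index_name, bucket_type_props, bucket_props_by_name):
    """Return which buckets (or the bucket type itself) still reference index_name.

    Yokozuna refuses to delete an index while any bucket or bucket type
    references it, so every returned entry must be repointed (e.g. to
    '_dont_index_') before RpbYokozunaIndexDeleteReq can succeed.
    """
    pinned = []
    if bucket_type_props.get('search_index') == index_name:
        pinned.append('<bucket type itself>')
    for name, props in bucket_props_by_name.items():
        if props.get('search_index') == index_name:
            pinned.append(name)
    return pinned

# Mirrors the situation in the message: the bucket type was repointed to
# '_dont_index_', but the buckets were not, so deletion still fails.
type_props = {'search_index': '_dont_index_'}
buckets = {'chunk_001': {'search_index': 'fs_index'},
           'chunk_002': {'search_index': 'fs_index'}}
print(buckets_pinning_index('fs_index', type_props, buckets))
# → ['chunk_001', 'chunk_002']
```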
Re: Yokozuna inconsistent search results
Continuation...

The new index has the same inconsistent search results problem.
I was making a snapshot of the `search aae-status` command almost every day.
There were absolutely no Yokozuna errors in the logs.

I can see that some AAE trees were not expired (built > 20 days ago). I can
also see that on two nodes (of 5) the last AAE exchanges happened > 20 days
ago.

For now I have issued `riak_core_util:rpc_every_member_ann(yz_entropy_mgr,
expire_trees, [], 5000).` on each node again. I will wait 10 more days but I
don't think that will fix anything.
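The set_env overrides quoted earlier in the thread only last until the node restarts. If faster tree builds/exchanges are wanted persistently, the same yokozuna application parameters can, to my understanding, be set in advanced.config instead; this is a sketch, with the values and defaults taken from Magnus's message, and they should be tuned to the cluster's load:

```erlang
%% advanced.config (merged with riak.conf settings at startup)
[
 {yokozuna, [
   %% allow up to 4 tree builds/exchanges per node per hour
   %% (default per the thread: {1, 36})
   {anti_entropy_build_limit, {4, 6}},
   %% allow up to 5 concurrent exchanges cluster-wide (default: 2)
   {anti_entropy_concurrency, 5}
 ]}
].
```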
Re: Yokozuna inconsistent search results
How can I check that AAE trees have expired?

Yesterday I ran `riak_core_util:rpc_every_member_ann(yz_entropy_mgr,
expire_trees, [], 5000).` on each node (just to be sure). Still today I see
that on 3 nodes (of 5) all entropy trees and all last AAE exchanges are older
than 20 days.
Re: Yokozuna inconsistent search results
Hi Fred,

Thanks for the internal call tips, I'll dig deeper!

I've attached recent results of `riak-admin search aae-status` from all nodes.

On 5 April 2016 at 22:41, Fred Dushin wrote:

> Hi Oleksiy,
>
> I assume you are getting this information through riak-admin. Can you
> post the results here?
>
> If you want to dig deeper, you can probe the individual hash trees for
> their build time. I will paste a few snippets of erlang here, which I am
> hoping you can extend to use with list comprehensions and rpc:multicalls.
> If that's too much to ask, let us know and I can try to put something
> together that is more "big easy button".
>
> First, on any individual node, you can get the Riak partitions on that
> node, via
>
> (dev1@127.0.0.1)1> Partitons = [P || {_, P, _} <- riak_core_vnode_manager:all_vnodes(riak_kv_vnode)].
> [913438523331814323877303020447676887284957839360,
>  182687704666362864775460604089535377456991567872,
>  1187470080331358621040493926581979953470445191168,
>  730750818665451459101842416358141509827966271488,
>  1370157784997721485815954530671515330927436759040,
>  1004782375664995756265033322492444576013453623296,
>  822094670998632891489572718402909198556462055424,
>  456719261665907161938651510223838443642478919680,
>  274031556999544297163190906134303066185487351808,
>  1096126227998177188652763624537212264741949407232,
>  365375409332725729550921208179070754913983135744,
>  91343852333181432387730302044767688728495783936,
>  639406966332270026714112114313373821099470487552,0,
>  1278813932664540053428224228626747642198940975104,
>  548063113999088594326381812268606132370974703616]
>
> For any one partition, you can get to the Pid associated with the
> yz_index_hashtree associated with that partition, e.g.,
>
> (dev1@127.0.0.1)2> {ok, Pid} = yz_entropy_mgr:get_tree(913438523331814323877303020447676887284957839360).
> {ok,<0.2872.0>}
>
> and from there you can get the state information about the hashtree, which
> includes its build time. You can read the record definitions associated
> with the yz_index_hashtree state by calling rr() on the yz_index_hashtree
> module first, if you want to make the state slightly more readable:
>
> (dev1@127.0.0.1)3> rr(yz_index_hashtree).
> [entropy_data,state,xmerl_event,xmerl_fun_states,
>  xmerl_scanner,xmlAttribute,xmlComment,xmlContext,xmlDecl,
>  xmlDocument,xmlElement,xmlNamespace,xmlNode,xmlNsNode,
>  xmlObj,xmlPI,xmlText]
> (dev1@127.0.0.1)5> sys:get_state(Pid).
> #state{index = 913438523331814323877303020447676887284957839360,
>        built = true,expired = false,lock = undefined,
>        path = "./data/yz_anti_entropy/913438523331814323877303020447676887284957839360",
>        build_time = {1459,801655,506719},
>        trees = [{{867766597165223607683437869425293042920709947392,3},
>                  {state,<<152,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,...>>,
>                   913438523331814323877303020447676887284957839360,3,1048576,
>                   1024,0,
>                   {dict,0,16,16,8,80,48,{[],[],...},{{...}}},
>                   <<>>,
>                   "./data/yz_anti_entropy/913438523331814323877303020447676887284957839360",
>                   <<>>,incremental,[],0,
>                   {array,38837,0,...}}},
>                 {{890602560248518965780370444936484965102833893376,3},
>                  {state,<<156,0,0,0,0,0,0,0,0,0,0,0,0,0,0,...>>,
>                   913438523331814323877303020447676887284957839360,3,1048576,
>                   1024,0,
>                   {dict,0,16,16,8,80,48,{[],...},{...}},
>                   <<>>,
>                   "./data/yz_anti_entropy/913438523331814323877303020447676887284957839360",
>                   <<>>,incremental,[],0,
>                   {array,38837,...}}},
>                 {{913438523331814323877303020447676887284957839360,3},
>                  {state,<<160,0,0,0,0,0,0,0,0,0,0,0,0,0,...>>,
>                   913438523331814323877303020447676887284957839360,3,1048576,
>                   1024,0,
>                   {dict,0,16,16,8,80,48,{...},...},
>                   <<>>,
>                   "./data/yz_anti_entropy/913438523331814323877303020447676887284957839360",
>                   <<>>,incremental,[],0,
>                   {array,...}}}],
>        closed = false}
>
> You can convert the timestamp to local time via:
>
> (dev1@127.0.0.1)8> calendar:now_to_local_time({1459,801655,506719}).
> {{2016,4,4},{16,27,35}}
>
> Again, this is just an e
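The `build_time` field in the `sys:get_state(Pid)` output is an Erlang `{MegaSecs, Secs, MicroSecs}` tuple, which can also be decoded outside the Riak console. A small sketch in Python; the helper name is ours, and the tuple is the one from Fred's transcript (his `calendar:now_to_local_time/1` result of 16:27:35 corresponds to a UTC-4 local zone):

```python
from datetime import datetime, timezone

def erlang_now_to_datetime(mega, secs, micro):
    """Convert an Erlang {MegaSecs, Secs, MicroSecs} tuple to a UTC datetime."""
    epoch_seconds = mega * 1_000_000 + secs + micro / 1_000_000
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)

# build_time from the quoted sys:get_state(Pid) output:
built = erlang_now_to_datetime(1459, 801655, 506719)
print(built.isoformat())  # → 2016-04-04T20:27:35.506719+00:00
```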
Re: Yokozuna inconsistent search results
So after 2 more days I can still see AAE trees that haven't been rebuilt for
30 days, and I can also see that some trees didn't have exchanges for the
same period. I still have inconsistent search results from Yokozuna.

To summarise this long discussion:

- I have fixed all Solr indexing issues, though none of them was related to the search index in question
- there were no Solr indexing issues for 30 days
- the search schema for this index doesn't have any fields besides the required _yz_* ones
- I have dropped and re-created this search index two times
- I have tried expiring AAE trees 3 times
- some (quite a lot) of AAE trees don't have exchanges and are not rebuilt

Last output of `search aae-status` from all nodes attached.