Playing with / understanding Riak configurations

2016-07-27 Thread Vikram Lalit
Hi - I have a Riak node with n_val=3, r=2, w=2 and have just one key-object
stored therein. I'm trying to test various configurations to better
understand the system and have the following observations. Some don't seem
to align with my understanding so far, so I'd appreciate it if someone could
shed some light, please... Thanks!

1. n=3, r=2, w=2: Base state, 1 key-value pair.

2. Change to n=2, r=2, w=2: When I query from my client, I randomly see 1
or 2 values being fetched. In fact, the number of keys fetched is 1 or 2,
randomly changing each time the client queries the db. I would have
expected that reducing n_val would cause data loss on one of the vnodes,
and that in this scenario only 1 (remaining) key-value pair would be read
from the two remaining vnodes that have the data. Note that I don't intend
to make such a change in production, as I'm aware of the recommendation
never to decrease the value of n; I have done so only to test out the
details.

3. Then change to n=2, r=1, w=1: I get the same alternating result as
above, i.e. 1 or 2 values being fetched.

4. Then change to n=1, r=1, w=1: I get 3 key-value pairs, all identical,
from the database. Again, are these all siblings?
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: Playing with / understanding Riak configurations

2016-07-27 Thread Tom Santero
Vikram,

John Daily wrote a fantastic blog series that places your question in
context and then answers it.

http://basho.com/posts/technical/understanding-riaks-configurable-behaviors-part-1/

Tom



Re: Playing with / understanding Riak configurations

2016-07-27 Thread Vikram Lalit
Thanks Tom... Yes, I did read that, but I couldn't deduce the outcome if n is
decreased. John talks about data loss, but I am actually observing a
different result... perhaps I am missing something!




riak TS max concurrent queries + overload error

2016-07-27 Thread Chris.Johnson
Hello!

We are experiencing error messages from the client that we don’t totally 
understand. They look like the following:



Checking the riak error and crash logs, I’m seeing “overload” errors, which I
assume are causing the “no response from backend” client errors:

{error,
 badarg,
 [{erlang,iolist_to_binary,[overload],[]},
  {riak_kv_ts_svc,make_rpberrresp,2,[{file,"src/riak_kv_ts_svc.erl"},{line,483}]},
  {riak_kv_ts_svc,sub_tsqueryreq,4,[{file,"src/riak_kv_ts_svc.erl"},{line,445}]},
  {riak_kv_pb_ts,process,2,[{file,"src/riak_kv_pb_ts.erl"},{line,71}]},
  {riak_api_pb_server,process_message,4,[{file,"src/riak_api_pb_server.erl"},{line,388}]},
  {riak_api_pb_server,connected,2,[{file,"src/riak_api_pb_server.erl"},{line,226}]},
  {riak_api_pb_server,decode_buffer,2,[{file,...},...]},...]}

I’m curious whether these overload errors are caused by clients requesting more
concurrent TS queries than our current setting for
timeseries_max_concurrent_queries allows, OR whether
timeseries_max_concurrent_queries is set too high and we are causing riak to
crash.

Do you have any recommendations on what timeseries_max_concurrent_queries
should be set to relative to hardware specs? I assume it should be limited
based on disk I/O bandwidth.

Also, does anyone have any recommendations on query pooling so we can guarantee 
that multiple clients will not generate more queries than the cluster can 
handle? I like HAProxy for HTTP connection pooling but it doesn’t seem like it 
would work well for limiting the number of global queries from multiple PBC 
clients.
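
As a rough illustration of the pooling question, here is a minimal Python sketch of a per-process cap on in-flight queries. It only limits one client process, so it does not give the global guarantee asked about; splitting a cluster-wide budget evenly across clients is one simple (assumed) way to approximate it. QueryLimiter, per_client_share, and their parameters are hypothetical names, not part of any Riak client.

```python
import threading


class QueryLimiter:
    """Caps how many queries this process has in flight at once."""

    def __init__(self, max_in_flight):
        self._slots = threading.BoundedSemaphore(max_in_flight)

    def run(self, query_fn, *args, **kwargs):
        # Blocks until a slot is free, then runs the query and releases the slot.
        with self._slots:
            return query_fn(*args, **kwargs)


def per_client_share(cluster_limit, client_count):
    """Evenly split a cluster-wide query budget across clients.

    Never returns less than 1, so with more clients than budget the
    combined total can still exceed cluster_limit.
    """
    return max(1, cluster_limit // client_count)
```

A client would wrap each TS query call in limiter.run(...), with the limiter sized from per_client_share(cluster_limit, client_count).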

Thank you!

Chris


Re: Playing with / understanding Riak configurations

2016-07-27 Thread Tom Santero
Vikram,

I apologize, I initially just skimmed your question and thought you were
asking something entirely different.

While increasing your N value is safe, decreasing it on a bucket with
pre-existing data, as you have, is not recommended and is the source of your
inconsistent results.
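
A toy model may help show why a smaller n_val could surface extra copies rather than fewer. It assumes (this is not confirmed anywhere in this thread) that a key listing visits roughly every n_val-th partition, on the expectation that each key has exactly n_val consecutive replicas. Replicas written earlier under n_val=3 are still on disk, so a plan built for a smaller n_val can hit more than one of them. RING_SIZE, the function names, and the offset parameter are all hypothetical, and real Riak coverage planning is more involved than this.

```python
RING_SIZE = 12   # hypothetical toy ring; real Riak rings are powers of two
ORIGINAL_N = 3   # n_val in effect when the key was written


def replicas(first_partition, n, ring_size=RING_SIZE):
    """Partitions holding a key written with n consecutive replicas."""
    return {(first_partition + i) % ring_size for i in range(n)}


def covering_partitions(n, offset, ring_size=RING_SIZE):
    """Toy coverage plan: visit every n-th partition, starting at offset."""
    return {p for p in range(ring_size) if (p - offset) % n == 0}


def copies_seen(first_partition, current_n, offset):
    """How many stored replicas a listing planned for current_n would hit."""
    stored = replicas(first_partition, ORIGINAL_N)
    return len(stored & covering_partitions(current_n, offset))


def possible_counts(current_n, first_partition=4):
    """Distinct copy counts across all starting offsets of the toy plan."""
    return sorted({copies_seen(first_partition, current_n, off)
                   for off in range(current_n)})


for n in (3, 2, 1):
    print(f"n_val={n}: copies seen = {possible_counts(n)}")
```

In this model the count is always 1 at n_val=3, 1 or 2 at n_val=2 depending on the offset, and 3 at n_val=1, which matches the pattern reported in the original question.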

Tom



Re: riak TS max concurrent queries + overload error

2016-07-27 Thread Cian Synnott
Hi Chris,

This sounds like the issue described at
  https://github.com/basho/riak_kv/issues/1418

On Wed, Jul 27, 2016 at 11:19 PM,   wrote:
> Also, does anyone have any recommendations on query pooling so we can
> guarantee that multiple clients will not generate more queries than the
> cluster can handle?
>
Probably the right thing to do (when the RPC server is fixed) is to
have the clients independently check for backpressure from Riak (e.g.
overload messages like this), retry with exponential backoff, and have
each retry increment a counter somewhere in your monitoring system to
make that problem visible.

This should allow you to handle overload (somewhat) gracefully,
respond to critical events (e.g. an alert), or to see any overload
trends over time.
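
The approach described above could be sketched in Python roughly as follows. This is an illustration under assumptions, not a Riak client API: OverloadError stands in for whatever error the client surfaces once overload is reported cleanly, retry_counter stands in for a real monitoring counter, and the delay values are arbitrary. Random jitter is added to the backoff so that clients do not all retry at the same moment.

```python
import random
import time


class OverloadError(Exception):
    """Hypothetical stand-in for the client's overload error."""


# Stand-in for a counter in a monitoring system (assumption).
retry_counter = {"overload_retries": 0}


def query_with_backoff(run_query, max_attempts=5, base_delay=0.1,
                       max_delay=5.0, sleep=time.sleep):
    """Run run_query(), retrying on overload with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return run_query()
        except OverloadError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: let the caller see the overload
            retry_counter["overload_retries"] += 1  # make retries visible
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, delay))  # jittered wait before retrying
```

Alerting on the rate of change of the counter, rather than its absolute value, is one way to tell a brief spike from sustained overload.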

Cian



Re: riak TS max concurrent queries + overload error

2016-07-27 Thread Chris.Johnson
Hi Cian,

Thank you! I should've mentioned in my initial email that I thought we were 
experiencing the same bug you called out (in fact the 2nd comment on that 
github issue is actually from me).

So, what I'm really curious about is whether or not the original "overload" 
error is happening because we're hitting the limit on TS max concurrent queries 
or if riak is actually "overloaded" and we shouldn't increase the configuration 
value for max concurrent queries.

I'd like to know whether or not I should expect a certain value for max 
concurrent queries to be stable and performant for some given hardware specs. 
This is an experiment that we will probably run in house to determine a good 
value, but it would be great to know what range is expected to perform well.

Also, I have no idea if the max concurrent queries setting includes subqueries 
over multiple quanta. For instance, if I have 4 TS queries hitting a riak node 
configured for 12 max queries and each query spans 3 - 4 quanta, should I
expect an "overload" error?
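
To make the arithmetic behind that question concrete, here is a small Python sketch. It assumes, purely for illustration, that every per-quantum subquery counts against the limit; whether Riak TS actually counts them that way is exactly the open question. The function name is hypothetical.

```python
MAX_CONCURRENT_QUERIES = 12  # the node limit from the example above


def subqueries_in_flight(queries, quanta_per_query):
    """Total subqueries if each quantum spanned becomes one subquery."""
    return queries * quanta_per_query


for quanta in (3, 4):
    total = subqueries_in_flight(4, quanta)
    over = total > MAX_CONCURRENT_QUERIES
    print(f"4 queries x {quanta} quanta = {total} subqueries, over limit: {over}")
```

Under that assumption, 4 queries spanning 3 quanta each would sit exactly at the limit of 12, while 4 queries spanning 4 quanta each would reach 16 and exceed it.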

Thank you for the advice on implementing client backoff! Hopefully, we can do 
that as well as increase the overall TS query capacity of our cluster with a 
simple configuration change. I'm suspicious that we have a very conservative 
value at the moment.

Chris
