Re: Warning "Can not start proc_lib:init_p"

2013-04-04 Thread Ingo Rockel

Hi Evan,

thanks for all the info! I adjusted the leveldb config as suggested, 
except the cache, which I reduced to 16MB; keeping this above the 
default helped a lot, at least during load testing. And I added +P 130072 
to the vm.args. It will be applied to the Riak nodes in the next hours.


We have monitoring using Zabbix, but haven't included the object 
sizes so far; they will be added today.


We double-checked the Linux Performance Tuning doc to be sure everything is 
applied to the nodes, especially as the problems are always caused by 
the same three nodes. But everything looks fine.


Ingo

On 03.04.2013 18:42, Evan Vigil-McClanahan wrote:

Another engineer mentioned that you posted your eleveldb section and I
totally missed it:

The eleveldb section:

  %% eLevelDB Config
  {eleveldb, [
      {data_root, "/var/lib/riak/leveldb"},
      {cache_size, 33554432},             %% 32 MB in bytes
      {write_buffer_size_min, 67108864},  %% 64 MB in bytes
      {write_buffer_size_max, 134217728}, %% 128 MB in bytes
      {max_open_files, 4000}
  ]},

This is likely going to make you unhappy as time goes on: since all of
those settings are per-vnode, your max memory utilization is well
beyond your physical memory.  I'd remove the tunings for the caches
and buffers and drop max open files to 500, perhaps.  Make sure that
you've followed everything in:
http://docs.basho.com/riak/latest/cookbooks/Linux-Performance-Tuning/,
etc.

On Wed, Apr 3, 2013 at 9:33 AM, Evan Vigil-McClanahan wrote:

Again, all of these things are signs of large objects, so if you could
track the object_size stats on the cluster, I think that we might see
something.  Even if you have no monitoring, a simple shell script
curling /stats/ on each node once a minute should do the job for a day
or two.
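
(A rough sketch of such a script; the node list, HTTP port, and output
paths are assumptions to adapt:)

  #!/bin/sh
  # Poll /stats on each node once a minute and append the JSON,
  # one snapshot per line, to a per-node log file.
  NODES="riak01 riak02 riak03"
  while true; do
      for n in $NODES; do
          curl -s "http://$n:8098/stats" >> "stats-$n.log"
          echo "" >> "stats-$n.log"
      done
      sleep 60
  done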

On Wed, Apr 3, 2013 at 9:29 AM, Ingo Rockel wrote:

We just had it again (around this time of the day we have our highest user
activity).

I will set +P to 131072 tomorrow; anything else I should check or change?

What about this memory-high-watermark which I get sporadically?

Ingo

On 03.04.2013 17:57, Evan Vigil-McClanahan wrote:


As for +P, it's been raised in R16 (which is what the current man page
describes); on R15 it's only 32k.

The behavior that you're describing does sound like a very large
object getting put into the cluster (which may cause backups and push
you up against the process limit, could have caused scheduler collapse
on 1.2, etc.).

On Wed, Apr 3, 2013 at 8:39 AM, Ingo Rockel wrote:


Evan,

sys_process_count is somewhere between 5k and 11k on the nodes right now.
Concerning your suggested +P config: according to the Erlang docs, the
default for this param is already 262144, so setting it to 655536 would
in fact lower it?

We chose the ring size to be able to handle growth, which was the main
reason to switch from MySQL to NoSQL/Riak. We have 12 nodes, so about
86 vnodes per node.

No, we don't monitor object sizes. The majority of objects are very small
(below 200 bytes), but we have objects storing references to these small
objects, which might grow to a few megabytes in size; most of these are
paged though and should not exceed one megabyte. Only one type is not
paged (for implementation reasons).

The outgoing/incoming traffic is constantly around 100 MBit; when the
performance drops happen, we suddenly see spikes up to 1 GBit. And these
spikes consistently happen on the same three nodes as long as the
performance drop lasts.

Ingo

On 03.04.2013 17:12, Evan Vigil-McClanahan wrote:


Ingo,

riak-admin status | grep sys_process_count

will tell you how many processes are running.  The default process
limit in Erlang is a little low, and we'd suggest raising it
(especially with your extra-large ring_size).  Erlang processes are
cheap, so 65535 or even double that will be fine.
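
(For reference, the limit is raised with the +P flag in vm.args; the
value below is only an example:)

  ## raise the Erlang process limit -- example value, size it to your load
  +P 131072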

Busy dist ports are still worrying.  Are you monitoring object sizes?
Are there any spikes there associated with performance drops?

On Wed, Apr 3, 2013 at 8:03 AM, Ingo Rockel wrote:



Hi Evan,

I set swt to very_low and zdbbl to 64MB; setting these params helped
reduce the busy_dist_port and "Monitor got {suppressed,..." messages a
lot. But when the performance of the cluster suddenly drops we still
see these messages.

The cluster was updated to 1.3 in the meantime.

The eleveldb section:

%% eLevelDB Config
{eleveldb, [
    {data_root, "/var/lib/riak/leveldb"},
    {cache_size, 33554432},             %% 32 MB in bytes
    {write_buffer_size_min, 67108864},  %% 64 MB in bytes
    {write_buffer_size_max, 134217728}, %% 128 MB in bytes
    {max_open_files, 4000}
]},

The ring size is 1024 and the machines have 48GB of memory. Concerning
the params from vm.args:

-env ERL_MAX_PORTS 4096
-env ERL_MAX_ETS_TABLES 8192

+P isn't set

Ingo

On 03.04.2013 16:53, Evan Vigil-McClanahan wrote:


For your prior mail, I thought that someone had answered.  Our initial
suggestion was to add +swt very_low to your vm.args,

Comparison against DynamoDB

2013-04-04 Thread David Koblas
Spent some time with the AWS folks the other day and was getting sold on 
using DynamoDB for some of our large key-value store needs. However, 
given the read/write economics of DynamoDB vs. instance+storage costs on 
Riak, I was wondering if anybody has done good thinking around where the 
cost inflection points are?


Also, before I go and benchmark things: how does Riak perform with 2B 
entries which are < 1K in size? When I last did the benchmarks - just 
before 1.0 - there were a few issues.


Thanks,
David





Re: Warning "Can not start proc_lib:init_p"

2013-04-04 Thread Evan Vigil-McClanahan
If it's always the same three nodes it could well be the same very large
object being updated each day.  Is there anything else that looks
suspicious in your logs?  Another sign of large objects is large_heap
(or long_gc) messages from riak_sysmon.
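
(A quick way to check is to grep the node's logs for those terms; the
log path below is an assumption:)

  grep -E 'large_heap|long_gc' /var/log/riak/console.log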


Re: Warning "Can not start proc_lib:init_p"

2013-04-04 Thread Ingo Rockel

Hi Evan,

we added monitoring of the object sizes and there was one object on one 
of the three nodes mentioned which was > 2GB!!


We just changed the application code to get the id of this object so we 
are able to delete it. But it does happen only about once a day.


Right now we have another node constantly crashing with OOM about 12 
minutes after start (always the same time frame); could this be related 
to the big-object issue? It is not one of the three nodes. The node logs 
that a lot of handoff receiving is going on.


Again, thanks for the help!

Regards,

Ingo


Re: Warning "Can not start proc_lib:init_p"

2013-04-04 Thread Ingo Rockel
The node crashes seem to have been caused by the raised +P param; after 
the last crash I commented out the param and now the node runs just fine.



Re: Do nodes always need to restart after backend selection?

2013-04-04 Thread Jared Morrow
Toby,

That particular page is talking about changing the default settings of the
backend of a bucket.  In that specific case, if you want to change the
default behavior in your app.config file, a restart is necessary.  One
particularly important detail there is that you don't need to restart *all*
nodes at the same time.  Restarting one node at a time is sufficient and
recommended so you don't have any cluster downtime.

For setting common bucket properties, you do not need to restart the node.
If you want to change the n_val of a bucket, for instance, you can just
change it from your client on all nodes.  That page explains at the bottom
how to set them on the Erlang console or via curl, but most people use
their chosen client to set bucket properties before writing values.  Here
is an example using the Java client:
http://docs.basho.com/java/latest/cookbooks/buckets/.  In general it
doesn't matter whether your client supports HTTP or protocol buffers; both
APIs http://docs.basho.com/riak/latest/references/apis/ support bucket
property changes.
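
(Over HTTP, for instance, a bucket property change is a single PUT
against the bucket's props resource; the host, port, and bucket name
below are placeholders:)

  curl -XPUT -H "Content-Type: application/json" \
       -d '{"props":{"n_val":5}}' \
       http://127.0.0.1:8098/buckets/mybucket/props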

Hope that helps,
Jared

On Wed, Apr 3, 2013 at 10:14 PM, Toby Corkindale <
toby.corkind...@strategicdata.com.au> wrote:

> Hi,
> According to the docs at the following URL, it is necessary to reboot all
> Riak nodes after setting the bucket property for backend.
> This seems really drastic, and we'd like to avoid having to do this!
> See:
> http://docs.basho.com/riak/1.3.0/tutorials/choosing-a-backend/Multi/
>
> I wondered if the restart of the whole cluster can be avoided? Perhaps we
> could set the bucket properties prior to setting any keys within it?
>
> Thanks in advance,
> Toby
>


Re: simulate a transaction

2013-04-04 Thread Sean Cribbs
Brisa,

You cannot simulate transactions, really (see also
http://aphyr.com/posts/254-burn-the-library). However, if you want to
receive notifications and take action when something happens, you can
add a postcommit hook (see
http://docs.basho.com/riak/latest/references/appendices/concepts/Commit-Hooks/).
In the past, developers have added hooks that push updates into a
RabbitMQ broker or ZeroMQ channel.
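
(A minimal sketch of such a hook; the module name is hypothetical, and
the module has to be compiled, on the node's code path, and registered
in the bucket's postcommit property:)

  -module(my_hooks).
  -export([notify/1]).

  %% Postcommit hooks receive the written riak_object; the return value
  %% is ignored, so a failure here never blocks the write itself.
  notify(Object) ->
      Bucket = riak_object:bucket(Object),
      Key = riak_object:key(Object),
      error_logger:info_msg("object written: ~p/~p~n", [Bucket, Key]),
      ok.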

On Wed, Apr 3, 2013 at 7:59 PM, Brisa Jiménez  wrote:
> Hi,
> I want to simulate a transaction.
> I want to know when a riak operation happens.
> I know that riak isn't a relational database, but it is important for me in
> particular cases to undo operations when something goes wrong.
> Does anyone have any suggestions about what I want to do?
> Thank you very much.
>



-- 
Sean Cribbs 
Software Engineer
Basho Technologies, Inc.
http://basho.com/



error running Erlang m/r job

2013-04-04 Thread Tom Zeng
Hi everyone,

I am trying to run the Erlang m/r following the Riak Handbook, and got the
following error:

(riak@127.0.0.1)4> ExtractTweet = fun(RObject, _, _) ->
(riak@127.0.0.1)4>  {struct, Obj} = mochijson2:decode(
(riak@127.0.0.1)4>riak_object:get_value(RObject)),
(riak@127.0.0.1)4>  [proplists:get_value(<<"tweet">>, Obj)]
(riak@127.0.0.1)4> end.
#Fun
(riak@127.0.0.1)5> C:mapred([{<<"tweets">>, <<"41399579391950848">>}],
(riak@127.0.0.1)5>   [{map, {qfun, ExtractTweet}, none, true}]).
** exception error: undefined function riak_client:mapred/3

I've been using JavaScript for m/r and just started using Erlang per Basho
engineers' recommendation at Riak DC meetup. Any help/pointers appreciated.

Thanks
Tom
-- 
Tom Zeng
Director of Engineering
Intridea, Inc. | www.intridea.com
t...@intridea.com


Re: Warning "Can not start proc_lib:init_p"

2013-04-04 Thread Ingo Rockel
A grep for "too many processes" didn't reveal anything. The process got 
killed by the OOM killer.
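
(For what it's worth, the kernel's side of an OOM kill can usually be
confirmed from its ring buffer; a sketch, since the exact message
wording varies by kernel:)

  dmesg | grep -i -E 'oom-killer|out of memory'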


On 04.04.2013 16:12, Evan Vigil-McClanahan wrote:

That's odd.  Was it getting killed by the OOM killer, or crashing
because it couldn't allocate more memory?  That's suggestive of
something else that's wrong, since the +P doesn't do any memory
limiting.  Are you getting 'too many processes' emulator errors on
that node?


Re: simulate a transaction

2013-04-04 Thread Guido Medina
At least in a two-phase-commit-enabled environment you can implement a 
rollback to "undo" your action. You expect things to go right and a very 
small percentage to go wrong, so implementing a rollback policy isn't 
such a bad idea. I had to do the same years ago for a payment client: 
when things went wrong on our DB, we sent a revert payment, since once 
the payment was gone it had no transactional behaviour, so we had to 
implement a rollback policy.


Hope that helps,

Guido.



Re: simulate a transaction

2013-04-04 Thread Guido Medina
Also, you might not need a notification when the Riak operation succeeds 
if you set a high default N value. Let's say, with 5 nodes, an N value of 
3 should give you a good safe bet, meaning the Riak client will return 
successfully once the key has been written to at least 3 nodes; that 
leaves only the rollback policy implementation.


I know I'm suggesting the opposite (assume things went OK and take 
action when they go wrong), if that makes sense for your application.

Guido.



Re: [ANNC] Riak 1.3.1

2013-04-04 Thread Dave Brady


Hi Jared,

I don't see these patches, which I have applied to our installation of 1.3,
explicitly mentioned in the Release Notes:

Fix bug where stats endpoints were calculating _all_ riak_kv stats:
https://github.com/basho/riak_kv/blob/9be3405e53acf680928faa6c70d265e86c75a22c/src/riak_kv_stat_bc.erl

Every read triggers a read-repair when Last-write-wins=true:
https://github.com/basho/riak_kv/pull/334

Can you confirm whether or not they made it into 1.3.1, please? 

-- 
Dave Brady 

- Original Message - 
From: "Jared Morrow"  
To: "Riak Users Mailing List"  
Sent: Wednesday, April 3, 2013 11:53:45 PM GMT +01:00 Amsterdam / Berlin / Bern 
/ Rome / Stockholm / Vienna 
Subject: Re: [ANNC] Riak 1.3.1 

I hesitate to reply to my own email, but just wanted to point out that this 
issue https://github.com/basho/riak_core/pull/281 listed in the release notes 
should help all of you who had issues with slow bitcask startup times in 1.3.0. 
If you see or don't see improvements let us know. 


Thanks, 
Jared 


On Wed, Apr 3, 2013 at 3:16 PM, Jared Morrow < ja...@basho.com > wrote: 



Riak Users, 


We are happy to announce that Riak 1.3.1 is ready for you to download and
install. It continues on the 1.3.x family with some nice bugfixes. See the 
release notes linked below for all the details. 


Release notes can be found here: 
https://github.com/basho/riak/blob/1.3/RELEASE-NOTES.md 

Downloads available on our docs page: 
http://docs.basho.com/riak/1.3.1/downloads/ 


Thanks as always for being the best community in open source, 
-Everyone at Basho 



Re: Warning "Can not start proc_lib:init_p"

2013-04-04 Thread Evan Vigil-McClanahan
One last note for 1.3.  Please make sure that the following line is in
your vm.args:
-env ERL_MAX_ETS_TABLES 819

This is a good idea for all systems but is especially important for
people with large rings.

Were there any other messages?  Riak constantly spawns new processes,
but they don't tend to build up unless the backend is misbehaving (or
a few other less likely conditions), and a backup of spawned processes
is the only thing I can think of that would make +P help with OOM
issues.


Re: Comparison against DynamoDB

2013-04-04 Thread Evan Vigil-McClanahan
I can't speak to the costing issues, as that isn't something I am
terribly familiar with, but at the moment Riak still has some
overhead issues with very small values.  There are upcoming
optimizations in the next major (1.4) release that should help.  What
issues did you run into?



Re: error running Erlang m/r job

2013-04-04 Thread Evan Vigil-McClanahan
As of 1.3 the old client:mapreduce is deprecated; please use
`riak_kv_mrc_pipe:mapred` instead.
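
(For example, the qfun query from the earlier message would look
something like this on the Riak console; a sketch assuming the
riak_kv_mrc_pipe:mapred/2 form:)

(riak@127.0.0.1)6> riak_kv_mrc_pipe:mapred(
(riak@127.0.0.1)6>   [{<<"tweets">>, <<"41399579391950848">>}],
(riak@127.0.0.1)6>   [{map, {qfun, ExtractTweet}, none, true}]).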



Re: Warning "Can not start proc_lib:init_p"

2013-04-04 Thread Evan Vigil-McClanahan
Major error on my part here!

> your vm.args:
> -env ERL_MAX_ETS_TABLES 819

This should be

-env ERL_MAX_ETS_TABLES 8192

Sorry for the sloppy cut and paste.  Please do not do the former
thing, or it will be very bad.


Re: Warning "Can not start proc_lib:init_p"

2013-04-04 Thread Ingo Rockel
Thanks, but it was a very obvious c&p error :) and we already have 
ERL_MAX_ETS_TABLES set to 8192, as it is in the default vm.args.


The only other messages were about a lot of handoff going on.

Maybe the node was getting some data concerning the 2GB object?

Ingo


Re: Warning "Can not start proc_lib:init_p"

2013-04-04 Thread Evan Vigil-McClanahan
Possible, but I'd need more information to make a guess.  I'd keep a
close eye on that node.


Re: [ANNC] Riak 1.3.1

2013-04-04 Thread Engel Sanchez
Hi Dave,

The stats calculation was fixed in 1.3.1, but the read-repair fix for
Last-write-wins=true was not backported. That one will make it into 1.4,
which is scheduled for the near future. I hope that helps.

-- 
Engel Sanchez

On Thu, Apr 4, 2013 at 11:05 AM, Dave Brady  wrote:

> Hi Jared,
>
> I don't see these patches, which I have applied to our installation of
> 1.3, explicitly mentioned in the Release Notes:
>
> Fix bug where stats endpoints were calculating _all_ riak_kv stats:
>
> https://github.com/basho/riak_kv/blob/9be3405e53acf680928faa6c70d265e86c75a22c/src/riak_kv_stat_bc.erl
>
> Every read triggers a read-repair when Last-write-wins=true
> https://github.com/basho/riak_kv/pull/334
>
> Can you confirm whether or not they made it into 1.3.1, please?
>
> --
> Dave Brady
>
>
> - Original Message -
> From: "Jared Morrow" 
> To: "Riak Users Mailing List" 
> Sent: Wednesday, April 3, 2013 11:53:45 PM GMT +01:00 Amsterdam / Berlin /
> Bern / Rome / Stockholm / Vienna
> Subject: Re: [ANNC] Riak 1.3.1
>
> I hesitate to reply to my own email, but just wanted to point out that
> this issue https://github.com/basho/riak_core/pull/281 listed in the
> release notes should help all of you who had issues with slow bitcask
> startup times in 1.3.0.  If you see or don't see improvements let us know.
>
> Thanks,
> Jared
>
> On Wed, Apr 3, 2013 at 3:16 PM, Jared Morrow  wrote:
>
>> Riak Users,
>>
>> We are happy to announce that Riak 1.3.1 is ready for you to download
>> and install.  It continues on the 1.3.x family with some nice bugfixes.
>>  See the release notes linked below for all the details.
>>
>> Release notes can be found here:
>> https://github.com/basho/riak/blob/1.3/RELEASE-NOTES.md
>>
>> Downloads available on our docs page:
>> http://docs.basho.com/riak/1.3.1/downloads/
>>
>> Thanks as always for being the best community in open source,
>> -Everyone at Basho
>>
>
>


Re: [ANNC] Riak 1.3.1

2013-04-04 Thread Jordan West
Hi Dave,

Building on what Engel said, the stats change was merged as part of a
larger squashed commit [1].

Jordan

[1]
https://github.com/basho/riak_kv/commit/fd2e527378a7fa284605b131c4d02ee5c28d229d

On Thu, Apr 4, 2013 at 9:00 AM, Engel Sanchez  wrote:

> Hi Dave,
>
> The stats calculation was fixed in 1.3.1, but the read-repair with
> Last-write-wins=true was not backported. That one will make it to 1.4,
> which is scheduled in the near future. I hope that helps.
>
> --
> Engel Sanchez
>
>
> On Thu, Apr 4, 2013 at 11:05 AM, Dave Brady  wrote:
>
>> Hi Jared,
>>
>> I don't see these patches, which I have applied to our installation of
>> 1.3, explicitly mentioned in the Release Notes:
>>
>> Fix bug where stats endpoints were calculating _all_ riak_kv stats:
>>
>> https://github.com/basho/riak_kv/blob/9be3405e53acf680928faa6c70d265e86c75a22c/src/riak_kv_stat_bc.erl
>>
>> Every read triggers a read-repair when Last-write-wins=true
>> https://github.com/basho/riak_kv/pull/334
>>
>> Can you confirm whether or not they made it into 1.3.1, please?
>>
>> --
>> Dave Brady
>>
>>
>> - Original Message -
>> From: "Jared Morrow" 
>> To: "Riak Users Mailing List" 
>> Sent: Wednesday, April 3, 2013 11:53:45 PM GMT +01:00 Amsterdam / Berlin
>> / Bern / Rome / Stockholm / Vienna
>> Subject: Re: [ANNC] Riak 1.3.1
>>
>> I hesitate to reply to my own email, but just wanted to point out that
>> this issue https://github.com/basho/riak_core/pull/281 listed in the
>> release notes should help all of you who had issues with slow bitcask
>> startup times in 1.3.0.  If you see or don't see improvements let us know.
>>
>> Thanks,
>> Jared
>>
>> On Wed, Apr 3, 2013 at 3:16 PM, Jared Morrow  wrote:
>>
>>> Riak Users,
>>>
>>> We are happy to announce that Riak 1.3.1 is ready for you to download
>>> and install.  It continues on the 1.3.x family with some nice bugfixes.
>>>  See the release notes linked below for all the details.
>>>
>>> Release notes can be found here:
>>> https://github.com/basho/riak/blob/1.3/RELEASE-NOTES.md
>>>
>>> Downloads available on our docs page:
>>> http://docs.basho.com/riak/1.3.1/downloads/
>>>
>>> Thanks as always for being the best community in open source,
>>> -Everyone at Basho
>>>
>>
>>


Re: Warning "Can not start proc_lib:init_p"

2013-04-04 Thread Ingo Rockel
Thanks a lot for pointing me in the right direction (the huge object); it 
would have taken a lot longer for me to find that out myself!


Am 04.04.2013 17:51, schrieb Evan Vigil-McClanahan:

Possible, but would need more information to make a guess.  I'd keep a
close eye on that node.

On Thu, Apr 4, 2013 at 10:34 AM, Ingo Rockel
 wrote:

thanks, but it was a very obvious c&p error :) and we already have the
ERL_MAX_ETS_TABLES set to 8192 as it is in the default vm.args.

The only other messages were about a lot of handoff going on.

Maybe the node was getting some data concerning the 2GB object?

Ingo

Am 04.04.2013 17:25, schrieb Evan Vigil-McClanahan:


Major error on my part here!


your vm.args:
-env ERL_MAX_ETS_TABLES 819



This should be

-env ERL_MAX_ETS_TABLES 8192

Sorry for the sloppy cut and paste.  Please do not do the former
thing, or it will be very bad.
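
A quick way to double-check what a node will actually read at boot (the
vm.args path here is the Linux package default; adjust for your install):

    grep ERL_MAX_ETS_TABLES /etc/riak/vm.args
    # expect: -env ERL_MAX_ETS_TABLES 8192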


This is a good idea for all systems but is especially important for
people with large rings.

Were there any other messages?  Riak constantly spawns new processes,
but they don't tend to build up unless the backend is misbehaving (or
a few other less likely conditions), and a backup of spawned processes
is the only thing I can think of that would make +P help with OOM
issues.

On Thu, Apr 4, 2013 at 9:21 AM, Ingo Rockel
 wrote:


A grep for "too many processes" didn't reveal anything. The process got
killed by the oom-killer.

Am 04.04.2013 16:12, schrieb Evan Vigil-McClanahan:


That's odd.  It was getting killed by the OOM killer, or crashing
because it couldn't allocate more memory?  That's suggestive of
something else that's wrong, since the +P doesn't do any memory
limiting.  Are you getting 'too many processes' emulator errors on
that node?

On Thu, Apr 4, 2013 at 8:47 AM, Ingo Rockel
 wrote:



The crashing node seems to have been caused by the raised +P param; after the
last crash I commented out the param and now the node runs just fine.

Am 04.04.2013 15:43, schrieb Ingo Rockel:


Hi Evan,

we added monitoring of the object sizes and there was one object on
one
of the three nodes mentioned which was > 2GB!!

We just changed the application code to get the id of this object to
be able to delete it. But it does happen only about once a day.

Right now we have another node constantly crashing with OOM about 12
minutes after start (always the same time frame); could this be related
to the big object issue? It is not one of the three nodes. The node
logs that a lot of handoff receiving is going on.

Again, thanks for the help!

Regards,

Ingo

Am 04.04.2013 15:30, schrieb Evan Vigil-McClanahan:




If it's always the same three nodes it could well be the same very large
object being updated each day.  Is there anything else that looks
suspicious in your logs?  Another sign of large objects is
large_heap
(or long_gc) messages from riak_sysmon.
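
For reference, a quick scan for those sysmon messages (the console.log
path is an assumption; adjust to wherever your platform puts Riak's logs):

    grep -E 'large_heap|long_gc' /var/log/riak/console.log*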

On Thu, Apr 4, 2013 at 3:58 AM, Ingo Rockel
 wrote:




Hi Evan,

thanks for all the infos! I adjusted the leveldb-config as
suggested,
except
the cache, which I reduced to 16MB, keeping this above the default
helped a
lot at least during load testing. And I added +P 130072 to the
vm.args. Will
be applied to the riak nodes the next hours.

We have monitoring using Zabbix, but haven't included the object
sizes so far; they will be added today.

We double-checked the Linux-Performance-Doc to be sure everything is
applied to the nodes, especially as the problems are always caused by
the same three nodes. But everything looks fine.

Ingo

Am 03.04.2013 18:42, schrieb Evan Vigil-McClanahan:


Another engineer mentions that you posted your eleveldb section
and I
totally missed it:

The eleveldb section:

  %% eLevelDB Config
  {eleveldb, [
  {data_root, "/var/lib/riak/leveldb"},
  {cache_size, 33554432},
  {write_buffer_size_min, 67108864}, %% 64 MB in
bytes
  {write_buffer_size_max, 134217728}, %% 128 MB in
bytes
  {max_open_files, 4000}
 ]},

This is likely going to make you unhappy as time goes on; since all of
those settings are per-vnode, your max memory utilization is well
beyond your physical memory.  I'd remove the tunings for the caches
and buffers and drop max open files to 500, perhaps.  Make sure that
you've followed everything in:


http://docs.basho.com/riak/latest/cookbooks/Linux-Performance-Tuning/,
etc.
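
To put rough numbers on "well beyond your physical memory": assuming the
~86 vnodes per node mentioned earlier in the thread, the posted settings
allow each vnode a 32 MB cache plus up to a 128 MB write buffer, so a
back-of-the-envelope sketch (ignoring leveldb's other allocations and the
4000 open files per vnode) is:

    echo $(( 86 * (32 + 128) ))   # => 13760 MB, roughly 13.4 GB per node
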

On Wed, Apr 3, 2013 at 9:33 AM, Evan Vigil-McClanahan
 wrote:





Again, all of these things are signs of large objects, so if you
could
track the object_size stats on the cluster, I think that we might
see
something.  Even if you have no monitoring, a simple shell script
curling /stats/ on each node once a minute should do the job for
a
day
or two.
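
A minimal sketch of such a script, assuming the default HTTP port 8098
and filtering for the object-size stats:

    while true; do
      curl -s http://127.0.0.1:8098/stats | tr ',' '\n' | grep objsize >> objsize-$(hostname).log
      sleep 60
    done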

On Wed, Apr 3, 2013 at 9:29 AM, Ingo Rockel
 wrote:





We just had it again (around this time of the day we have our
highest
user
activity).

I will set +P to 131072 tomorrow, anything else I should check
or
change?

What about this memory-high-watermark which I get sporadically?

[ANNC] Riak CS 1.3.1

2013-04-04 Thread Jared Morrow
Riak Users,

To keep everyone on their toes with our Riak 1.3.1 release yesterday, today
we have an update to Riak CS in the form of 1.3.1.  Riak CS and Stanchion
have both been updated.  There is no update to Riak CS Control at this time.

The downloads can be found on our docs page:
http://docs.basho.com/riakcs/1.3.1/riakcs-downloads/

The release notes are available here:

   - Riak CS
   https://github.com/basho/riak_cs/blob/release/1.3/RELEASE-NOTES.org
   - Stanchion
   https://github.com/basho/stanchion/blob/release/1.3/RELEASE-NOTES.org


Thanks,
- Everyone at Basho


MapReduce: map then inputs

2013-04-04 Thread Jimmy Ho
Hi guys,

I am sure I've seen something somewhere explaining what I'd like to do but
can no longer find the link; hope someone can help. Thanks.

I have a 'user' bucket which stores a list of keys within the data,
pointing to other users as friends...

ie
For User "mathew"

data: {
"friends": [ "john", "mark", "luke" ]
}


How would I get the keys of mathew's friends' friends via map reduce?

inputs: [["user", "mathew"]],
query: [ {map:... get a list of [[bucket, friend_key]] }

What is the next phase to read the bucket/key values as the new inputs?

Thanks guys,
Regards, Jimmy
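
For what it's worth: in Riak MapReduce, a map phase that is not the last
phase may return [bucket, key] pairs, and those become the inputs of the
following phase. A sketch of the friends-of-friends job over the HTTP
/mapred endpoint (default port assumed, written against the structure
shown above):

    curl -XPOST http://127.0.0.1:8098/mapred \
      -H "Content-Type: application/json" \
      -d '{
        "inputs": [["user", "mathew"]],
        "query": [
          {"map": {"language": "javascript", "source":
            "function(v) { var d = JSON.parse(v.values[0].data); var out = []; for (var i = 0; i < d.friends.length; i++) { out.push([\"user\", d.friends[i]]); } return out; }"}},
          {"map": {"language": "javascript", "source":
            "function(v) { return JSON.parse(v.values[0].data).friends; }"}}
        ]
      }'

The first map emits each friend as a new bucket/key input; the second
reads each friend object and returns its friends list.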


Re: Exploring Riak, need to confirm throughput

2013-04-04 Thread Matthew MacClary
Thanks for the feedback. I made two changes to my test setup and saw better
throughput:

1) Don't write to the same key over and over. Updating a key appears to be
a lot slower than creating a new key

2) I used parallel PUTs

The throughput I was measuring before was about 26MB/s on localhost. With
these changes it went to around 200MB/s on a disk that can write at about
480MB/s. That is more the type of performance I need for the data store we
have in mind. I am going to proceed with testing on 8 nodes with RAID0
drives.

Here are some details of the testing I did if it will help others. I tried
the test with 1MB, 10MB, and 20MB binary data. I didn't notice a big signal
with regard to larger objects slowing things down.

wget
http://downloads.basho.com.s3-website-us-east-1.amazonaws.com/riak/1.2/1.2.1/rhel/5/riak-1.2.1-1.el5.x86_64.rpm

sudo rpm -Uvh riak-1.2.1-1.el5.x86_64.rpm
/usr/sbin/riak start
mkdir data-dir && cd data-dir
seq -w 0 100 | parallel dd if=/dev/zero of={}.10meg bs=8k count=1280
http_proxy=   # don’t contact proxy
time find . -name \*.10meg | parallel -j8 -n1 wget --post-file {}
http://127.0.0.1:8098/riak/test1/{}

During these tests I saw beam.smp jumping to 350-550 while watching %CPU
under top. When I was seeing slower throughput beam.smp was using much less
CPU.

Kind regards,

-Matt

On Wed, Apr 3, 2013 at 7:20 AM, Reid Draper  wrote:

> inline:
>
>
> On Apr 2, 2013, at 6:48 PM, Matthew MacClary <
> maccl...@lifetime.oregonstate.edu> wrote:
>
> Hi all, I am new to this list. Thanks for taking the time to read my
> questions! I just want to know if the data throughput I am seeing is
> expected for the bitcask backend or if it is too low.
>
> I am doing the preliminary feasibility study to decide if we should
> implement a Riak data store. Our application involves rendering chunks of
> data that range in size from about 1MB-9MB or so. This rendering work is
> CPU intensive so it is spread over a bunch of compute nodes which write the
> output into a data store.
>
>
> Riak is not intended to store objects of this size, not at the moment
> anyway. Riak CS [1], on the other hand, can store files up to several TB.
> That being said, Riak CS may or may not have other qualities  you desire.
> It's a known issue [2] that the Riak object size limitations should be
> better documented.
>
>
> After rendering, a second process consumes the data chunks from the data
> store at a rate of about 480MB/s in a streaming configuration so there is >
> 480MB/s of new data coming in at the same time the data is being read.
>
>
> Is this a single-socket, or is there some concurrency here?
>
>
> My testing so far involves a one node cluster on a dev box. What I wanted
> to show is that Riak writes were limited by the hard disk throughput. So
> far I haven't seen writes to localhost come anywhere close to the hard disk
> throughput:
>
> $ MYFILE=/tmp/output.png
> $ dd if=/dev/zero of=$MYFILE bs=8k count=256k
> 262144+0 records in
> 262144+0 records out
> 2147483648 bytes (2.1 GB) copied, 4.48906 seconds, 478 MB/s
> $ rm $MYFILE
>
> So the hard disk throughput is around 478MB/s for this simple write test.
>
> The next test I did was to load a 39MB binary file into my one node
> cluster. I used a script to do 12 POSTs with curl and 12 POSTSs with wget.
>
> curl --tcp-nodelay -XPOST http://${IP}:${PORT}/riak/test/file3 \
> -H "Content-Type:application/octet-stream" \
> --data-binary @${UPLOAD_FILE} \
> --write-out "%{speed_upload}\n"
>
> wget --post-file ${UPLOAD_FILE} http://127.0.0.1:8098/riak/test/file1
>
> What I found was that I could get only about 26MB/s with this command line
> testing. Does this seam about right? Should I see an 18x slow down over the
> write speed of the disk drive?
>
>
> Was this running the 24 (12 * 2) uploads in serial or parallel? With a
> single-threaded workload, you're unlikely to get Riak to be able to
> saturate a disk. Furthermore, there are design decisions in Riak at the
> moment that make it less than optimal for single objects of 39MB.
> Single-object high throughput (measured in MB) is more in the wheelhouse of
> Riak CS than Riak on it's own, which is primarily designed for low-latency
> and high-throughput (measured in ops/sec). One of the ways that Riak CS
> achieves this on top of Riak is by introducing concurrency between the
> end-user and Riak.
>
>
> Thanks for your comments on my application and test approach!
>
>
> Hope this helps,
> Reid
>
> [1] http://docs.basho.com/riakcs/latest/
> [2] https://github.com/basho/basho_docs/issues/256
>
>
>
> -Matt
>
> ---
> Dev Environment Details:
> dev box  running RHEL6.2, 12 cores, 48GB, 6Gb/s SAS 15k HD
> Riak 1.2.1 from
> http://downloads.basho.com.s3-website-us-east-1.amazonaws.com/riak/1.2/1.2.1/rhel/5/riak-1.2.1-1.el5.x86_64.rpm
> n_val=1
> r=1
> w=1
> backend=bitcask
>
> Deploy Environment Details:
>  Node to node bandwidth > 40Gb/s
>  similar config for node se

Re: Exploring Riak, need to confirm throughput

2013-04-04 Thread Shuhao
Just as a side note, you might want to retry the test with PBC. While I 
have only done tests with < 10kb documents, my tests indicate that 
PBC is twice as fast as HTTP in almost all cases.


Shuhao

On 13-04-04 04:14 PM, Matthew MacClary wrote:

Thanks for the feedback. I made two changes to my test setup and saw better
throughput:

1) Don't write to the same key over and over. Updating a key appears to be
a lot slower than creating a new key

2) I used parallel PUTs

The throughput I was measuring before was about 26MB/s on localhost. With
these changes it went to around 200MB/s on a disk that can write at about
480MB/s. That is more the type of performance I need for the data store we
have in mind. I am going to proceed with testing on 8 nodes with RAID0
drives.

Here are some details of the testing I did if it will help others. I tried
the test with 1MB, 10MB, and 20MB binary data. I didn't notice a big signal
with regard to larger objects slowing things down.

wget
http://downloads.basho.com.s3-website-us-east-1.amazonaws.com/riak/1.2/1.2.1/rhel/5/riak-1.2.1-1.el5.x86_64.rpm

sudo rpm -Uvh riak-1.2.1-1.el5.x86_64.rpm
/usr/sbin/riak start
mkdir data-dir && cd data-dir
seq -w 0 100 | parallel dd if=/dev/zero of={}.10meg bs=8k count=1280
http_proxy=   # don’t contact proxy
time find . -name \*.10meg | parallel -j8 -n1 wget --post-file {}
http://127.0.0.1:8098/riak/test1/{}

During these tests I saw beam.smp jumping to 350-550 while watching %CPU
under top. When I was seeing slower throughput beam.smp was using much less
CPU.

Kind regards,

-Matt

On Wed, Apr 3, 2013 at 7:20 AM, Reid Draper  wrote:


inline:


On Apr 2, 2013, at 6:48 PM, Matthew MacClary <
maccl...@lifetime.oregonstate.edu> wrote:

Hi all, I am new to this list. Thanks for taking the time to read my
questions! I just want to know if the data throughput I am seeing is
expected for the bitcask backend or if it is too low.

I am doing the preliminary feasibility study to decide if we should
implement a Riak data store. Our application involves rendering chunks of
data that range in size from about 1MB-9MB or so. This rendering work is
CPU intensive so it is spread over a bunch of compute nodes which write the
output into a data store.


Riak is not intended to store objects of this size, not at the moment
anyway. Riak CS [1], on the other hand, can store files up to several TB.
That being said, Riak CS may or may not have other qualities  you desire.
It's a known issue [2] that the Riak object size limitations should be
better documented.


After rendering, a second process consumes the data chunks from the data
store at a rate of about 480MB/s in a streaming configuration so there is >
480MB/s of new data coming in at the same time the data is being read.


Is this a single-socket, or is there some concurrency here?


My testing so far involves a one node cluster on a dev box. What I wanted
to show is that Riak writes were limited by the hard disk throughput. So
far I haven't seen writes to localhost come anywhere close to the hard disk
throughput:

$ MYFILE=/tmp/output.png
$ dd if=/dev/zero of=$MYFILE bs=8k count=256k
262144+0 records in
262144+0 records out
2147483648 bytes (2.1 GB) copied, 4.48906 seconds, 478 MB/s
$ rm $MYFILE

So the hard disk throughput is around 478MB/s for this simple write test.

The next test I did was to load a 39MB binary file into my one node
cluster. I used a script to do 12 POSTs with curl and 12 POSTSs with wget.

curl --tcp-nodelay -XPOST http://${IP}:${PORT}/riak/test/file3 \
 -H "Content-Type:application/octet-stream" \
 --data-binary @${UPLOAD_FILE} \
 --write-out "%{speed_upload}\n"

wget --post-file ${UPLOAD_FILE} http://127.0.0.1:8098/riak/test/file1

What I found was that I could get only about 26MB/s with this command line
testing. Does this seem about right? Should I see an 18x slowdown over the
write speed of the disk drive?


Was this running the 24 (12 * 2) uploads in serial or parallel? With a
single-threaded workload, you're unlikely to get Riak to be able to
saturate a disk. Furthermore, there are design decisions in Riak at the
moment that make it less than optimal for single objects of 39MB.
Single-object high throughput (measured in MB) is more in the wheelhouse of
Riak CS than Riak on it's own, which is primarily designed for low-latency
and high-throughput (measured in ops/sec). One of the ways that Riak CS
achieves this on top of Riak is by introducing concurrency between the
end-user and Riak.


Thanks for your comments on my application and test approach!


Hope this helps,
Reid

[1] http://docs.basho.com/riakcs/latest/
[2] https://github.com/basho/basho_docs/issues/256



-Matt

---
Dev Environment Details:
dev box  running RHEL6.2, 12 cores, 48GB, 6Gb/s SAS 15k HD
Riak 1.2.1 from
http://downloads.basho.com.s3-website-us-east-1.amazonaws.com/riak/1.2/1.2.1/rhel/5/riak-1.2.1-1.el5.x86_64.rpm
n_val=1
r=1
w=1
backend=bitcask

User quota in riak-cs

2013-04-04 Thread minotaurus
 How can I enforce a quota for each user (tenant) in riak-cs? Thanks.

 
 





Re: [ANNC] Riak 1.3.1

2013-04-04 Thread Dave Brady
Ok, thanks Engel and Jordan! 

-- 
Dave Brady 

- Original Message - 
From: "Jordan West"  
To: "Engel Sanchez"  
Cc: "Dave Brady" , "Riak Users Mailing List" 
 
Sent: Thursday, April 4, 2013 6:06:05 PM GMT +01:00 Amsterdam / Berlin / Bern / 
Rome / Stockholm / Vienna 
Subject: Re: [ANNC] Riak 1.3.1 

Hi Dave, 


Building on what Engel said, the stats change was merged as part of a larger 
squashed commit [1]. 


Jordan 


[1] 
https://github.com/basho/riak_kv/commit/fd2e527378a7fa284605b131c4d02ee5c28d229d
 


On Thu, Apr 4, 2013 at 9:00 AM, Engel Sanchez < en...@basho.com > wrote: 


Hi Dave, 


The stats calculation was fixed in 1.3.1, but the read-repair with 
Last-write-wins=true was not backported. That one will make it to 1.4, which is 
scheduled in the near future. I hope that helps. 


-- 
Engel Sanchez 




On Thu, Apr 4, 2013 at 11:05 AM, Dave Brady < dbr...@weborama.com > wrote: 






Hi Jared, 


I don't see these patches, which I have applied to our installation of 1.3, 
explicitly mentioned in the Release Notes: 


Fix bug where stats endpoints were calculating _all_ riak_kv stats: 
https://github.com/basho/riak_kv/blob/9be3405e53acf680928faa6c70d265e86c75a22c/src/riak_kv_stat_bc.erl
 



Every read triggers a read-repair when Last-write-wins=true 
https://github.com/basho/riak_kv/pull/334 


Can you confirm whether or not they made it into 1.3.1, please? 

-- 
Dave Brady 


- Original Message - 
From: "Jared Morrow" < ja...@basho.com > 
To: "Riak Users Mailing List" < riak-users@lists.basho.com > 
Sent: Wednesday, April 3, 2013 11:53:45 PM GMT +01:00 Amsterdam / Berlin / Bern 
/ Rome / Stockholm / Vienna 
Subject: Re: [ANNC] Riak 1.3.1 

I hesitate to reply to my own email, but just wanted to point out that this 
issue https://github.com/basho/riak_core/pull/281 listed in the release notes 
should help all of you who had issues with slow bitcask startup times in 1.3.0. 
If you see or don't see improvements let us know. 


Thanks, 
Jared 


On Wed, Apr 3, 2013 at 3:16 PM, Jared Morrow < ja...@basho.com > wrote: 



Riak Users, 


We are happy to announce that Riak 1.3.1 is ready for you to download and 
install. It continues on the 1.3.x family with some nice bugfixes. See the 
release notes linked below for all the details. 


Release notes can be found here: 
https://github.com/basho/riak/blob/1.3/RELEASE-NOTES.md 

Downloads available on our docs page: 
http://docs.basho.com/riak/1.3.1/downloads/ 


Thanks as always for being the best community in open source, 
-Everyone at Basho 



Re: Exploring Riak, need to confirm throughput

2013-04-04 Thread Reid Draper

On Apr 4, 2013, at 4:14 PM, Matthew MacClary 
 wrote:

> Thanks for the feedback. I made two changes to my test setup and saw better 
> throughput:
> 
> 1) Don't write to the same key over and over. Updating a key appears to be a 
> lot slower than creating a new key
> 
> 2) I used parallel PUTs
> 
> The throughput I was measuring before was about 26MB/s on localhost. With 
> these changes it went to around 200MB/s on a disk that can write at about 
> 480MB/s. That is more the type of performance I need for the data store we 
> have in mind. I am going to proceed with testing on 8 nodes with RAID0 drives.

How are you measuring throughput? HTTP throughput, or disk throughput with 
something like iostat?

> 
> Here are some details of the testing I did if it will help others. I tried 
> the test with 1MB, 10MB, and 20MB binary data. I didn't notice a big signal 
> with regard to larger objects slowing things down.

The issues with larger objects will likely only present themselves when you 
have more than one node.

> 
> wget 
> http://downloads.basho.com.s3-website-us-east-1.amazonaws.com/riak/1.2/1.2.1/rhel/5/riak-1.2.1-1.el5.x86_64.rpm
> 
> sudo rpm -Uvh riak-1.2.1-1.el5.x86_64.rpm
> /usr/sbin/riak start
> mkdir data-dir && cd data-dir
> seq -w 0 100 | parallel dd if=/dev/zero of={}.10meg bs=8k count=1280
> http_proxy=   # don’t contact proxy
> time find . -name \*.10meg | parallel -j8 -n1 wget --post-file {} 
> http://127.0.0.1:8098/riak/test1/{}
> 
> During these tests I saw beam.smp jumping to 350-550 while watching %CPU 
> under top. When I was seeing slower throughput beam.smp was using much less 
> CPU.
> 
> Kind regards,
> 
> -Matt
> 
> On Wed, Apr 3, 2013 at 7:20 AM, Reid Draper  wrote:
> inline:
> 
> 
> On Apr 2, 2013, at 6:48 PM, Matthew MacClary 
>  wrote:
> 
>> Hi all, I am new to this list. Thanks for taking the time to read my 
>> questions! I just want to know if the data throughput I am seeing is 
>> expected for the bitcask backend or if it is too low.
>> 
>> I am doing the preliminary feasibility study to decide if we should 
>> implement a Riak data store. Our application involves rendering chunks of 
>> data that range in size from about 1MB-9MB or so. This rendering work is CPU 
>> intensive so it is spread over a bunch of compute nodes which write the 
>> output into a data store.
> 
> Riak is not intended to store objects of this size, not at the moment anyway. 
> Riak CS [1], on the other hand, can store files up to several TB. That being 
> said, Riak CS may or may not have other qualities  you desire. It's a known 
> issue [2] that the Riak object size limitations should be better documented.
> 
>> 
>> After rendering, a second process consumes the data chunks from the data 
>> store at a rate of about 480MB/s in a streaming configuration so there is > 
>> 480MB/s of new data coming in at the same time the data is being read.
> 
> Is this a single-socket, or is there some concurrency here?
> 
>> 
>> My testing so far involves a one node cluster on a dev box. What I wanted to 
>> show is that Riak writes were limited by the hard disk throughput. So far I 
>> haven't seen writes to localhost come anywhere close to the hard disk 
>> throughput:
>> 
>> $ MYFILE=/tmp/output.png
>> $ dd if=/dev/zero of=$MYFILE bs=8k count=256k
>> 262144+0 records in
>> 262144+0 records out
>> 2147483648 bytes (2.1 GB) copied, 4.48906 seconds, 478 MB/s
>> $ rm $MYFILE
>> 
>> So the hard disk throughput is around 478MB/s for this simple write test.
>> 
>> The next test I did was to load a 39MB binary file into my one node cluster. 
>> I used a script to do 12 POSTs with curl and 12 POSTSs with wget. 
>> 
>> curl --tcp-nodelay -XPOST http://${IP}:${PORT}/riak/test/file3 \
>> -H "Content-Type:application/octet-stream" \
>> --data-binary @${UPLOAD_FILE} \
>> --write-out "%{speed_upload}\n"
>> 
>> wget --post-file ${UPLOAD_FILE} http://127.0.0.1:8098/riak/test/file1
>> 
>> What I found was that I could get only about 26MB/s with this command line 
>> testing. Does this seem about right? Should I see an 18x slowdown over the 
>> write speed of the disk drive?
> 
> Was this running the 24 (12 * 2) uploads in serial or parallel? With a 
> single-threaded workload, you're unlikely to get Riak to be able to saturate 
> a disk. Furthermore, there are design decisions in Riak at the moment that 
> make it less than optimal for single objects of 39MB. Single-object high 
> throughput (measured in MB) is more in the wheelhouse of Riak CS than Riak on 
> it's own, which is primarily designed for low-latency and high-throughput 
> (measured in ops/sec). One of the ways that Riak CS achieves this on top of 
> Riak is by introducing concurrency between the end-user and Riak.
> 
>> 
>> Thanks for your comments on my application and test approach!
> 
> Hope this helps,
> Reid
> 
> [1] http://docs.basho.com/riakcs/latest/
> [2] https://github.com/basho/basho_docs/issues/256
> 
> 
>> 
>> -Matt
>> 
>> -

Re: error running Erlang m/r job

2013-04-04 Thread Tom Zeng
Thanks Evan, that worked.


On Thu, Apr 4, 2013 at 11:14 AM, Evan Vigil-McClanahan <
emcclana...@basho.com> wrote:

> As of 1.3 the old client:mapreduce is deprecated; please use
> `riak_kv_mrc_pipe:mapred` instead.
>
> On Thu, Apr 4, 2013 at 9:07 AM, Tom Zeng  wrote:
> > Hi everyone,
> >
> > I am trying to run the Erlang m/r following the Riak Handbook, and got
> the
> > following error:
> >
> > (riak@127.0.0.1)4> ExtractTweet = fun(RObject, _, _) ->
> > (riak@127.0.0.1)4>  {struct, Obj} = mochijson2:decode(
> > (riak@127.0.0.1)4>riak_object:get_value(RObject)),
> > (riak@127.0.0.1)4>  [proplists:get_value(<<"tweet">>, Obj)]
> > (riak@127.0.0.1)4> end.
> > #Fun
> > (riak@127.0.0.1)5> C:mapred([{<<"tweets">>, <<"41399579391950848">>}],
> > (riak@127.0.0.1)5>   [{map, {qfun, ExtractTweet}, none, true}]).
> > ** exception error: undefined function riak_client:mapred/3
> >
> > I've been using JavaScript for m/r and just started using Erlang per
> Basho
> > engineers' recommendation at Riak DC meetup. Any help/pointers
> appreciated.
> >
> > Thanks
> > Tom
> > --
> > Tom Zeng
> > Director of Engineering
> > Intridea, Inc. | www.intridea.com
> > t...@intridea.com
> >
> >


Re: User quota in riak-cs

2013-04-04 Thread Reid Draper

On Apr 4, 2013, at 4:52 PM, minotaurus  wrote:

> How can I enforce a quota for each user (tenant) in riak-cs? Thanks.

Riak CS does not currently support quotas of any sort. You can _observe_ the 
resources (i/o, storage) a user is using, but not limit it. This is something 
we may implement in the future.

Reid




Re: Exploring Riak, need to confirm throughput

2013-04-04 Thread Matthew MacClary
I am measuring throughput by the wall clock time needed to move a few gigs
of data into Riak. I have glanced at iostat, but I was not collecting data
from that tool at this point.
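
To get the disk-side number, something like sysstat's iostat can run
alongside the wall-clock timing; a sketch (extended stats, megabytes,
5-second intervals):

    iostat -xm 5 > iostat-during-test.log &
    time find . -name '*.10meg' | parallel -j8 -n1 wget --post-file {} http://127.0.0.1:8098/riak/test1/{}
    kill %1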

-Matt


On Thu, Apr 4, 2013 at 2:45 PM, Reid Draper  wrote:

>
> On Apr 4, 2013, at 4:14 PM, Matthew MacClary <
> maccl...@lifetime.oregonstate.edu> wrote:
>
> Thanks for the feedback. I made two changes to my test setup and saw
> better throughput:
>
> 1) Don't write to the same key over and over. Updating a key appears to be
> a lot slower than creating a new key
>
> 2) I used parallel PUTs
>
> The throughput I was measuring before was about 26MB/s on localhost. With
> these changes it went to around 200MB/s on a disk that can write at about
> 480MB/s. That is more the type of performance I need for the data store we
> have in mind. I am going to proceed with testing on 8 nodes with RAID0
> drives.
>
>
> How are you measuring throughput? HTTP throughput, or disk throughput with
> something like iostat?
>
>
> Here are some details of the testing I did if it will help others. I tried
> the test with 1MB, 10MB, and 20MB binary data. I didn't notice a big signal
> with regard to larger objects slowing things down.
>
>
> The issues with larger objects will likely only present themselves when
> you have more than one node.
>
>
> wget
> http://downloads.basho.com.s3-website-us-east-1.amazonaws.com/riak/1.2/1.2.1/rhel/5/riak-1.2.1-1.el5.x86_64.rpm
>
> sudo rpm -Uvh riak-1.2.1-1.el5.x86_64.rpm
> /usr/sbin/riak start
> mkdir data-dir && cd data-dir
> seq -w 0 100 | parallel dd if=/dev/zero of={}.10meg bs=8k count=1280
> http_proxy=   # don’t contact proxy
> time find . -name \*.10meg | parallel -j8 -n1 wget --post-file {}
> http://127.0.0.1:8098/riak/test1/{}
>
> During these tests I saw beam.smp jumping to 350-550 while watching %CPU
> under top. When I was seeing slower throughput beam.smp was using much less
> CPU.
>
> Kind regards,
>
> -Matt
>
> On Wed, Apr 3, 2013 at 7:20 AM, Reid Draper  wrote:
>
>> inline:
>>
>>
>> On Apr 2, 2013, at 6:48 PM, Matthew MacClary <
>> maccl...@lifetime.oregonstate.edu> wrote:
>>
>> Hi all, I am new to this list. Thanks for taking the time to read my
>> questions! I just want to know if the data throughput I am seeing is
>> expected for the bitcask backend or if it is too low.
>>
>>  I am doing the preliminary feasibility study to decide if we should
>> implement a Riak data store. Our application involves rendering chunks of
>> data that range in size from about 1MB-9MB or so. This rendering work is
>> CPU intensive so it is spread over a bunch of compute nodes which write the
>> output into a data store.
>>
>>
>> Riak is not intended to store objects of this size, not at the moment
>> anyway. Riak CS [1], on the other hand, can store files up to several TB.
>> That being said, Riak CS may or may not have other qualities  you desire.
>> It's a known issue [2] that the Riak object size limitations should be
>> better documented.
>>
>>
>> After rendering, a second process consumes the data chunks from the data
>> store at a rate of about 480MB/s in a streaming configuration so there is >
>> 480MB/s of new data coming in at the same time the data is being read.
>>
>>
>> Is this a single-socket, or is there some concurrency here?
>>
>>
>> My testing so far involves a one node cluster on a dev box. What I wanted
>> to show is that Riak writes were limited by the hard disk throughput. So
>> far I haven't seen writes to localhost come anywhere close to the hard disk
>> throughput:
>>
>> $ MYFILE=/tmp/output.png
>> $ dd if=/dev/zero of=$MYFILE bs=8k count=256k
>> 262144+0 records in
>> 262144+0 records out
>> 2147483648 bytes (2.1 GB) copied, 4.48906 seconds, 478 MB/s
>> $ rm $MYFILE
>>
>> So the hard disk throughput is around 478MB/s for this simple write test.
>>
>> The next test I did was to load a 39MB binary file into my one node
>> cluster. I used a script to do 12 POSTs with curl and 12 POSTSs with wget.
>>
>> curl --tcp-nodelay -XPOST http://${IP}:${PORT}/riak/test/file3 \
>> -H "Content-Type:application/octet-stream" \
>> --data-binary @${UPLOAD_FILE} \
>> --write-out "%{speed_upload}\n"
>>
>> wget --post-file ${UPLOAD_FILE} http://127.0.0.1:8098/riak/test/file1
>>
>> What I found was that I could get only about 26MB/s with this command
>> line testing. Does this seem about right? Should I see an 18x slowdown
>> over the write speed of the disk drive?
>>
>>
>> Was this running the 24 (12 * 2) uploads in serial or parallel? With a
>> single-threaded workload, you're unlikely to get Riak to be able to
>> saturate a disk. Furthermore, there are design decisions in Riak at the
>> moment that make it less than optimal for single objects of 39MB.
>> Single-object high throughput (measured in MB) is more in the wheelhouse of
>> Riak CS than Riak on it's own, which is primarily designed for low-latency
>> and high-throughput (measured in ops/sec). One of the ways that Riak CS
>> achieves this on top of Riak is by introducing concurrency between the
>> end-user and Riak.

Re: Exploring Riak, need to confirm throughput

2013-04-04 Thread Matthew MacClary
PBC is certainly something I have on my list of things to explore.
Conceptually I am not sure if the speed gains from this protocol will be
apparent with large binary payloads. I thought that main speed gains were
from 1) more compact binary representation and 2) lower interpretation
overhead. In my situation I already have a largish binary payload that does
not need to be parsed. I could be wrong and may find that out as I explore
this further.

-Matt


On Thu, Apr 4, 2013 at 1:45 PM, Shuhao  wrote:

> Just as a side note, you might want to retry the test with PBC. While I
> have only done tests with < 10kb documents, my tests indicate that PBC
> is twice as fast as HTTP in almost all cases.
>
> Shuhao
>
>
> On 13-04-04 04:14 PM, Matthew MacClary wrote:
>
>> Thanks for the feedback. I made two changes to my test setup and saw
>> better
>> throughput:
>>
>> 1) Don't write to the same key over and over. Updating a key appears to be
>> a lot slower than creating a new key
>>
>> 2) I used parallel PUTs
>>
>> The throughput I was measuring before was about 26MB/s on localhost. With
>> these changes it went to around 200MB/s on a disk that can write at about
>> 480MB/s. That is more the type of performance I need for the data store we
>> have in mind. I am going to proceed with testing on 8 nodes with RAID0
>> drives.
>>
>> Here are some details of the testing I did if it will help others. I tried
>> the test with 1MB, 10MB, and 20MB binary data. I didn't notice a big
>> signal
>> with regard to larger objects slowing things down.
>>
>> wget
>> http://downloads.basho.com.s3-website-us-east-1.amazonaws.com/riak/1.2/1.2.1/rhel/5/riak-1.2.1-1.el5.x86_64.rpm
>>
>> sudo rpm -Uvh riak-1.2.1-1.el5.x86_64.rpm
>> /usr/sbin/riak start
>> mkdir data-dir && cd data-dir
>> seq -w 0 100 | parallel dd if=/dev/zero of={}.10meg bs=8k count=1280
>> http_proxy=   # don’t contact proxy
>> time find . -name \*.10meg | parallel -j8 -n1 wget --post-file {}
>> http://127.0.0.1:8098/riak/test1/{}
>>
>> During these tests I saw beam.smp jumping to 350-550 while watching %CPU
>> under top. When I was seeing slower throughput beam.smp was using much less
>> CPU.
>>
>> Kind regards,
>>
>> -Matt
>>
>> On Wed, Apr 3, 2013 at 7:20 AM, Reid Draper  wrote:
>>
>>  inline:
>>>
>>>
>>> On Apr 2, 2013, at 6:48 PM, Matthew MacClary <
>>> macclary@lifetime.oregonstate.edu >
>>> wrote:
>>>
>>> Hi all, I am new to this list. Thanks for taking the time to read my
>>> questions! I just want to know if the data throughput I am seeing is
>>> expected for the bitcask backend or if it is too low.
>>>
>>> I am doing the preliminary feasibility study to decide if we should
>>> implement a Riak data store. Our application involves rendering chunks of
>>> data that range in size from about 1MB-9MB or so. This rendering work is
>>> CPU intensive so it is spread over a bunch of compute nodes which write
>>> the
>>> output into a data store.
>>>
>>>
>>> Riak is not intended to store objects of this size, not at the moment
>>> anyway. Riak CS [1], on the other hand, can store files up to several TB.
>>> That being said, Riak CS may or may not have other qualities  you desire.
>>> It's a known issue [2] that the Riak object size limitations should be
>>> better documented.
>>>
>>>
>>> After rendering, a second process consumes the data chunks from the data
>>> store at a rate of about 480MB/s in a streaming configuration so there
>>> is >
>>> 480MB/s of new data coming in at the same time the data is being read.
>>>
>>>
>>> Is this a single-socket, or is there some concurrency here?
>>>
>>>
>>> My testing so far involves a one node cluster on a dev box. What I wanted
>>> to show is that Riak writes were limited by the hard disk throughput. So
>>> far I haven't seen writes to localhost come anywhere close to the hard
>>> disk
>>> throughput:
>>>
>>> $ MYFILE=/tmp/output.png
>>> $ dd if=/dev/zero of=$MYFILE bs=8k count=256k
>>> 262144+0 records in
>>> 262144+0 records out
>>> 2147483648 bytes (2.1 GB) copied, 4.48906 seconds, 478 MB/s
>>> $ rm $MYFILE
>>>
>>> So the hard disk throughput is around 478MB/s for this simple write test.
>>>
>>> The next test I did was to load a 39MB binary file into my one node
>>> cluster. I used a script to do 12 POSTs with curl and 12 POSTSs with
>>> wget.
>>>
>>> curl --tcp-nodelay -XPOST http://${IP}:${PORT}/riak/test/file3 \
>>>  -H "Content-Type:application/octet-stream" \
>>>  --data-binary @${UPLOAD_FILE} \
>>>  --write-out "%{speed_upload}\n"
>>>
>>> wget --post-file ${UPLOAD_FILE} http://127.0.0.1:8098/riak/test/file1
>>>
>>> What I found was that I could get only about 26MB/s with this command
>>> line
>>> testing. Does this seem about right? Should I see an 18x slowdown over
>>> the write speed of the disk drive?

Re: Do nodes always need to restart after backend selection?

2013-04-04 Thread Toby Corkindale

Hi Jared,

I'm afraid I am still a little confused after reading your reply, so I'd 
like to check something.


If I understand correctly, the reboot of nodes is only required if the 
default settings in app.config are changed, and one can change anything 
else on-the-fly?


So therefore, in the following scenario, I could issue these commands 
and never need to reboot any nodes?


Riak backend = Multi, with Bitcask (default) and Leveldb.

PUT /buckets/myBucket/myKey
# Key is stored in Bitcask

PUT /buckets/myBucket/props
{ backend: Leveldb }

PUT /buckets/myBucket/myOtherKey
# Key is stored in Leveldb backend


If I change the backend, do I lose any keys that were previously 
available in the original backend or are they migrated? (I'd expect to 
lose them)
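
Concretely, that scenario is just two HTTP calls; a sketch against a local
node, with "eleveldb_mult" standing in for whatever name the leveldb
backend has in the multi_backend config:

    curl -XPUT http://127.0.0.1:8098/buckets/myBucket/props \
      -H "Content-Type: application/json" \
      -d '{"props": {"backend": "eleveldb_mult"}}'

    curl -XPUT http://127.0.0.1:8098/buckets/myBucket/keys/myOtherKey \
      -H "Content-Type: text/plain" -d 'stored in leveldb'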



Thanks for your patience,
Toby



On 05/04/13 00:51, Jared Morrow wrote:

Toby,

That particular page is talking about changing the default settings of
the backend of a bucket.  In that specific case, if you want to change
the default behavior in your app.config file a restart is necessary.
  One particularly important detail there is you don't need to restart
*all* nodes at the same time.  Restarting one node at a time is
sufficient and recommended so you don't have any cluster downtime.

For setting common bucket properties, you do not need to restart the
node.  If you want to change the n_val of a bucket for instance, you can
just change it from your client on all nodes.  That page explains at the
bottom how to set them on the erlang console or curl, but most people
use their chosen client to set bucket properties before writing values.
   Here is an example using the Java Client
http://docs.basho.com/java/latest/cookbooks/buckets/.  In general it
doesn't matter if your client supports HTTP or protocol buffers, both
APIs support bucket
property changes.

Hope that helps,
Jared

On Wed, Apr 3, 2013 at 10:14 PM, Toby Corkindale
mailto:toby.corkind...@strategicdata.com.au>> wrote:

Hi,
According to the docs at the following URL, it is necessary to
reboot all Riak nodes after setting the bucket property for backend.
This seems really drastic, and we'd like to avoid having to do this!
See:
http://docs.basho.com/riak/1.3.0/tutorials/choosing-a-backend/Multi/


I wondered if the restart of the whole cluster can be avoided?
Perhaps we could set the bucket properties prior to setting any keys
within it?

Thanks in advance,
Toby



Re: Can Bitcask expiry_secs vary between backends?

2013-04-04 Thread Toby Corkindale

On 04/04/13 17:43, Toby Corkindale wrote:

Hi,
Can we set Bitcask's expiry_secs value to be different per backend, in a
Multi-backend scenario?

Eg.

{multi_backend, [
 {<<"bitcask_short_ttl">>,  riak_kv_bitcask_backend, [
 {expiry_secs, 3600},   %% Expire items after one hour
 {expiry_grace_time, 600}
 ]},
 {<<"bitcask_long_ttl">>,  riak_kv_bitcask_backend, [
 {expiry_secs, 86400},   %% Expire items after one day
 {expiry_grace_time, 3600}
 ]},
 {<<"eleveldb_mult">>, riak_kv_eleveldb_backend, [
 ]}
]},


And if we're re-using bitcask in this way, do we need to specify
anything else, such as different data directories per backend?



Answering my own question here, but..
Experimentation seems to indicate that
a) You can set different expiry periods per bitcask backend.
b) You must set a unique data_dir for each backend, or else
everything crashes after a while.

-Toby




Re: Can Bitcask expiry_secs vary between backends?

2013-04-04 Thread Tom Santero
Hey Toby,

Your conclusions are correct.

Tom

On Thu, Apr 4, 2013 at 10:39 PM, Toby Corkindale <
toby.corkind...@strategicdata.com.au> wrote:

> On 04/04/13 17:43, Toby Corkindale wrote:
>
>> Hi,
>> Can we set Bitcask's expiry_secs value to be different per backend, in a
>> Multi-backend scenario?
>>
>> Eg.
>>
>> {multi_backend, [
>>  {<<"bitcask_short_ttl">>,  riak_kv_bitcask_backend, [
>>  {expiry_secs, 3600},   %% Expire items after one hour
>>  {expiry_grace_time, 600}
>>  ]},
>>  {<<"bitcask_long_ttl">>,  riak_kv_bitcask_backend, [
>>  {expiry_secs, 86400},   %% Expire items after one day
>>  {expiry_grace_time, 3600}
>>  ]},
>>  {<<"eleveldb_mult">>, riak_kv_eleveldb_backend, [
>>  ]}
>> ]},
>>
>>
>> And if we're re-using bitcask in this way, do we need to specify
>> anything else, such as different data directories per backend?
>>
>
>
> Answering my own question here, but..
> Experimentation seems to indicate that
> a) You can set different expiry periods per bitcask backend.
> b) You must set a unique data_dir for each backend, or else
> everything crashes after a while.
>
> -Toby
>
>
>


Re: Do nodes always need to restart after backend selection?

2013-04-04 Thread Toby Corkindale

Answering my own question again, but hopefully that saves you time.

So, it appears that if a backend is changed via the JSON REST API, then 
all keys from the previous backend become inaccessible. I think this 
also indicates that the new backend is in use immediately, without 
any restarts required.
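
A quick way to see both effects from the command line (a sketch; myKey was
written before the props change):

    curl -i http://127.0.0.1:8098/buckets/myBucket/keys/myKey
    # expect a 404 for keys written under the old backend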


May I suggest that the wording on the API Reference page be improved? 
Both I and a colleague misunderstood it to mean that *any* change of 
backend required a restart.


Cheers,
Toby

On 05/04/13 11:27, Toby Corkindale wrote:

Hi Jared,

I'm afraid I am still a little confused after reading your reply, so I'd
like to check something.

If I understand correctly, the reboot of nodes is only required if the
default settings in app.config are changed, and one can change anything
else on-the-fly?

So therefore, in the following scenario, I could issue these commands
and never need to reboot any nodes?

Riak backend = Multi, with Bitcask (default) and Leveldb.

PUT /buckets/myBucket/myKey
# Key is stored in Bitcask

PUT /buckets/myBucket/props
{ backend: Leveldb }

PUT /buckets/myBucket/myOtherKey
# Key is stored in Leveldb backend


If I change the backend, do I lose any keys that were previously
available in the original backend or are they migrated? (I'd expect to
lose them)


Thanks for your patience,
Toby



On 05/04/13 00:51, Jared Morrow wrote:

Toby,

That particular page is talking about changing the default settings of
the backend of a bucket.  In that specific case, if you want to change
the default behavior in your app.config file a restart is necessary.
  One particularly important detail there is you don't need to restart
*all* nodes at the same time.  Restarting one node at a time is
sufficient and recommended so you don't have any cluster downtime.

For setting common bucket properties, you do not need to restart the
node.  If you want to change the n_val of a bucket for instance, you can
just change it from your client on all nodes.  That page explains at the
bottom how to set them on the erlang console or curl, but most people
use their chosen client to set bucket properties before writing values.
   Here is an example using the Java Client
http://docs.basho.com/java/latest/cookbooks/buckets/.  In general it
doesn't matter if your client supports HTTP or protocol buffers, both
APIs support bucket
property changes.

Hope that helps,
Jared

On Wed, Apr 3, 2013 at 10:14 PM, Toby Corkindale
mailto:toby.corkind...@strategicdata.com.au>> wrote:

Hi,
According to the docs at the following URL, it is necessary to
reboot all Riak nodes after setting the bucket property for backend.
This seems really drastic, and we'd like to avoid having to do this!
See:

http://docs.basho.com/riak/1.3.0/tutorials/choosing-a-backend/Multi/



I wondered if the restart of the whole cluster can be avoided?
Perhaps we could set the bucket properties prior to setting any keys
within it?





Re: User quota in riak-cs

2013-04-04 Thread Kota Uenishi
One possible solution to emulate quotas is to observe the space
usage of each user periodically; when it exceeds your limit, you
can "disable" the user by calling the admin API:
http://docs.basho.com/riakcs/latest/cookbooks/Account-Management/#Enabling-and-Disabling-a-User-Account
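
The rough shape of that loop, as a sketch only: get_usage_mb and
disable_user below are hypothetical helpers standing in for admin-signed
calls to the storage-statistics and user-management APIs described in the
docs linked above.

    #!/bin/sh
    LIMIT_MB=10240                          # your quota
    for user in $(cat users.txt); do
      used=$(get_usage_mb "$user")          # hypothetical: query storage stats
      if [ "$used" -gt "$LIMIT_MB" ]; then
        disable_user "$user"                # hypothetical: set the account disabled
      fi
    done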

On Fri, Apr 5, 2013 at 6:46 AM, Reid Draper  wrote:
>
> On Apr 4, 2013, at 4:52 PM, minotaurus  wrote:
>
>> How can I enforce a quota for each user (tenant) in riak-cs? Thanks.
>
> Riak CS does not currently support quotas of any sort. You can _observe_ the 
> resources (i/o, storage) a user is using, but not limit it. This is something 
> we may implement in the future.
>
> Reid
>
>



-- 
Kota UENISHI / @kuenishi
Basho Japan KK



Re: User quota in riak-cs

2013-04-04 Thread minotaurus
 Disabling users completely seems a little too much. Is it possible to
only deny the user writing new files, while still letting him read and
delete files from his account?






Re: User quota in riak-cs

2013-04-04 Thread Kota Uenishi
As Reid mentioned, that is something we may implement in the future...

On Fri, Apr 5, 2013 at 1:39 PM, minotaurus  wrote:
>  Disabling users completely seems a little too much. Is it possible to
> only deny the user writing new files, while still letting him read and
> delete files from his account?
>
>
>
>



-- 
Kota UENISHI / @kuenishi
Basho Japan KK
