Uruka,

Well, at least now your numbers make sense.  I wasn't trying to be snide in
my response, so sorry if I came across that way.  I was attempting, poorly,
to say that the Memory and Bitcask results shouldn't be close to each other,
so you had other issues.

Now that you have apparently eliminated your network & load bottlenecks,
you can take a look at
http://docs.basho.com/riak/latest/cookbooks/Linux-Performance-Tuning/
paying particular attention to the mounts, scheduler, and swappiness settings.
Just today someone at Basho mentioned that they changed their scheduler
from CFQ to deadline and saw their load average go from 30.00 to 4.00.
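
To see which scheduler and swappiness value a node is currently using before changing anything, something like the following rough sketch might help. It assumes a Linux box exposing the usual /sys/block/<device>/queue/scheduler and /proc/sys/vm/swappiness files; the function names are made up for illustration and are not part of Riak or basho_bench.

```python
from pathlib import Path


def active_scheduler(scheduler_text):
    """Return the scheduler shown in [brackets], e.g. 'noop deadline [cfq]' -> 'cfq'."""
    for token in scheduler_text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    return None  # no bracketed entry found in the text


def report(sys_block=Path("/sys/block"), swappiness=Path("/proc/sys/vm/swappiness")):
    """Print the active I/O scheduler per block device and the current vm.swappiness."""
    if sys_block.is_dir():
        for device in sorted(sys_block.iterdir()):
            sched_file = device / "queue" / "scheduler"
            if sched_file.exists():
                print(device.name, active_scheduler(sched_file.read_text()))
    if swappiness.exists():
        print("vm.swappiness =", swappiness.read_text().strip())
```

Calling report() on each node prints one line per block device plus the swappiness value, which makes it easy to confirm that every machine in the cluster is tuned the same way.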

After the tuning mentioned above, Riak will perform better as disk
performance improves.  I benchmarked on VMs, but those VMs were on SmartOS
running ZFS with a pretty fast raid-z setup.

-Jared

On Tue, Nov 6, 2012 at 3:37 PM, Uruka Dark <urukad...@gmail.com> wrote:

> First of all, thank you for your reply.
>
> Well, if you tell me that I can beat myself up trying to get what another
> person gets in a benchmark, then I don't understand the point of posting
> your results here. I thought that you were trying to tell me that in a
> similar setup, you could do much better, and that I probably had some
> problem on my setup (and you were right). Considering how different our
> results are, I imagine that there is something wrong. By "wrong" I mean: I
> have a setup problem; my hardware is not up to the task; our environments
> are not that similar; etc. That's what I'm trying to figure out, and I'm
> also trying to get to know the tool.
>
> I have a few different scenarios to face: heavy write, heavy read, both of
> them, etc. Now, I'm considering the heavy write scenario and later I'll
> deal with others. If I jump from one scenario to another without at least a
> minimum solid conclusion, it will not help me.
>
> Based on what you said, I replaced the machine that was producing the load
> by a new one identical to the cluster machines (intel core i3 2.3GHz, 4GB
> RAM 1TB HD). Now I have 3 machines with the same setup in a gigabit network.
>
> I started using bitcask backend:
> https://dl.dropbox.com/u/308392/sum_bit.png
>
> Then I tried memory backend:
> https://dl.dropbox.com/u/308392/sum_mem.png
>
> Now we can see a major impact on the results. The memory backend could do
> much better with roughly 4000 ops/sec, but bitcask not so much: about 200
> ops/sec more than the last result. Replacing the third machine really worked.
>
> Just to see what would happen, I decided to try the benchmark locally (of
> course, I knew that it would put a heavy load on the CPU). I ran the test
> from one of the cluster machines.
> Results:
>
> Bitcask:
> https://dl.dropbox.com/u/308392/sum_bit_local.png
>
> Memory:
> https://dl.dropbox.com/u/308392/sum_mem_local.png
>
> Well, it seems that my bottleneck is related to my disks.
> Once again, thank you.
> You helped me a lot.
>
>
> On Mon, Nov 5, 2012 at 4:36 PM, Jared Morrow <ja...@basho.com> wrote:
>
>> So if you are getting close to similar numbers with the memory and
>> bitcask backends, you know that file IO isn't your bottleneck in the load
>> test. My guess is that the machine you are running the load with (where you
>> are running basho_bench) can't keep up with the test. I'm running
>> basho_bench on a core i7 on the same switch as the riak nodes, so that
>> could be enough of the difference. Again though, you can beat yourself up
>> trying to just get what another person gets in a benchmark, so it really
>> comes down to what you want out of it. What are your app's requirements?
>> What do you expect to be the requirements a year from now? What do your
>> access patterns look like, more read heavy or more write heavy?
>>
>> Right now, you are only doing bulk load experiments, what about
>> Write/Read/Update that simulate your expected usage pattern? What we have
>> done in the past as a good test is to do a bulk load like you are doing
>> with say 10 or 100 million keys. That'll make sure Riak is loaded with a
>> real world amount of data to start. Then run another pareto distribution
>> run (using something like {key_generator, {int_to_bin, {pareto_int,
>> 10000000}}}.) with a spread of puts/gets/updates like {operations,
>> [{get, 4},{put, 1},{update, 1}]}. I'll tell you right now, with a two
>> node cluster you probably aren't going to be happy with the results as that
>> is not how Riak was designed to work at its best. If you can swing it, add
>> two more nodes to the test, set N=2 instead of N=1 so you are getting at
>> least some data safety and run the above. If it meets the need of your app,
>> awesome. If it doesn't, come back and discuss it or try another solution
>> with the same requirements.
>>
>> -Jared
>>
>>
>> <snip'd the history to pass through the mailing-list size requirement>
>
>
>
_______________________________________________
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
