Yes, our clients didn't specify the port so they are using 9042 by default.
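
For reference, a rough sketch of the cassandra.yaml settings touched on in this thread. The option names are the stock 2.1 names as far as I know, and the values are the shipped defaults rather than recommendations, so check them against the cassandra.yaml of the exact version you run:

```yaml
# Native protocol (CQL) listener, the one the DataStax Java driver connects to.
start_native_transport: true
native_transport_port: 9042

# Thrift listener, only needed by legacy Thrift clients.
rpc_port: 9160

# Left empty, the key cache is sized automatically: min(5% of heap, 100 MB).
key_cache_size_in_mb:

# 2.1 memtable placement: heap_buffers (default), offheap_buffers or offheap_objects.
memtable_allocation_type: heap_buffers
```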

On Thu, Jun 25, 2015 at 9:23 AM, Alain RODRIGUEZ <arodr...@gmail.com> wrote:

> Hi Zhiyan,
>
> 2 - RF 2 will improve overall performance, but it will not change the 2.0.*
> vs 2.1.* comparison. The same goes for adding 3 nodes. Yet Cassandra is
> supposed to be linearly scalable, so...
> 3 - I guess this was the first thing to do. You did not answer about
> heap size. One of the main differences between 2.0 and 2.1 is that memtables
> can now be stored off heap. So if you set a big heap with a high memtable
> size, you will leave less space for page caching on 2.1. You should go with
> the defaults and modify things incrementally to reach an objective (latency /
> throughput / percentiles /...).
>
> About Thrift vs native protocol: Thrift is being deprecated over time.
> You should stick with native, and since I think the DataStax driver only
> supports CQL / the native protocol, you should be good to go. Basically, do
> your clients use port 9042 (the default)?
>
> C*heers,
>
> Alain
>
>
> 2015-06-25 17:36 GMT+02:00 Zhiyan Shao <zhiyan.s...@gmail.com>:
>
>> Thanks Alain,
>>
>> For 2, we tried CL ONE but the improvement is small. We will try RF 2 and
>> see. Maybe adding 3 more boxes will help too.
>> For 3, we changed the key cache back to the default (100MB) and it helped
>> improve performance, but it is still worse than 2.0.14. We also noticed that
>> the hit rate grew more slowly than on 2.0.14.
>> For 4, we are querying 1 partition key each time. There are 5 rows on
>> average for each partition key.
>>
>> We are using the DataStax Java driver, so I guess it is the native
>> protocol. We will try out 2.1.7 too.
>>
>> Thanks,
>> Zhiyan
>>
>> On Wed, Jun 24, 2015 at 11:48 PM, Alain RODRIGUEZ <arodr...@gmail.com>
>> wrote:
>>
>>> I am amazed to see that you don't have OOM with this setup...
>>>
>>> 1 - For performance, and given Cassandra's replication properties and I/O
>>> usage, you might want to try RAID 0. But I imagine this is a tradeoff.
>>>
>>> 2 - A billion rows is quite a lot, and each of your nodes takes the full
>>> load. You might want to try RF 2 and CL ONE if performance is what you are
>>> looking for.
>>>
>>> 3 - Using 50 GB of key cache is something I have never seen and can't be
>>> good, since afaik the key cache is on heap and you don't really want a heap
>>> bigger than 8 GB (or 10/12 GB in some cases). Try with the default heap size
>>> and key cache.
>>>
>>> 4 - Are you querying the whole set at once? You might want to query rows
>>> one by one, maybe synchronously, to get back pressure.
>>>
>>> Another question would be: did you use the native protocol or Thrift?
>>> (http://www.datastax.com/dev/blog/cassandra-2-1-now-over-50-faster)
>>>
>>> BTW, interesting benchmark, but having the right conf is important.
>>> Also, you might want to go to 2.1.7, which mainly fixes a memory leak afaik.
>>>
>>> C*heers,
>>>
>>> Alain
>>> On 25 June 2015 at 01:23, "Zhiyan Shao" <zhiyan.s...@gmail.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> We recently tested read performance on 2.0.14 and 2.1.6 and found that
>>>> reads are slower in 2.1.6. Here is our setup:
>>>>
>>>> 1. Machines: 3 physical hosts. Each node has a 24-core CPU, 256 GB of
>>>> memory and 8x600GB SAS disks in RAID 1.
>>>> 2. Replication factor is 3 and a billion rows of data were inserted.
>>>> 3. Key cache capacity was increased to 50 GB on each node.
>>>> 4. We keep querying the same set of a million partition keys in a loop.
>>>>
>>>> Result:
>>>> For 2.0.14, we get an average of 6 ms, while for 2.1.6 we only get
>>>> 18 ms.
>>>>
>>>> The key cache hit rate of 0.011 seems pretty low even though the same
>>>> set of keys was used. Has anybody done similar read performance testing?
>>>> Could you share your results?
>>>>
>>>> Thanks,
>>>> Zhiyan
>>>>
>>>
>>
>
