Thanks Andrey and Tyler! That was useful :)

Do you have any idea why the 10 MB writes took so long in my case, even
though I'm using Large VMs with plenty of resources? Or is this latency
expected?
I'm trying to see how much of the time is spent in the network versus in CPU
processing on the nodes. Any suggestions for a good profiling tool?
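
For reference, Tyler's suggestion (quoted below) of breaking a large value
into multiple columns could be sketched roughly as follows. This is only a
minimal sketch: the helper names (split_into_columns, join_columns, timed),
the "chunk_NNNNNN" column naming, and the 1 MB chunk size are all
illustrative assumptions, not anything pycassa or Cassandra provides. It
uses Python 3's time.perf_counter for the stopwatch.

```python
import time

CHUNK_SIZE = 1024 * 1024  # 1 MB per column; an illustrative value, not a tuned one


def split_into_columns(data, chunk_size=CHUNK_SIZE):
    """Split a bytes payload into a dict of column name -> chunk (hypothetical helper)."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    columns = {}
    for index, offset in enumerate(range(0, len(data), chunk_size)):
        # Zero-padded names keep the chunks in order when sorted as text
        columns["chunk_%06d" % index] = data[offset:offset + chunk_size]
    return columns


def join_columns(columns):
    """Reassemble the payload by concatenating chunks in sorted name order."""
    return b"".join(columns[name] for name in sorted(columns))


def timed(func, *args, **kwargs):
    """Call func and return (result, elapsed_seconds) as a simple stopwatch."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    return result, time.perf_counter() - start
```

Each chunk could then be written with its own insert call (for example one
USER.insert per column, wrapped in timed), so that no single Thrift message
carries the whole 10 MB. Whether that actually lowers the end-to-end latency
is something I would have to measure.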



On Thu, Jul 18, 2013 at 5:50 PM, Tyler Hobbs <ty...@datastax.com> wrote:

> The default limit is 16mb, but realistically you should try to keep writes
> under 10mb, breaking up large values into multiple columns/rows if
> necessary.
>
>
> On Thu, Jul 18, 2013 at 4:31 PM, Andrey Ilinykh <ailin...@gmail.com> wrote:
>
>> There is a limit on the Thrift message size
>> (thrift_max_message_length_in_mb); by default it is 64m if I'm not
>> mistaken. This is your limit.
>>
>>
>> On Thu, Jul 18, 2013 at 2:03 PM, hajjat <haj...@purdue.edu> wrote:
>>
>>> Hi,
>>>
>>> Is there a recommended data size for Reads/Writes in Cassandra? I tried
>>> inserting 10 MB objects and the latency I got was pretty high. Also, I was
>>> never able to insert larger objects (say 50 MB) since Cassandra kept
>>> crashing when I tried that.
>>>
>>> Here is my experiment setup:
>>> I used two Large VMs in EC2 within the same data-center. Inserts have ALL
>>> consistency (strong consistency).  The latencies were as follows:
>>> Data size:      10 MB           1 MB            100 Bytes
>>> Latency:        250ms           50ms            8ms
>>>
>>> I've also done the same for two Large VMs across two data-centers. The
>>> latencies were around:
>>> Data size:      10 MB           1 MB            100 Bytes
>>> Latency:        1200ms          800ms           80ms
>>>
>>> 1) Isn't the 10 MB latency extremely high?
>>> 2) Is there a recommended data size to use with Cassandra (e.g., a few
>>> bytes up to 1 MB)?
>>> 3) Also, I tried inserting 50 MB of data but Cassandra kept crashing. Does
>>> anybody know why? I thought the max data size should be up to 2 GB?
>>>
>>> Thanks,
>>> Mohammad
>>>
>>> PS. Here is my python code I use to insert into Cassandra. I put my
>>> stopwatch timers around the insert statement:
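>>>     import pycassa
>>>     from pycassa.pool import ConnectionPool
>>>     from pycassa.columnfamily import ColumnFamily
>>>
>>>     # Read the test payload as raw bytes and close the file afterwards
>>>     with open(TEST_FILE, 'rb') as fh:
>>>         data = fh.read()
>>>
>>>     POOL = ConnectionPool(keyspace, server_list=['localhost:9160'],
>>>                           timeout=None)
>>>     USER = ColumnFamily(POOL, 'User')
>>>     # The stopwatch timers go around this insert
>>>     USER.insert('Ali', {'data': data},
>>>                 write_consistency_level=pycassa.cassandra.ttypes.ConsistencyLevel.ALL)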
>>>
>>>
>>>
>>>
>>> --
>>> View this message in context:
>>> http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Recommended-data-size-for-Reads-Writes-in-Cassandra-tp7589141.html
>>> Sent from the cassandra-u...@incubator.apache.org mailing list archive
>>> at Nabble.com.
>>>
>>
>>
>
>
> --
> Tyler Hobbs
> DataStax <http://datastax.com/>
>



-- 
*Mohammad Hajjat*
*Ph.D. Student*
*Electrical and Computer Engineering*
*Purdue University*
