Large writes can sometimes put a lot of heap/GC pressure on the node, which can be an additional source of latency. Use the query tracing in Cassandra 1.2+ to get a better picture of where the latency is.
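
For the earlier suggestion in this thread about breaking large values into multiple columns, a minimal sketch in Python could look like the following. The helper names (split_into_columns, join_columns), the "data_" column prefix, and the 1 MB chunk size are assumptions made for illustration only; nothing here is a pycassa or Cassandra API, and the chunk size is not a tuned recommendation.

```python
def split_into_columns(data, chunk_size=1024 * 1024, prefix="data_"):
    """Split one large bytes value into several smaller column values.

    Column names are zero-padded (data_000000, data_000001, ...) so that
    sorting the names restores the original chunk order.
    chunk_size and prefix are illustrative assumptions.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    columns = {}
    for index, offset in enumerate(range(0, len(data), chunk_size)):
        columns["%s%06d" % (prefix, index)] = data[offset:offset + chunk_size]
    return columns


def join_columns(columns, prefix="data_"):
    """Reassemble the original value from the chunk columns."""
    names = sorted(name for name in columns if name.startswith(prefix))
    return b"".join(columns[name] for name in names)


if __name__ == "__main__":
    payload = b"x" * (10 * 1024 * 1024)  # stand-in for a 10 MB object
    columns = split_into_columns(payload)
    # Every chunk is at most chunk_size bytes and the value round-trips.
    assert max(len(chunk) for chunk in columns.values()) <= 1024 * 1024
    assert join_columns(columns) == payload
```

With the ColumnFamily from the original post, each chunk could then be written in its own call, for example one USER.insert('Ali', {name: chunk}) per entry, so that no single request carries the whole 10 MB value. Whether this actually lowers latency in a given setup is something the tracing output would have to confirm.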

On Thu, Jul 18, 2013 at 6:18 PM, Mohammad Hajjat <haj...@purdue.edu> wrote:

> Thanks Andrey and Tyler! That was useful :)
>
> Do you guys have any idea why the 10 MB writes took a lot of time in my
> case although I'm using Large VMs which have plenty of resources? Or do you
> think this latency is expected?
> I'm trying to see how much time is spent in the network versus processing
> CPU cycles of the nodes; any suggestion for a good profiling tool?
>
> On Thu, Jul 18, 2013 at 5:50 PM, Tyler Hobbs <ty...@datastax.com> wrote:
>
>> The default limit is 16mb, but realistically you should try to keep
>> writes under 10mb, breaking up large values into multiple columns/rows if
>> necessary.
>>
>> On Thu, Jul 18, 2013 at 4:31 PM, Andrey Ilinykh <ailin...@gmail.com> wrote:
>>
>>> there is a limit of thrift message ( thrift_max_message_length_in_mb),
>>> by default it is 64m if I'm not mistaken. This is your limit.
>>>
>>> On Thu, Jul 18, 2013 at 2:03 PM, hajjat <haj...@purdue.edu> wrote:
>>>
>>>> Hi,
>>>>
>>>> Is there a recommended data size for Reads/Writes in Cassandra? I tried
>>>> inserting 10 MB objects and the latency I got was pretty high. Also, I
>>>> was never able to insert larger objects (say 50 MB) since Cassandra kept
>>>> crashing when I tried that.
>>>>
>>>> Here is my experiment setup:
>>>> I used two Large VMs in EC2 within the same data-center. Inserts have
>>>> ALL consistency (strong consistency). The latencies were as follows:
>>>>
>>>> Data size:   10 MB     1 MB     100 Bytes
>>>> Latency:     250ms     50ms     8ms
>>>>
>>>> I've also done the same for two Large VMs across two data-centers. The
>>>> latencies were around:
>>>>
>>>> Data size:   10 MB     1 MB     100 Bytes
>>>> Latency:     1200ms    800ms    80ms
>>>>
>>>> 1) Ain't the 10 MB latency extremely high?
>>>> 2) Is there a recommended data size to use with Cassandra (e.g., a few
>>>>    bytes up to 1 MB)?
>>>> 3) Also, I tried inserting 50 MB data but Cassandra kept crashing. Does
>>>>    anybody know why? I thought the max data size should be up to 2 GB?
>>>>
>>>> Thanks,
>>>> Mohammad
>>>>
>>>> PS. Here is my python code I use to insert into Cassandra. I put my
>>>> stopwatch timers around the insert statement:
>>>>
>>>> fh = open(TEST_FILE,'r')
>>>> data = str(fh.read())
>>>>
>>>> POOL = ConnectionPool(keyspace, server_list=['localhost:9160'],
>>>>                       timeout=None)
>>>> USER = ColumnFamily(POOL, 'User')
>>>> USER.insert('Ali', {'data': data},
>>>>             write_consistency_level=pycassa.cassandra.ttypes.ConsistencyLevel.ALL)
>>>>
>>>> --
>>>> View this message in context:
>>>> http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Recommended-data-size-for-Reads-Writes-in-Cassandra-tp7589141.html
>>>> Sent from the cassandra-u...@incubator.apache.org mailing list archive
>>>> at Nabble.com.
>>
>> --
>> Tyler Hobbs
>> DataStax <http://datastax.com/>
>
> --
> *Mohammad Hajjat*
> *Ph.D. Student*
> *Electrical and Computer Engineering*
> *Purdue University*

--
Tyler Hobbs
DataStax <http://datastax.com/>