Re: Corrupted sstable and sstableloader

2013-07-21 Thread Jan Kesten
On 18.07.2013 19:19, Robert Coli wrote: Why not just determine which SSTable is corrupt, remove it from the restore set, then run a repair when you're done to be totally sure all data is on all nodes? This is what I finally did. It took some work, since sstableloader just stopped with

memtable overhead

2013-07-21 Thread Darren Smythe
Hi, How much overhead (in heap MB) does an empty memtable use? If I have many column families that aren't written to often, how much memory do these take up? TIA -- Darren

Re: Socket buffer size

2013-07-21 Thread Mohammad Hajjat
For (rpc_send_buff_size_in_bytes), I was able to try many values of this parameter. However, whenever I tried to configure (internode_send_buff_size_in_bytes), Cassandra kept crashing. Has anyone tried configuring the (internode_send_buff_size_in_bytes) parameter? Here is the Traceback (most recen

Re: CPU Bound Writes

2013-07-21 Thread Mohammad Hajjat
Aaron, here is the source: http://www.datastax.com/docs/0.8/cluster_architecture/cluster_planning Thanks! On Sun, Jul 21, 2013 at 4:57 PM, aaron morton wrote: > > Wouldn't this make Writes disk-bound then? I think the documentation may > have been a bit misleading then "Insert-heavy workloads w

Are Writes disk-bound rather than CPU-bound?

2013-07-21 Thread hajjat
“Insert-heavy workloads will actually be CPU-bound in Cassandra before being memory-bound” However, from reading the documentation (http://www.datastax.com/docs/1.2/dml/about_writes) it seems the disk is the real bottleneck in Writes rather than the CPU. This is because everything is *first* wri

Re: CPU Bound Writes

2013-07-21 Thread aaron morton
> Wouldn't this make Writes disk-bound then? I think the documentation may have > been a bit misleading then "Insert-heavy workloads will actually be CPU-bound > in Cassandra before being memory-bound"? What is the source of the quote ? Cheers - Aaron Morton Cassandra Consultant

Re: increasing replication level in live DC

2013-07-21 Thread aaron morton
background http://wiki.apache.org/cassandra/Operations#Replication Try: * Assuming you are on CL one now, there should be little impact. * to reduce impact set read repair chance on the CF's to 0 * check the stream throughput and compaction throughput in yaml, assume you have * increase the RF *

Re: Auto Discovery of Hosts by Clients

2013-07-21 Thread aaron morton
Give the app the same nodes you have in the seed lists. Cheers - Aaron Morton Cassandra Consultant New Zealand @aaronmorton http://www.thelastpickle.com On 20/07/2013, at 9:32 AM, sankalp kohli wrote: > With Auto discovery, you can provide the DC you are local to and it will

Re: How to avoid inter-dc read requests

2013-07-21 Thread sankalp kohli
Slice query does not trigger background read repair. Implement Read Repair on Range Queries On Sun, Jul 21, 2013 at 1:40 PM, sankalp kohli wrote: > There can be multiple reasons for that > 1) Background read repairs. > 2) Your data is not cons

Re: NPE in CompactionExecutor

2013-07-21 Thread aaron morton
What version are you running? > ERROR [CompactionExecutor:38] 2013-07-19 17:01:34,494 CassandraDaemon.java > (line 192) Exception in thread Thread[CompactionExecutor:38,1,main] > java.lang.NullPointerException What's the full error stack? > Not sure if this is related or not, but I'm also get

Re: How to avoid inter-dc read requests

2013-07-21 Thread sankalp kohli
There can be multiple reasons for that: 1) Background read repairs. 2) Your data is not consistent, which leads to read repairs. 3) For writes, irrespective of the consistency level used, a single write request will go to the other DC. 4) You might be running other nodetool commands like repair. read_repair_cha

Re: CL1 and CLQ with 5 nodes cluster and 3 alives node

2013-07-21 Thread aaron morton
> I'm experiencing some problems after 3 years of cassandra in production (from > 0.6 to 1.0.6) -- for 2 times in 3 weeks 2 nodes crashed with OutOfMemory > Exception. Take a look at how many rows you have and the size of the bloom filters. You may have grown :) If you have more than 500 million
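
To get a feel for why the row count matters for bloom filter size, here is a minimal sketch using the textbook Bloom filter sizing formula. It is the generic formula, not Cassandra's own allocation logic, and the 1% false-positive rate is an assumed figure for illustration, not something taken from the thread.

```python
import math


def bloom_filter_bits(n_keys: int, fp_rate: float) -> float:
    """Textbook optimal Bloom filter size in bits: m = -n * ln(p) / (ln 2)^2."""
    if n_keys < 0:
        raise ValueError("n_keys must be non-negative")
    if not 0.0 < fp_rate < 1.0:
        raise ValueError("fp_rate must be strictly between 0 and 1")
    return -n_keys * math.log(fp_rate) / (math.log(2) ** 2)


def bloom_filter_mib(n_keys: int, fp_rate: float) -> float:
    """Same estimate expressed in MiB (bits -> bytes -> MiB)."""
    return bloom_filter_bits(n_keys, fp_rate) / 8 / (1024 * 1024)


if __name__ == "__main__":
    rows = 500_000_000          # row count mentioned in the reply
    assumed_fp_rate = 0.01      # assumption for illustration only
    print(f"~{bloom_filter_mib(rows, assumed_fp_rate):.0f} MiB per replica of the filter")
```

The estimate grows linearly with the number of rows, which is why a cluster that has "grown" can run into heap pressure without any configuration change.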

Re: Exception while writing compsite column names

2013-07-21 Thread Nate McCall
Use MutatorImpl or ColumnFamilyTemplate API. Examples respectively: https://github.com/zznate/cassandra-tutorial/blob/master/src/main/java/com/datastax/tutorial/composite/CompositeDataLoader.java http://hector-client.github.io/hector/build/html/content/getting_started.html#update The approach you t

Re: Recommended data size for Reads/Writes in Cassandra

2013-07-21 Thread aaron morton
> Do you guys have any idea why the 10 MB writes took a lot of time in my case > although I'm using Large VMs which have plenty of resources? If you are talking about m1.large, IMHO they are underpowered; at a minimum you should be using m1.xlarge. Cheers - Aaron Morton Cassan

Re: Huge query Cassandra limits

2013-07-21 Thread aaron morton
> The combination that performed best was querying for 500 rows at a time > with 1000 columns, while different combinations, such as 125 rows for 4000 > columns or 1000 rows for 500 columns, were about 15% slower. I would rarely go above 100 rows, especially if you are asking for 1000 colum
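
The "rarely go above 100 rows" advice amounts to splitting a large key set into small batches on the client side. A minimal sketch of that batching follows; the function name `batches` and the default of 100 are illustrative, and the actual read call is left out because it depends on the client library in use.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def batches(keys: Iterable[str], size: int = 100) -> Iterator[List[str]]:
    """Yield successive lists of at most `size` row keys."""
    if size < 1:
        raise ValueError("size must be at least 1")
    it = iter(keys)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


if __name__ == "__main__":
    all_keys = [f"row-{i}" for i in range(250)]
    for chunk in batches(all_keys):
        # A real client would issue one multi-row read per chunk here.
        print(len(chunk))
```

Keeping the batch size as a parameter makes it easy to measure a few values against your own column counts rather than fixing one number up front.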

Re: Pig load data with cassandrastorage and slice filter param

2013-07-21 Thread aaron morton
It's easier for people to help if you can give an example of your Column Family, what you have tried, what the output was and what you expected. > > grunt> rows = LOAD > > 'cassandra://MyKeyspace/MyColumnFamily?slice_start=C2&slice_end=C4&limit=1&reversed=true' > > USING CassandraStorage(); Appear

Re: How to avoid inter-dc read requests

2013-07-21 Thread Omar Shibli
One more thing, I'm doing a lot of key slice read requests, is that supposed to change anything? On Sun, Jul 21, 2013 at 8:21 PM, Omar Shibli wrote: > I'm seeing a lot of inter-dc read requests, although I've followed > DataStax guidelines for multi-dc deployment > http://www.datastax.com/dev/blo

How to avoid inter-dc read requests

2013-07-21 Thread Omar Shibli
I'm seeing a lot of inter-dc read requests, although I've followed DataStax guidelines for multi-dc deployment http://www.datastax.com/dev/blog/deploying-cassandra-across-multiple-data-centers Here is my setup: 2 data centers within the same region (AWS) Targeting DC, RP 3, 6 nodes Analytic DC, RP

Re: funnel analytics, how to query for reports etc.

2013-07-21 Thread Vladimir Prudnikov
This can be done easily. Use a normal column family to store the sequence of events, where the key is a session ID identifying one user interaction with a website, the column names are TimeUUID values and the column value is the id of the event (do not write something like "user added product to shopping cart", something
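
The layout described above (session ID as row key, TimeUUID column names, event id as column value) can be modelled in memory to see how a funnel report would read it. This is only a sketch: a plain dict stands in for the column family, Python's version 1 UUIDs stand in for TimeUUIDs, and the event ids and the `funnel_counts` helper are assumptions made for illustration, not part of the original post.

```python
import uuid
from collections import defaultdict

# Hypothetical event ids; short ids are stored instead of descriptive strings.
VIEW_PRODUCT, ADD_TO_CART, CHECKOUT = 1, 2, 3


def record_event(store, session_id, event_id, when=None):
    """Append one column (time_uuid -> event_id) to the session's row."""
    column_name = when if when is not None else uuid.uuid1()
    store[session_id].append((column_name, event_id))


def events_in_order(store, session_id):
    """Return the session's event ids sorted by the UUID timestamp."""
    return [ev for _, ev in sorted(store[session_id], key=lambda c: c[0].time)]


def funnel_counts(store, steps):
    """Count the sessions that reached each funnel step, in order."""
    counts = [0] * len(steps)
    for session_id in store:
        pos = 0
        for ev in events_in_order(store, session_id):
            if pos < len(steps) and ev == steps[pos]:
                counts[pos] += 1
                pos += 1
    return counts


if __name__ == "__main__":
    store = defaultdict(list)
    record_event(store, "session-a", VIEW_PRODUCT)
    record_event(store, "session-a", ADD_TO_CART)
    record_event(store, "session-b", VIEW_PRODUCT)
    print(funnel_counts(store, [VIEW_PRODUCT, ADD_TO_CART, CHECKOUT]))
```

Because the columns sort by time, one read of a session row returns its events already in funnel order, and the report only has to walk each row once.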

Cassandra 2.0 - AssertionError in ArrayBackedSortedColumns

2013-07-21 Thread Soumava Ghosh
Hi, I'm taking a look at the check-and-set functionality offered by the cas() API in Cassandra 2.0 (the code available on git). I'm running a few tests on a small cluster (replication factor 3, consistency level quorum) with a few clients. I've observed that a lot of cases seem to hit