Re: Possible problem with disk latency

2015-02-26 Thread Ja Sam
Hi, I found many similar lines in the log: INFO [SlabPoolCleaner] 2015-02-24 12:28:19,557 ColumnFamilyStore.java:850 - Enqueuing flush of customer_events: 95299485 (5%) on-heap, 0 (0%) off-heap INFO [MemtableFlushWriter:1465] 2015-02-24 12:28:19,569 Memtable.java:339 - Writing Memtable-customer_eve

Re: Possible problem with disk latency

2015-02-26 Thread Roland Etzenhammer
Hi, an 8GB heap is a good value already - going above 8GB will often result in noticeable GC pause times in Java, but you can give 12G a try just to see if that helps (and turn it back down again). You can add a "Heap Used" graph in OpsCenter to get a quick overview of your heap state. Best reg

Re: Possible problem with disk latency

2015-02-26 Thread Ja Sam
Hi Ron, I looked deeper into my Cassandra files, and the SSTables created during the last day are smaller than 20MB. Piotrek P.S. Your tips are really useful; at least I am starting to find where exactly the problem is. On Thu, Feb 26, 2015 at 3:11 PM, Ja Sam wrote: > We did this query, most our files are l

Re: Possible problem with disk latency

2015-02-26 Thread Ja Sam
We ran this query; most of our files are smaller than 100MB. Our heap settings are as follows (they are calculated by the script in cassandra.env): MAX_HEAP_SIZE="8GB" HEAP_NEWSIZE="2GB" which is the maximum recommended by DataStax. What values do you think we should try? On Thu, Feb 26, 2015 at 10:06 AM, Rol
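
For readers wondering how such values can come out of a script: the sketch below follows the sizing rule as I understand it from the stock cassandra-env.sh shipped with Cassandra 2.1 (heap = max(min(RAM/2, 1 GB), min(RAM/4, 8 GB)), new generation = min(100 MB per core, heap/4)). The function name and the example machine sizes are hypothetical and do not come from this thread; check the script on your own nodes before relying on it.

```python
def default_heap_sizes_mb(system_memory_mb, cpu_cores):
    """Approximate the automatic heap sizing rule (assumption: stock 2.1 script).

    Returns (max_heap_mb, heap_newsize_mb).
    """
    # Half of RAM, capped at 1 GB.
    half = min(system_memory_mb // 2, 1024)
    # A quarter of RAM, capped at 8 GB.
    quarter = min(system_memory_mb // 4, 8192)
    # The larger of the two candidates becomes the maximum heap size.
    max_heap = max(half, quarter)
    # Young generation: 100 MB per core, but never more than a quarter of the heap.
    new_size = min(100 * cpu_cores, max_heap // 4)
    return max_heap, new_size


if __name__ == "__main__":
    # Hypothetical machines, for illustration only.
    for ram_mb, cores in [(65536, 8), (65536, 24), (2048, 2)]:
        heap, young = default_heap_sizes_mb(ram_mb, cores)
        print(f"RAM={ram_mb}MB cores={cores} -> heap={heap}MB new={young}MB")
```

If this rule applies, an 8 GB heap with a 2 GB new generation would correspond to a machine with at least 32 GB of RAM and 21 or more cores; on a smaller machine those values would have been set by hand.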

Re: Possible problem with disk latency

2015-02-26 Thread Roland Etzenhammer
Hi Piotrek, your disks are mostly idle as far as I can see (the one that is 17% busy isn't under much load). One thing that came to my mind: did you look at the sizes of your SSTables? I did this with something like find /var/lib/cassandra/data -type f -size -1k -name "*Data.db" | wc find /var/lib/c
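
A rough Python sketch in the spirit of the `find` commands above: it walks a data directory and counts `*Data.db` files per size bucket. The bucket thresholds and function names are hypothetical choices for illustration, and the counts will not match `find -size` exactly, because `find` rounds sizes up to whole units.

```python
import os
from collections import Counter

# (exclusive upper bound in bytes, label); thresholds are illustrative assumptions.
BUCKETS = [
    (1024, "< 1 KiB"),
    (1024 ** 2, "< 1 MiB"),
    (100 * 1024 ** 2, "< 100 MiB"),
]


def bucket_for(size_bytes):
    """Return the label of the first bucket whose bound exceeds the size."""
    for limit, label in BUCKETS:
        if size_bytes < limit:
            return label
    return ">= 100 MiB"


def sstable_size_histogram(data_dir):
    """Count SSTable data files (names ending in Data.db) per size bucket."""
    counts = Counter()
    for root, _dirs, files in os.walk(data_dir):
        for name in files:
            if name.endswith("Data.db"):
                size = os.path.getsize(os.path.join(root, name))
                counts[bucket_for(size)] += 1
    return counts


if __name__ == "__main__":
    # The path is the one used in the message above; adjust it for your install.
    for label, count in sstable_size_histogram("/var/lib/cassandra/data").items():
        print(f"{label}: {count}")
```

A large share of files in the smallest buckets would suggest that many tiny SSTables are being flushed and not compacted away.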

Re: Possible problem with disk latency

2015-02-25 Thread Roni Balthazar
Hi, Check how many active CompactionExecutors are showing in "nodetool tpstats". Maybe your concurrent_compactors is too low. Enforce 1 per CPU core, even if it's the default value on 2.1. Some of our nodes were running with 2 compactors, but we have an 8 core CPU... After that monitor your nodes to b

Re: Possible problem with disk latency

2015-02-25 Thread Ja Sam
Hi, One more thing. Hinted Handoff over the last week was less than 5 for all nodes. For me every READ is a problem because it must open too many files (3 SSTables), which shows up as errors in reads, repairs, etc. Regards Piotrek On Wed, Feb 25, 2015 at 8:32 PM, Ja Sam wrote: > Hi, > It is not o

Re: Possible problem with disk latency

2015-02-25 Thread Ja Sam
Hi, It is not obvious, because data is replicated to the second data center. We check it "manually" for random records we put into Cassandra and we find all of them in the secondary DC. We know about every single GC failure, but this doesn't change anything. The problem with GC failure is only one: restart

Re: Possible problem with disk latency

2015-02-25 Thread daemeon reiydelle
I think you may have a vicious circle of errors: because your data is not properly replicated to the neighbour, it is not replicating to the secondary data center (yeah, obvious). I would suspect the GC errors are (also obviously) the result of a backlog of compactions that take out the neighbour (

Re: Possible problem with disk latency

2015-02-25 Thread Ja Sam
Hi Roni, The repair result is as follows (we ran it on Friday): Cannot proceed on repair because a neighbor (/192.168.61.201) is dead: session failed But to be honest the neighbor did not die. It seemed to trigger a series of full GC events on the initiating node. The results from the logs are: [2015-0

Re: Possible problem with disk latency

2015-02-25 Thread Roni Balthazar
Hi Piotr, Are your repairs finishing without errors? Regards, Roni Balthazar On 25 February 2015 at 15:43, Ja Sam wrote: > Hi, Roni, > They aren't exactly balanced but as I wrote before they are in range from > 2500-6000. > If you need exactly data I will check them tomorrow morning. But all n

Re: Possible problem with disk latency

2015-02-25 Thread Ja Sam
Hi Roni, They aren't exactly balanced, but as I wrote before they are in the range 2500-6000. If you need exact data I will check them tomorrow morning. But all nodes in AGRAF have had a small increase in pending compactions during the last week, which is the "wrong direction". I will check in the morning get

Re: Possible problem with disk latency

2015-02-25 Thread Roni Balthazar
Hi Piotr, What about the nodes on AGRAF? Are the pending tasks balanced between this DC's nodes as well? You can check the pending compactions on each node. Also try to run "nodetool getcompactionthroughput" on all nodes and check if the compaction throughput is set to 999. Cheers, Roni Balthazar

Re: Possible problem with disk latency

2015-02-25 Thread Ja Sam
Hi Roni, It is not balanced. As I wrote to you last week, I have problems only in the DC to which we write (on the screenshot it is named AGRAF: https://drive.google.com/file/d/0B4N_AbBPGGwLR21CZk9OV1kxVDA/view). The problem is on ALL nodes in this DC. In the second DC (ZETO) only one node has more than 30 SSTab

Re: Possible problem with disk latency

2015-02-25 Thread Roni Balthazar
Hi Ja, How are the pending compactions distributed between the nodes? Run "nodetool compactionstats" on all of your nodes and check if the pending tasks are balanced or if they are concentrated on only a few nodes. You can also check if the SSTable count is balanced by running "nodetool cfstats" on y

Re: Possible problem with disk latency

2015-02-25 Thread Ja Sam
I do NOT have SSDs. I have normal HDDs grouped as JBOD. My CFs use SizeTieredCompactionStrategy. I am using local quorum for reads and writes. To be precise, I have a lot of writes and almost 0 reads. I changed "cold_reads_to_omit" to 0.0 as someone suggested. I set the compaction throughput to 999. So i

Re: Possible problem with disk latency

2015-02-25 Thread Nate McCall
> > If You could be so kind and validate above and give me an answer is my > disk are real problems or not? And give me a tip what should I do with > above cluster? Maybe I have misconfiguration? > > > Your disks are effectively idle. What consistency level are you using for reads and writes? Actua

Re: Possible problem with disk latency

2015-02-25 Thread Ja Sam
I read that I shouldn't install a version whose last number is less than 6. But I started with 2.1.0. Then I upgraded to 2.1.3. As far as I know, I cannot downgrade it. On Wed, Feb 25, 2015 at 12:05 PM, Carlos Rolo wrote: > Your latency doesn't seem that high that can cause that problem. I suspect > more of a

Re: Possible problem with disk latency

2015-02-25 Thread Carlos Rolo
Your latency doesn't seem high enough to cause that problem. I suspect a problem with the Cassandra version (2.1.3) more than one with the hard drives. I didn't look deep into the information provided but for your reference, the only time I had serious (leading to OOM and all sort of weird

Possible problem with disk latency

2015-02-25 Thread Ja Sam
Hi, I asked some questions before about problems with my C* cluster. My whole environment is described here: https://www.mail-archive.com/user@cassandra.apache.org/msg40982.html To sum up, I have thousands of SSTables in one DC and far fewer in the second. I write only to the first DC. Anyway after reading