Re: Cassandra - Spark - Flume: best architecture for log analytics.

2015-07-23 Thread Ipremyadav
Though DSE cassandra comes with hadoop integration, this is clearly a use case for hadoop. Any reason why cassandra is your first choice? > On 23 Jul 2015, at 6:12 a.m., Pierre Devops wrote: > > Cassandra is not very good at massive read/bulk read if you need to retrieve > and compute a la

Slow performance because of used-up "Waste" in AtomicBTreeColumns

2015-07-23 Thread Petter. Andreas
Hello everyone, we are experiencing performance issues with Cassandra overloading effects (dropped mutations and node drop-outs) with the following workload: create table test (year bigint, spread bigint, time bigint, batchid bigint, value set, primary key ((year, spread), time, batchid)) inser

Re: Cassandra - Spark - Flume: best architecture for log analytics.

2015-07-23 Thread Edward Ribeiro
Disclaimer: I have worked for DataStax. Cassandra is fairly good for log analytics and has been used many places for that ( https://www.usenix.org/conference/lisa14/conference-program/presentation/josephsen ). Of course, requirements vary from place to place, but it has been a good fit. Spark and

Manual Indexing With Buckets

2015-07-23 Thread Anuj Wadehra
We have a primary table and we need search capability by batchid column. So we are creating a manual index for search by batch id. We are using buckets to restrict a row size in batch id index table to 50mb. As batch size may vary drastically ( ie one batch id may be associated to 100k row keys
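
The bucketing idea in this post can be sketched as follows. This is a minimal illustration, not the poster's implementation: it assumes each index entry has a roughly fixed size, so that the 50 MB cap from the post translates into a maximum entry count per bucket. `ENTRY_SIZE_BYTES`, `entries_per_bucket`, and `index_partition` are hypothetical names introduced here.

```python
# The 50 MB cap comes from the post; the per-entry size is an assumption.
MAX_BUCKET_BYTES = 50 * 1024 * 1024
ENTRY_SIZE_BYTES = 512  # assumed average size of one index entry


def entries_per_bucket(max_bytes=MAX_BUCKET_BYTES, entry_bytes=ENTRY_SIZE_BYTES):
    """Number of index entries that fit in one bucket under the size cap."""
    return max_bytes // entry_bytes


def index_partition(batch_id, entry_seq, per_bucket=None):
    """Return the (batch_id, bucket) partition for the entry_seq-th row key of a batch.

    entry_seq is a zero-based running count of row keys already indexed
    for this batch id; the writer is assumed to track it.
    """
    if per_bucket is None:
        per_bucket = entries_per_bucket()
    return (batch_id, entry_seq // per_bucket)
```

Under these assumptions a small batch stays in bucket 0, while a very large batch spills into buckets 1, 2, and so on, so no single index row grows past the cap. A reader would then query the buckets of a batch id in order.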

Best Practise for Updating Index and Reporting Tables

2015-07-23 Thread Anuj Wadehra
We have a transaction table,3 manually created index tables and few tables for reporting.  One option is to go for atomic batch mutations so that for each transaction every index table and other reporting tables are updated synchronously.  Other option is to update other tables async, there m

Re: Slow performance because of used-up "Waste" in AtomicBTreeColumns

2015-07-23 Thread Graham Sanderson
Multiple writes to a single partition key are guaranteed to be atomic. Therefore there has to be some protection. First rule of thumb, don’t write at insanely high rates to the same partition key concurrently (you can probably avoid this, but hints as currently implemented suffer because the p

Re: Can't connect to Cassandra server

2015-07-23 Thread Chamila Wijayarathna
Hi Peer, I changed cassandra-env.sh and following are the parameters I used: MAX_HEAP_SIZE="8G" HEAP_NEWSIZE="1600M" But I am still unable to start the server properly. This time, however, system.log has slightly different logs. https://gist.github.com/cdwijayarathna/75f65a34d9e71829adaa Any idea on how

Re: Best Practise for Updating Index and Reporting Tables

2015-07-23 Thread Robert Wille
My guess is that you don’t understand what an atomic batch is, given that you used the phrase “updated synchronously”. Atomic batches do not provide isolation, and do not guarantee immediate consistency. The only thing an atomic batch guarantees is that all of the statements in the batch will eve

Re: Can't connect to Cassandra server

2015-07-23 Thread Surbhi Gupta
What is the output you are getting if you are issuing nodetool status command ... On 23 July 2015 at 11:30, Chamila Wijayarathna wrote: > Hi Peer, > > I changed cassandra-env.sh and following are the parameters I used,' > > MAX_HEAP_SIZE="8G" > HEAP_NEWSIZE="1600M" > > But I am still unable to s

Issues with SSL encrption after updating to 2.2.0 from 2.1.6

2015-07-23 Thread Carlos Scheidecker
Hello all, After updating to Cassandra 2.2.0 from 2.1.6 I am having SSL issues: My JVM is java version "1.8.0_45" Java(TM) SE Runtime Environment (build 1.8.0_45-b14) Java HotSpot(TM) 64-Bit Server VM (build 25.45-b02, mixed mode) Ubuntu 14.04.2 LTS is on all nodes, they are the same. Below i

Re: Schema questions for data structures with recently-modified access patterns

2015-07-23 Thread Robert Wille
Carlos’ suggestion (nor yours) didn’t provide a way to query recently-modified documents. His updated suggestion provides a way to get recently-modified documents, but not ordered. On Jul 22, 2015, at 4:19 PM, Jack Krupansky wrote: "No way to query rec

Re: Schema questions for data structures with recently-modified access patterns

2015-07-23 Thread Jack Krupansky
Maybe you could explain in more detail what you mean by recently modified documents, since that is precisely what I thought I suggested with descending ordering. -- Jack Krupansky On Thu, Jul 23, 2015 at 3:40 PM, Robert Wille wrote: > Carlos’ suggestion (nor yours) didn’t provide a way

Re: Issues with SSL encrption after updating to 2.2.0 from 2.1.6

2015-07-23 Thread Robert Coli
On Thu, Jul 23, 2015 at 12:40 PM, Carlos Scheidecker wrote: > After updating to Cassandra 2.2.0 from 2.1.6 I am having SSL issues: > If you aren't the other guy, you are the second report of this issue. You should file a JIRA on issues.apache.org, after searching to see if someone already has.

Re: Issues with SSL encrption after updating to 2.2.0 from 2.1.6

2015-07-23 Thread Carlos Scheidecker
OK, I can try that. I haven't filed a JIRA issue yet, so that was not me. I had also tried installing the unrestricted JCE for Java 8, and the error has changed. http://www.oracle.com/technetwork/java/javase/downloads/jce8-download-2133166.html From: java.lang.NullPointerException: null at com.google.common

Re: Issues with SSL encrption after updating to 2.2.0 from 2.1.6

2015-07-23 Thread Carlos Scheidecker
Here it is, Robert, thanks! https://issues.apache.org/jira/browse/CASSANDRA-9884 On Thu, Jul 23, 2015 at 2:13 PM, Robert Coli wrote: > On Thu, Jul 23, 2015 at 12:40 PM, Carlos Scheidecker > wrote: > >> After updating to Cassandra 2.2.0 from 2.1.6 I am having SSL issues: >> > > If you aren't th

Reduced write performance when reading

2015-07-23 Thread Soerian Lieve
Hi, I am currently performing benchmarks on Cassandra. Independently from each other I am seeing ~100k writes/sec and ~50k reads/sec. When I read and write at the same time, writing drops down to ~1000 writes/sec and reading stays roughly the same. The heap used is the same as when only reading,

Re: Schema questions for data structures with recently-modified access patterns

2015-07-23 Thread Robert Wille
I obviously worded my original email poorly. I guess that’s what happens when you post at the end of the day just before quitting. I want to get a list of documents, ordered from most-recently modified to least-recently modified, with each document appearing exactly once. Jack, your schema does
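
The requirement stated here (documents ordered from most-recently to least-recently modified, each appearing exactly once) can be illustrated with a small client-side sketch. This shows only the desired result, assuming a hypothetical log of `(doc_id, modified_at)` pairs in which a document may appear several times; it is not a Cassandra schema and not any poster's proposal.

```python
def most_recent_first(modifications):
    """Order document ids by their latest modification, newest first.

    modifications: iterable of (doc_id, modified_at) pairs; a document
    may appear many times. Each doc_id appears once in the result.
    """
    latest = {}
    for doc_id, modified_at in modifications:
        # Keep only the newest timestamp seen for each document.
        if doc_id not in latest or modified_at > latest[doc_id]:
            latest[doc_id] = modified_at
    return sorted(latest, key=lambda d: latest[d], reverse=True)
```

For example, given the log `[("a", 1), ("b", 2), ("a", 3), ("c", 0)]`, document `a` is listed once, at the position of its latest modification, ahead of `b` and `c`.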

Re: Reduced write performance when reading

2015-07-23 Thread Jeff Ferland
My immediate guess: your transaction logs are on the same media as your sstables and your OS prioritizes read requests. -Jeff > On Jul 23, 2015, at 2:51 PM, Soerian Lieve wrote: > > Hi, > > I am currently performing benchmarks on Cassandra. Independently from each > other I am seeing ~100k w

Re: Reduced write performance when reading

2015-07-23 Thread Soerian Lieve
I set up RAID0 after experiencing highly imbalanced disk usage with a JBOD setup so my transaction logs are indeed on the same media as the sstables. Is there any alternative to setting up RAID0 that doesn't have this issue? On Thu, Jul 23, 2015 at 4:03 PM, Jeff Ferland wrote: > My immediate gue

Re: Reduced write performance when reading

2015-07-23 Thread Jeff Ferland
Imbalanced disk use is ok in itself. It’s only saturated throughput that’s harmful. RAID 0 does give more consistent throughput and balancing, but that’s another story. As for your situation with SSD drive, you can probably tweak this by setting the scheduler to noop, or read up on htt

Re: Schema questions for data structures with recently-modified access patterns

2015-07-23 Thread Jack Krupansky
Concurrent update should not be problematic. Duplicate entries should not be created. If it appears to be, explain your apparent issue so we can see whether it is a real issue. But at least from all of the details you have disclosed so far, there does not appear to be any indication that this type