Re: Query regarding spark on cassandra

2016-04-28 Thread Siddharth Verma
Anyway, thanks for your reply. On Thu, Apr 28, 2016 at 1:59 PM, Hannu Kröger wrote: > Ok, then I don’t understand the problem. > Hannu […]

Re: Query regarding spark on cassandra

2016-04-28 Thread Hannu Kröger
Ok, then I don’t understand the problem. Hannu > On 28 Apr 2016, at 11:19, Siddharth Verma wrote: […]

Re: Query regarding spark on cassandra

2016-04-28 Thread Siddharth Verma
Hi Hannu, Had the issue been caused by the read, the insert and delete statements would have been erroneous. "I saw the stdout from the Spark web UI, and the query along with true was printed for both the queries." The statements were correct as seen on the UI. Thanks, Siddharth Verma […]

Re: Query regarding spark on cassandra

2016-04-28 Thread Hannu Kröger
Hi, could it be a consistency level issue? If you use ONE for both reads and writes, it might be that you sometimes don't read what you just wrote. See: https://docs.datastax.com/en/cassandra/2.0/cassandra/dml/dml_config_consistency_c.html Br, Hannu […]
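To rule this out in the Spark job itself, the connector lets you raise the read and write consistency levels via Spark configuration. A minimal sketch, assuming the standard spark-cassandra-connector properties (the host, keyspace, and table names here are placeholders):

```scala
import org.apache.spark.{SparkConf, SparkContext}
import com.datastax.spark.connector._

val conf = new SparkConf()
  .setAppName("consistency-check")
  .set("spark.cassandra.connection.host", "127.0.0.1")
  // Read at LOCAL_QUORUM instead of the default LOCAL_ONE
  .set("spark.cassandra.input.consistency.level", "LOCAL_QUORUM")
  // Write at LOCAL_QUORUM too, so that R + W > RF and a read
  // is guaranteed to overlap at least one replica with the write
  .set("spark.cassandra.output.consistency.level", "LOCAL_QUORUM")

val sc = new SparkContext(conf)
val rows = sc.cassandraTable("my_keyspace", "my_table")
```

If the anomaly disappears at QUORUM, the original behaviour was very likely a CL=ONE read hitting a replica that had not yet received the write.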

Re: Query regarding spark on cassandra

2016-04-27 Thread Siddharth Verma
Edit: 1. The dc2 node has been removed; nodetool status shows only active nodes. 2. Repair has been run on all nodes. 3. Cassandra has been restarted. Still, it doesn't solve the problem. […]

Re: Query regarding spark on cassandra

2016-04-27 Thread Siddharth Verma
Hi, in case the info is useful: we are using two DCs, dc1 (3 nodes) and dc2 (1 node); however, dc2 has been down for 3-4 weeks and we haven't removed it yet. The Spark slaves run on the same machines as the Cassandra nodes; each node runs two slave instances. The Spark master is on a separate machine. If anyone could […]

Query regarding spark on cassandra

2016-04-27 Thread Siddharth Verma
Hi, I don't know if someone has faced this problem or not. I am running a job where some data is loaded from a Cassandra table. From that data, I build some insert and delete statements and execute them (using forEach). Code snippet: boolean deleteStatus = connector.openSession().execute(delete).wasApplied(); […]
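The pattern described above can be sketched as follows. This is a hypothetical reconstruction, not the poster's actual code: the keyspace, table, and column names are placeholders, and it uses `CassandraConnector.withSessionDo` (which hands out pooled sessions) rather than `openSession()` inside the loop, so sessions are not leaked per row:

```scala
import com.datastax.spark.connector._
import com.datastax.spark.connector.cql.CassandraConnector

val connector = CassandraConnector(sc.getConf)

sc.cassandraTable("my_keyspace", "my_table").foreachPartition { rows =>
  connector.withSessionDo { session =>
    rows.foreach { row =>
      val delete =
        s"DELETE FROM my_keyspace.my_table WHERE id = ${row.getInt("id")} IF EXISTS"
      // Note: wasApplied() is only meaningful for conditional (LWT / "IF ...")
      // statements; for plain statements it always returns true, so a printed
      // "true" does not by itself prove the mutation took effect on all replicas.
      val deleteStatus = session.execute(delete).wasApplied()
      println(s"$delete $deleteStatus")
    }
  }
}
```

One consequence worth noting: if the original statements were non-conditional, the `true` seen in the Spark stdout confirms only that the coordinator accepted the statement, which is consistent with Hannu's consistency-level theory.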

Query regarding filter and where in spark on cassandra

2016-03-07 Thread Siddharth Verma
Hi, while working with Spark running on top of Cassandra, I wanted to do some filtering on the data. It can be done either on the server side (a where clause when the cassandraTable query is written) or on the client side (a filter transformation on the RDD). Which of them is preferred, keeping performance and time in mind?
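The two options contrast roughly as below. This is an illustrative sketch with placeholder keyspace, table, and column names; `.where` pushes the predicate down to Cassandra, but only predicates on clustering columns (or indexed columns) can be pushed down, while `.filter` works on any column at the cost of scanning and transferring the full table:

```scala
import com.datastax.spark.connector._

// Server-side: the predicate is appended to the CQL query and evaluated by
// Cassandra, so only matching rows cross the network. Generally preferred
// when the column supports pushdown (clustering or indexed column).
val serverSide = sc.cassandraTable("my_keyspace", "events")
  .where("event_time > ?", "2016-01-01 00:00:00")

// Client-side: every row is read from Cassandra and filtered in Spark.
// Works on any column, but pays the full scan and transfer cost.
val clientSide = sc.cassandraTable("my_keyspace", "events")
  .filter(row => row.getString("status") == "ACTIVE")
```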

Re: Spark on cassandra

2015-11-13 Thread Ravi
I did a join on a single big table and it's working fine using the code you showed below. Can we do a table join on a non-partition-key column, or on a column that is not part of the primary key? Thanks, Ravi On Thu, Nov 12, 2015 at 5:41 AM DuyHai Doan wrote: […]
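As far as I know, `joinWithCassandraTable` can only join on the target table's partition key (it issues point lookups per key). For a non-key column, one fallback is to read both tables and let Spark perform an ordinary shuffle join; a hedged sketch with placeholder names:

```scala
import com.datastax.spark.connector._

// Joining on a column ("email") that is not part of either primary key:
// Cassandra cannot serve this join, so key both full scans by the column
// and use Spark's RDD join, which shuffles both sides.
val left = sc.cassandraTable("my_keyspace", "users")
  .keyBy(row => row.getString("email"))
val right = sc.cassandraTable("my_keyspace", "orders")
  .keyBy(row => row.getString("email"))

val joined = left.join(right)
```

This works on any column, but it scans both tables in full, so it is far more expensive than a partition-key join.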

Re: Spark on cassandra

2015-11-12 Thread DuyHai Doan
Hello Prem, I believe it's better to ask your question on the ML of the Spark Cassandra connector: http://groups.google.com/a/lists.datastax.com/forum/#!forum/spark-connector-user Second, regarding "we need to join multiple tables from multiple keyspaces. How can we do that?", the response is given in your ex…

Spark on cassandra

2015-11-12 Thread Prem Yadav
Hi, is it better to use the Spark API to do joins on Cassandra tables, or should we use Spark SQL? We have been struggling with Spark SQL, as we need to do multiple large-table joins and there is always a failure. I tried to do joins using the API like this: val join1 = sc.cassandraTable("Keyspace1","tabl…
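For large joins where one side can be looked up by partition key, the connector's `joinWithCassandraTable` is usually more robust than a Spark SQL shuffle join, because it fetches only the matching partitions instead of scanning and shuffling the second table. A minimal sketch, assuming `table2`'s partition-key column names match columns present in the rows read from `table1` (all names here are placeholders):

```scala
import com.datastax.spark.connector._

// Read the driving table, keep only the columns needed, then look up the
// matching partitions in table2 directly from Cassandra. No full scan or
// shuffle of table2 is performed.
val joined = sc.cassandraTable("Keyspace1", "table1")
  .select("id", "col_a")
  .joinWithCassandraTable("Keyspace1", "table2")

// Each element is a (leftRow, rightRow) pair.
joined.take(10).foreach { case (left, right) =>
  println(s"$left -> $right")
}
```

When the join key is not table2's partition key, this approach does not apply and a plain Spark join (with its shuffle cost) is the fallback.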

spark on cassandra

2014-08-08 Thread Prem Yadav
Hi, are there any cluster-specific prerequisites for running Spark on Cassandra? I created two DCs, DC1 and DC2. DC1 had two Cassandra nodes with vnodes. I created two nodes in DC2 with Murmur3 partitioning and set num_tokens: 1. I enabled Hadoop and Spark and started DSE. I can verify that Hadoop…