Hi,
Has anyone tried accessing Cassandra data using the Spark shell? How do you
do it? Can you use HiveContext for Cassandra data? I'm using the community
version of Cassandra 3.0.
Thanks,
LCassa
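For reference, a minimal sketch of how this usually looks with the DataStax
spark-cassandra-connector (the connector version, host, and keyspace/table
names below are placeholders, not something verified against Cassandra 3.0):

// Launched as:
// spark-shell --packages com.datastax.spark:spark-cassandra-connector_2.10:1.6.0 \
//   --conf spark.cassandra.connection.host=127.0.0.1
import com.datastax.spark.connector._

// Read a table as an RDD of CassandraRow
val rdd = sc.cassandraTable("my_keyspace", "my_table")
rdd.take(10).foreach(println)

// The connector also exposes a DataFrame source, so a plain SQLContext
// (no HiveContext needed) works:
val df = sqlContext.read
  .format("org.apache.spark.sql.cassandra")
  .options(Map("keyspace" -> "my_keyspace", "table" -> "my_table"))
  .load()
df.show()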
> The Instaclustr support page will probably
> give you a good steer to get started even if you’re not using Instaclustr:
> https://support.instaclustr.com/hc/en-us/articles/213097877-Getting-Started-with-Instaclustr-Spark-Cassandra-
>
>
>
> Cheers
>
> Ben
>
>
>
> On Tue, 10 May
> ...connector with 1.4.x). The main trick is in
> lining up all the versions and building an appropriate connector jar.
>
>
>
> Cheers
>
> Ben
>
>
>
> On Wed, 18 May 2016 at 15:40 Cassa L wrote:
>
> Hi,
>
> I followed the instructions to run the Spark shell with Spark 1.6. It
Hi,
I'm using Spark 1.5.1. I am reading data from Kafka into Spark
and writing it into Cassandra after processing it. The Spark job starts
fine and runs well for some time until I start getting the errors below.
Once these errors appear, the job starts to lag behind and I see that the
job has scheduling delays.
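For context, a sketch of the kind of pipeline described above, assuming the
Kafka direct stream API and the DataStax connector's saveToCassandra (broker,
topic, keyspace, and the Event schema are placeholders; real parsing is
elided):

import kafka.serializer.StringDecoder
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils
import com.datastax.spark.connector._
import com.datastax.spark.connector.streaming._

case class Event(key: String, payload: String)

val conf = new SparkConf().setAppName("kafka-to-cassandra")
  .set("spark.cassandra.connection.host", "127.0.0.1")
val ssc = new StreamingContext(conf, Seconds(10))

val stream = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
  ssc, Map("metadata.broker.list" -> "broker1:9092"), Set("my_topic"))

stream
  .map { case (key, value) => Event(key, value) }  // parse the message here
  .saveToCassandra("my_keyspace", "events", SomeColumns("key", "payload"))

ssc.start()
ssc.awaitTermination()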
Hi,
I would appreciate any clue on this. It has become a bottleneck for our
spark job.
On Mon, Jun 13, 2016 at 2:56 PM, Cassa L wrote:
> Hi,
>
> I'm using Spark 1.5.1. I am reading data from Kafka into Spark and
> writing it into Cassandra after processing it. The Spark job
>> ...an option,
>> probably worth a try.
>>
>> Cheers
>> Ben
>>
>> On Wed, 15 Jun 2016 at 08:48 Cassa L wrote:
>>
>>> Hi,
>>> I would appreciate any clue on this. It has become a bottleneck for our
>>> spark job.
>>>
>>>
> https://medium.com/@jaceklaskowski/
> Mastering Apache Spark http://bit.ly/mastering-apache-spark
> Follow me at https://twitter.com/jaceklaskowski
>
>
> On Mon, Jun 13, 2016 at 11:56 PM, Cassa L wrote:
> > Hi,
> >
> > I'm using Spark 1.5.1. I am
>> What does your "I am reading data from Kafka into Spark and writing it
>> into Cassandra after processing it." pipeline look like?
>>
>> Pozdrawiam,
>> Jacek Laskowski
>>
>> https://medium.com/@jaceklaskowski/
>> Mastering Apache Spark http://bit.ly/mastering-apache-spark
Hi,
I'm trying to work with the sliding window example given by Databricks:
https://databricks.gitbooks.io/databricks-spark-reference-applications/content/logs_analyzer/chapter1/windows.html
It works fine as expected.
My question is: how do I determine when the last phase of the slider has
been reached? I w
Any thoughts on this? I want to know when the window duration is complete,
not just each sliding window. Is there a way I can catch the end of the
window duration, or do I need to keep track of it myself, and if so, how?
LCassa
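One approach (a sketch, not taken from the Databricks example itself): use
the two-argument form of foreachRDD, which hands you the batch Time, and
test it against the window duration. The accessLogsDStream name and the
durations below are assumptions:

import org.apache.spark.streaming.{Seconds, Time}

val windowDuration = Seconds(60)
val slideDuration = Seconds(10)

val windowed = accessLogsDStream.window(windowDuration, slideDuration)

windowed.foreachRDD { (rdd, time: Time) =>
  // this fires on every slide; only act when a full window boundary lines up
  if (time.isMultipleOf(windowDuration)) {
    println(s"Full window ended at $time with ${rdd.count()} records")
  }
}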
On Mon, Jan 11, 2016 at 3:09 PM, Cassa L wrote:
> Hi,
> I'm trying to work w
Hi,
Has anyone used a rule engine with Spark Streaming? I have a case where
data is streaming from Kafka and I need to apply some rules to it (instead
of hard-coding them in the code).
Thanks,
LCassa
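If a full engine like Drools is more than you need, one lightweight pattern
is to represent the rules as data, broadcast them, and apply them per batch.
A sketch, with the record shape (Map[String, String]), the rule format, and
the kafkaStream/ssc names all assumed:

case class Rule(field: String, expected: String)

// Loaded from an external store (file, DB, ...) rather than hard-coded
val rules = Seq(Rule("type", "purchase"), Rule("region", "us-west"))
val rulesBc = ssc.sparkContext.broadcast(rules)

// kafkaStream: DStream[Map[String, String]] built upstream
val matched = kafkaStream.filter { record =>
  rulesBc.value.forall(r => record.get(r.field) == Some(r.expected))
}
matched.print()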
Hi,
Does Spark support protobuf 3.0? I used protobuf 2.5 with Spark 1.4
built for HDP 2.3. Given that protobuf has compatibility issues, I want to
know whether Spark supports protobuf 3.0.
LCassa
nough…
>
> -adrian
>
> From: Stefano Baghino
> Date: Wednesday, November 4, 2015 at 10:15 AM
> To: Cassa L
> Cc: user
> Subject: Re: Rule Engine for Spark
>
> Hi LCassa,
> unfortunately I don't have actual experience on this matter, however for a
> similar use case
OK, let me try it.
Thanks,
LCassa
On Wed, Nov 4, 2015 at 4:44 PM, Cheng, Hao wrote:
> Or try Streaming SQL? It is a simple layer on top of Spark
> Streaming. :)
>
>
>
> https://github.com/Intel-bigdata/spark-streamingsql
>
>
>
>
>
> *From:* Cassa
Hi,
I have a data stream (JavaDStream) in the following format:
timestamp=second1, map(key1=value1, key2=value2)
timestamp=second2, map(key1=value3, key2=value4)
timestamp=second2, map(key1=value1, key2=value5)
I want to group the data by 'timestamp' first and then filter each RDD for
key1=value1 or key1
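A sketch of that grouping-then-filtering step, assuming the stream has
already been parsed into (timestamp, Map(key -> value)) pairs (the parsing
itself is elided here):

import org.apache.spark.streaming.dstream.DStream

// parsed upstream from the "timestamp=..., map(...)" lines shown above
val events: DStream[(Long, Map[String, String])] = ???

// groups all maps sharing a timestamp, within each batch
val grouped = events.groupByKey()

val matching = grouped.mapValues { maps =>
  maps.filter(m => m.get("key1") == Some("value1"))
}
matching.print()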
Hi,
I am reading data from Kafka into Spark. It runs fine for some time but
then hangs forever with the following output. I don't see any errors in the
logs. How do I debug this?
2015-12-01 06:04:30,697 [dag-scheduler-event-loop] INFO (Logging.scala:59)
- Adding task set 19.0 with 4 tasks
2015-12-01 06:0
Hi,
I am storing messages in Kafka using protobuf and reading them into Spark.
I upgraded the protobuf version from 2.4.1 to 2.5.0. I got
"java.lang.UnsupportedOperationException" for older messages. However, even
for new messages I get the same error. Spark does convert them, though; I
see my messages.
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:112)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
On Mon, Aug 24, 2015 at 6:53 PM, Ted Yu wrote:
> Can you show the complete stack trace ?
>
> Which Spark / Kafka release are you using ?
>
> Thanks
>
> On Mon, Aug 24, 2015 at 4:58 PM, Cassa L wrote:
>
>> Hi,
>> I am storin
Submit.scala:170)
> at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:193)
> at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:112)
> at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
>
>
> On Mon, Aug 24, 2015 at 6:53 PM, Ted Yu wrote:
>
> Can you show
Do you think this binary could be the issue? Do I need to build Spark from
source?
On Tue, Aug 25, 2015 at 1:06 PM, Cassa L wrote:
> I downloaded the below binary version of Spark.
> spark-1.4.1-bin-cdh4
>
> On Tue, Aug 25, 2015 at 1:03 PM, java8964 wrote:
>
>> Did your
Hi,
I was going through the SSL setup for Kafka:
https://cwiki.apache.org/confluence/display/KAFKA/Deploying+SSL+for+Kafka
However, I am also using Spark-Kafka streaming to read data from Kafka. Is
there a way to enable SSL for the Spark streaming API, or is it not possible
at all?
Thanks,
LCassa
> ...so it's not supported.
>
> Thanks,
> Harsha
>
>
> On August 28, 2015 at 11:00:30 AM, Cassa L (lcas...@gmail.com) wrote:
>
> Hi,
> I was going through SSL setup of Kafka.
> https://cwiki.apache.org/confluence/display/KAFKA/Deploying+SSL+for+Kafka
> However, I am also using Spa
a/src/main/scala/org/apache/spark/streaming/kafka/KafkaRDD.scala
>
> On Fri, Aug 28, 2015 at 11:32 AM, Cassa L wrote:
>
> > Hi, I am using the below Spark jars with the Direct Stream API.
> > spark-streaming-kafka_2.10
> >
> > When I look at its pom.xml, the Kafka libraries
Hi,
If I have an RDD that counts something, e.g.:

JavaPairDStream<String, Integer> successMsgCounts = successMsgs
    .flatMap(buffer -> Arrays.asList(buffer.getType()))
    .mapToPair(txnType -> new Tuple2<>("Success " + txnType, 1))
    .reduceByKey((count1, count2) -> count1 + count2);
> You may have seen this:
>
> https://www.paypal-engineering.com/2014/02/13/hello-newman-a-rest-client-for-scala/
>
> On Fri, Aug 28, 2015 at 9:35 PM, Cassa L wrote:
>
>> Hi,
>> If I have RDD that counts something e.g.:
>>
>> JavaPairDStream successMsgCo
Hi,
I am using Spark Streaming to read tweets from Twitter. It works fine. Now
I want to be able to fetch older tweets in my Spark code. Twitter4j has an
API to set the date:
http://twitter4j.org/oldjavadocs/4.0.4/twitter4j/Query.html
Is there a way to set this using TwitterUtils, or do I need to write
dif
Hi,
I am using Spark 1.6. I wrote a custom receiver to read from a WebSocket.
But when I start my Spark job, it connects to the WebSocket but doesn't get
any messages. The same code, written as a separate Scala class, works and
prints messages from the WebSocket. Is anything missing in my Spark code?
Thanks,
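For comparison, a bare-bones receiver skeleton (the WebSocket client itself
is elided). The usual culprits are blocking inside onStart instead of
receiving on a separate thread, or missing an output operation plus
ssc.start():

import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming.receiver.Receiver

class WebSocketReceiver(url: String)
  extends Receiver[String](StorageLevel.MEMORY_AND_DISK_2) {

  def onStart(): Unit = {
    // onStart must not block: receive on a separate thread
    new Thread("WebSocket Receiver") {
      override def run(): Unit = receive()
    }.start()
  }

  def onStop(): Unit = { /* close the WebSocket connection here */ }

  private def receive(): Unit = {
    // connect to `url` with your WebSocket client and, for each message,
    // hand it to Spark with store(message)
  }
}

// Usage:
// val stream = ssc.receiverStream(new WebSocketReceiver("ws://host:port/feed"))
// stream.print()  // an output operation is required before ssc.start()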
Hi,
So far, I have run Spark jobs directly using spark-submit options. I have a
use case to use the Spark Job Server to run jobs. I wanted to find out the
PROs and CONs of using this job server. If anyone can share them, that would
be great.
My jobs usually connect to multiple data sources like Kafka, custom
r
Hi,
I am trying to use Spark 2.0 to read from an Oracle (12.1) table. My table
has JSON data. I am getting the below exception in my code. Any clue?
java.sql.SQLException: Unsupported type -101
at
org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.org$apache$spark$sql$executio
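If this is the usual cause, type -101 is Oracle's TIMESTAMP WITH TIME ZONE,
which Spark's JDBC dialect doesn't map. One workaround sketch is to push a
CAST down to Oracle in a dbtable subquery (connection details and column
names are placeholders):

val df = spark.read
  .format("jdbc")
  .option("url", "jdbc:oracle:thin:@//dbhost:1521/service")
  .option("driver", "oracle.jdbc.OracleDriver")
  // cast the offending column so Spark only sees types it understands
  .option("dbtable",
    "(SELECT id, CAST(created_at AS TIMESTAMP) AS created_at, json_col FROM my_table)")
  .option("user", "scott")
  .option("password", "tiger")
  .load()

df.printSchema()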
Hi,
I am trying to read data into Spark from Oracle using the ojdbc7 driver.
The data is in JSON format. I am getting the below error. Any idea how to
resolve it?
java.sql.SQLException: Unsupported type -101
at
org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.org$apache$spark$sql$execution$dat
Hi,
I want to use Spark to parallelize some update operations on an Oracle
database. However, I could not find a way to run UPDATE statements (UPDATE
Employee WHERE ???), use transactions, or call stored procedures from
Spark/JDBC.
Has anyone had this use case before, and how did you solve it?
Thanks,
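One pattern that works, since Spark's JDBC data source only reads and
appends: open plain JDBC connections inside foreachPartition and issue the
statements yourself. A sketch with a placeholder URL, credentials, table,
and DataFrame (employeeDF is assumed); stored procedures can be called the
same way via conn.prepareCall:

import java.sql.DriverManager

// employeeDF: DataFrame of (new_salary, emp_id) rows to apply
employeeDF.rdd.foreachPartition { rows =>
  val conn = DriverManager.getConnection(
    "jdbc:oracle:thin:@//dbhost:1521/service", "scott", "tiger")
  val stmt = conn.prepareStatement(
    "UPDATE employee SET salary = ? WHERE emp_id = ?")
  try {
    rows.foreach { row =>
      stmt.setDouble(1, row.getDouble(0))
      stmt.setLong(2, row.getLong(1))
      stmt.executeUpdate()
    }
  } finally {
    stmt.close()
    conn.close()
  }
}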
> ...issues in the latest
> release.
>
> Thanks
>
> Xiao
>
>
> On Wed, 19 Jul 2017 at 11:10 PM Cassa L wrote:
>
>> Hi,
>> I am trying to use Spark to read from Oracle (12.1) table using Spark
>> 2.0. My table has JSON data. I am getting below exception in my co
"UnitPrice" : 19.95,
"UPCCode" : 13131092899 },
"Quantity" : 9.0 },
{ "ItemNumber" : 2,
"Part" : { "Descrip
Hi,
This is the first time I am trying structured streaming with Kafka. I have
simple code to read from Kafka and display it on the console. The message is
in JSON format. However, when I run my code, nothing after the below line
gets printed.
17/07/21 13:43:41 INFO AppInfoParser: Kafka commitId : a7a17cdec9eaa
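For what it's worth, a minimal sketch that should print to the console
(broker and topic are placeholders). One common reason nothing appears is
that startingOffsets defaults to "latest", so an idle topic prints nothing:

val df = spark.readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "broker1:9092")
  .option("subscribe", "my_topic")
  .option("startingOffsets", "earliest")  // default "latest" shows only new messages
  .load()

val query = df.selectExpr("CAST(value AS STRING) AS json")
  .writeStream
  .format("console")
  .option("truncate", "false")
  .start()

query.awaitTermination()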
Hi,
Another related question: has anyone tried transactions using Oracle JDBC
and Spark? How do you do it, given that the code will be distributed across
workers? Do I combine certain queries to make sure they don't get
distributed?
Regards,
Leena
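A sketch of how transactions are usually scoped here: everything inside one
foreachPartition call runs on a single executor, so a per-partition
connection with manual commit/rollback keeps related statements together
(dataDF, the URL, and the credentials are placeholders):

import java.sql.DriverManager

dataDF.rdd.foreachPartition { rows =>
  val conn = DriverManager.getConnection(
    "jdbc:oracle:thin:@//dbhost:1521/service", "scott", "tiger")
  conn.setAutoCommit(false)
  try {
    rows.foreach { row =>
      // execute the related statements for this row on conn here
    }
    conn.commit()
  } catch {
    case e: Exception =>
      conn.rollback()
      throw e
  } finally {
    conn.close()
  }
}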
On Fri, Jul 21, 2017 at 1:50 PM, Cassa L
No, I don't use YARN. This is the standalone Spark that comes with the
DataStax Enterprise version of Cassandra.
On Thu, Oct 26, 2017 at 11:22 PM, Jörn Franke wrote:
> Do you use YARN? Then you need to configure the queues with the right
> scheduler and method.
>
> On 27. Oct 2017, at 08