Re: Spark Streaming and database access (e.g. MySQL)

2014-09-10 Thread Mayur Rustagi
I think she is checking for blanks? But if the RDD is blank then nothing will happen anyway: foreachPartition does no work on an empty RDD, so no DB connections are opened.

Re: Spark Streaming and database access (e.g. MySQL)

2014-09-08 Thread Tobias Pfeiffer
Hi,

On Mon, Sep 8, 2014 at 4:39 PM, Sean Owen wrote:
> if (rdd.take(1).size == 1) {
>   rdd foreachPartition { iterator =>

I was wondering: since take() is an action, isn't the RDD computed twice (once for the take(1), and once again during the foreachPartition)?
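One way to avoid computing the RDD twice would be to persist it around the emptiness check. This is only a sketch; the cache()/unpersist() calls are my own assumption, not something proposed in the thread:

    stream foreachRDD { rdd =>
      rdd.cache()                        // computed partitions are reused by take(1) and foreachPartition
      if (rdd.take(1).size == 1) {
        rdd foreachPartition { iterator =>
          // ... open a connection and write the partition to the database ...
        }
      }
      rdd.unpersist()                    // drop the cached blocks once the batch is handled
    }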

Re: Spark Streaming and database access (e.g. MySQL)

2014-09-08 Thread Sean Owen
That should be OK, since the iterator is definitely consumed, and therefore the connection is actually done with, by the end of the 'foreach'. You might put the close in a finally block, though, so the connection is released even if a write throws.
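A minimal sketch of that suggestion, where initDbConnection() and writeToDb() are placeholders standing in for the real code:

    rdd foreachPartition { iterator =>
      val connection = initDbConnection()   // opened on the worker, once per partition
      try {
        iterator foreach { record =>
          writeToDb(connection, record)     // hypothetical write helper
        }
      } finally {
        connection.close()                  // runs even if a write throws, so the connection is not leaked
      }
    }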

Re: Spark Streaming and database access (e.g. MySQL)

2014-09-07 Thread Soumitra Kumar
I have the following code:

    stream foreachRDD { rdd =>
      if (rdd.take(1).size == 1) {
        rdd foreachPartition { iterator =>
          initDbConnection()
          iterator foreach {
            // write to db

Re: Spark Streaming and database access (e.g. MySQL)

2014-09-07 Thread Sean Owen
... I'd call out that last bit as actually tricky: "close off the driver". See this message for the right-est way to do that, along with the right way to open DB connections remotely instead of trying to serialize them: http://mail-archives.apache.org/mod_mbox/spark-user/201407.mbox/%3CCAPH-c_O9kQ
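I have not copied the linked message here, but one commonly cited approach along those lines is to create the connection lazily inside a singleton object, so it comes into existence on each worker JVM rather than being serialized from the driver. ConnectionHolder and the JDBC URL below are invented names, purely for illustration:

    import java.sql.{Connection, DriverManager}

    // Hypothetical holder; the lazy val is initialized the first time an
    // executor JVM touches it, never on the driver and never serialized.
    object ConnectionHolder {
      lazy val connection: Connection =
        DriverManager.getConnection("jdbc:mysql://host/db", "user", "password")
    }

    stream foreachRDD { rdd =>
      rdd foreachPartition { iterator =>
        val conn = ConnectionHolder.connection
        iterator foreach { record =>
          // ... insert/update using conn ...
        }
      }
    }

In real code a connection pool would be preferable to a single shared Connection, since tasks for different partitions can run concurrently in the same executor.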

Re: Spark Streaming and database access (e.g. MySQL)

2014-09-07 Thread Mayur Rustagi
The standard pattern is to initialize the MySQL JDBC driver in your mapPartitions call, update the database, and then close off the driver. A couple of gotchas:
1. A new driver is initialized for every one of your partitions.
2. If the effect (inserts & updates) is not idempotent, then if your server crashes, Spark will replay the updates.
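On gotcha 2, one sketch of making the writes survive a replay is to issue them as upserts keyed on a unique id, so re-applying the same batch overwrites rows instead of duplicating them. The table, columns, and initDbConnection() are invented here, and the RDD is assumed to hold (String, Long) pairs:

    rdd foreachPartition { iterator =>
      val connection = initDbConnection()          // placeholder, as in the earlier messages
      val stmt = connection.prepareStatement(
        "INSERT INTO results (id, total) VALUES (?, ?) " +
        "ON DUPLICATE KEY UPDATE total = VALUES(total)")
      try {
        iterator foreach { case (id, total) =>
          stmt.setString(1, id)
          stmt.setLong(2, total)
          stmt.executeUpdate()                     // replayed batches overwrite rather than duplicate
        }
      } finally {
        stmt.close()
        connection.close()
      }
    }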

Spark Streaming and database access (e.g. MySQL)

2014-09-06 Thread jchen
Hi,

Has someone tried using Spark Streaming with MySQL (or any other database/data store)? I can write to MySQL at the beginning of the driver application. However, when I try to write the result of every streaming processing window to MySQL, it fails with the following error: org.apache.sp
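The error is cut off in the archive, but one common cause of exactly this symptom (driver-side writes work, per-batch writes fail) is a connection created on the driver being captured by the closure and shipped to the workers; java.sql.Connection is not serializable. A sketch of that anti-pattern, with invented connection details and a hypothetical insertRecord helper:

    import java.sql.DriverManager

    // ANTI-PATTERN: the connection is created once, on the driver ...
    val connection = DriverManager.getConnection("jdbc:mysql://host/db", "user", "password")

    stream foreachRDD { rdd =>
      rdd foreach { record =>
        // ... but this closure runs on the workers, so 'connection' would have
        // to be serialized and shipped along with it.
        insertRecord(connection, record)
      }
    }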