I have the following code:

    stream.foreachRDD { rdd =>
      if (rdd.take(1).length == 1) {
        rdd.foreachPartition { iterator =>
          initDbConnection()
          iterator.foreach { record =>
            // write record to db
          }
          closeDbConnection()
        }
      }
    }
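For reference, here is a more concrete sketch of that per-partition pattern using plain JDBC. The connection URL, credentials, table name, and the assumption that the stream carries (String, String) pairs are all made up for illustration; the point is only that the connection and statement are created inside foreachPartition, on the worker, so nothing non-serializable crosses from the driver:

```scala
import java.sql.DriverManager

stream.foreachRDD { rdd =>
  rdd.foreachPartition { iterator =>
    // One connection per partition, opened on the worker.
    val conn = DriverManager.getConnection(
      "jdbc:mysql://db-host:3306/mydb", "user", "pass")
    try {
      val stmt = conn.prepareStatement(
        "INSERT INTO events (k, v) VALUES (?, ?)")
      iterator.foreach { case (k, v) =>
        stmt.setString(1, k)
        stmt.setString(2, v)
        stmt.executeUpdate()
      }
    } finally {
      conn.close()  // always release the connection, even on failure
    }
  }
}
```

Batching the inserts (addBatch/executeBatch) would cut round trips further, but the shape above is the essential part.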
On Sun, Sep 7, 2014 at 1:26 PM, Sean Owen <so...@cloudera.com> wrote:
> ... I'd call out that last bit as actually tricky: "close off the driver"
>
> See this message for the right-est way to do that, along with the
> right way to open DB connections remotely instead of trying to
> serialize them:
>
> http://mail-archives.apache.org/mod_mbox/spark-user/201407.mbox/%3CCAPH-c_O9kQO6yJ4khXUVdO=+D4vj=JfG2tP9eqn5RPko=dr...@mail.gmail.com%3E
>
> On Sun, Sep 7, 2014 at 4:19 PM, Mayur Rustagi <mayur.rust...@gmail.com> wrote:
> > The standard pattern is to initialize the MySQL JDBC driver in your
> > mapPartitions call, update the database, and then close off the driver.
> > A couple of gotchas:
> > 1. A new driver is initialized for each of your partitions.
> > 2. If the effect (inserts & updates) is not idempotent and your server
> > crashes, Spark will replay the updates to MySQL and may cause data corruption.
> >
> > Regards
> > Mayur
> >
> > Mayur Rustagi
> > Ph: +1 (760) 203 3257
> > http://www.sigmoidanalytics.com
> > @mayur_rustagi
> >
> > On Sun, Sep 7, 2014 at 11:54 AM, jchen <jc...@pivotal.io> wrote:
> >> Hi,
> >>
> >> Has someone tried using Spark Streaming with MySQL (or any other
> >> database/data store)? I can write to MySQL at the beginning of the
> >> driver application. However, when I try to write the result of every
> >> streaming processing window to MySQL, it fails with the following error:
> >>
> >> org.apache.spark.SparkException: Job aborted due to stage failure: Task
> >> not serializable: java.io.NotSerializableException:
> >> com.mysql.jdbc.JDBC4PreparedStatement
> >>
> >> I think it is because the statement object would have to be serializable
> >> in order to be executed on the worker nodes. Has someone tried similar
> >> cases? Example code would be very helpful. My intention is to execute
> >> INSERT/UPDATE/DELETE/SELECT statements for each sliding window.
> >>
> >> Thanks,
> >> JC
> >>
> >> --
> >> View this message in context:
> >> http://apache-spark-user-list.1001560.n3.nabble.com/Spark-Streaming-and-database-access-e-g-MySQL-tp13644.html
> >> Sent from the Apache Spark User List mailing list archive at Nabble.com.
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
> >> For additional commands, e-mail: user-h...@spark.apache.org
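On Mayur's second gotcha: one common way to make the per-window writes idempotent, so that a replayed batch overwrites rows instead of duplicating them, is a MySQL upsert keyed on a unique column. The table `events`, its `event_id` key, and the (Long, String) record shape below are hypothetical; this is a sketch, not the poster's actual schema:

```scala
import java.sql.DriverManager

rdd.foreachPartition { iterator =>
  val conn = DriverManager.getConnection(
    "jdbc:mysql://db-host:3306/mydb", "user", "pass")
  try {
    // Upsert keyed on a UNIQUE event_id: if Spark replays this batch after
    // a failure, the same rows are rewritten rather than inserted twice.
    val stmt = conn.prepareStatement(
      "INSERT INTO events (event_id, payload) VALUES (?, ?) " +
      "ON DUPLICATE KEY UPDATE payload = VALUES(payload)")
    iterator.foreach { case (id, payload) =>
      stmt.setLong(1, id)
      stmt.setString(2, payload)
      stmt.executeUpdate()
    }
  } finally {
    conn.close()
  }
}
```

This only helps if the records carry a stable unique key; for pure counters or increments you would instead need to track which batches have already been applied.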