Could you share whether you've been able to run the streaming job over a
long period of time? I did something very similar and the executors seem to
run out of memory (how quickly depends on how much data/memory they get).
Just curious what your experience is.
On Fri, Sep 26, 2014 at 12:31 AM, maddenpj wrote:
Yup it's all in the gist:
https://gist.github.com/maddenpj/5032c76aeb330371a6e6
Lines 6-9 deal with setting up the driver specifically. This sets the driver
up once per partition, so the connection pool is kept around and reused
across records instead of being created for every record.
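For anyone finding this thread later, a minimal sketch of that per-partition
pattern looks something like the following. This is not the gist verbatim:
the DStream, JDBC URL, table, and credentials are placeholders, and the gist
may use an actual connection-pool library rather than raw DriverManager.

import java.sql.DriverManager

// stateDstream is a hypothetical DStream[(String, Long)] of running counts.
stateDstream.foreachRDD { rdd =>
  rdd.foreachPartition { partition =>
    // One connection per partition, opened on the executor, instead of
    // one per record (or one on the driver, which wouldn't serialize).
    val conn = DriverManager.getConnection(
      "jdbc:mysql://localhost:3306/mydb", "user", "password")
    val stmt = conn.prepareStatement(
      "INSERT INTO counts (k, v) VALUES (?, ?) ON DUPLICATE KEY UPDATE v = ?")
    try {
      partition.foreach { case (k, v) =>
        stmt.setString(1, k)
        stmt.setLong(2, v)
        stmt.setLong(3, v)
        stmt.executeUpdate()
      }
    } finally {
      stmt.close()
      conn.close()
    }
  }
}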
Thanks for the update. I'm interested in writing the results to MySQL as
well; can you shed some light or share a code sample on how you set up the
driver/connection pool/etc.?
On Thu, Sep 25, 2014 at 4:00 PM, maddenpj wrote:
> Update for posterity, so once again I solved the problem shortly after
> posting to the mailing list.
Update for posterity, so once again I solved the problem shortly after
posting to the mailing list. updateStateByKey uses the default partitioner,
which in my case seemed to be set to a single partition. Changing my call
from .updateStateByKey[Long](updateFn) to
.updateStateByKey[Long](updateFn, numPartitions) fixed it.
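For reference, the before/after looks roughly like this, where pairs is the
keyed DStream being aggregated; updateFn and the partition count here are
illustrative, not the actual job's values:

// updateFn folds each batch's new values into the running total.
val updateFn = (values: Seq[Long], state: Option[Long]) =>
  Some(values.sum + state.getOrElse(0L))

// Before: the default partitioner, which here resolved to one partition.
val state = pairs.updateStateByKey[Long](updateFn)

// After: spread the state RDD over an explicit number of partitions.
val numPartitions = 32 // illustrative; size this to your cluster
val stateFixed = pairs.updateStateByKey[Long](updateFn, numPartitions)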