And probably the original source code: https://gist.github.com/koen-dejonghe/39c10357607c698c0b04
On Tue, Sep 22, 2015 at 10:37 AM, Petr Novak <oss.mli...@gmail.com> wrote:

> To complete the design pattern:
>
> http://stackoverflow.com/questions/30450763/spark-streaming-and-connection-pool-implementation
>
> Petr
>
> On Mon, Sep 21, 2015 at 10:02 PM, Romi Kuntsman <r...@totango.com> wrote:
>
>> Cody, that's a great reference!
>> As shown there, the best way to connect to an external database from the
>> workers is to create a connection pool on (each) worker.
>> The driver may pass the connection string via broadcast, but not the
>> connection object itself, and not the SparkContext.
>>
>> On Mon, Sep 21, 2015 at 5:31 PM Cody Koeninger <c...@koeninger.org> wrote:
>>
>>> That isn't accurate; I think you're confused about foreach.
>>>
>>> Look at
>>>
>>> http://spark.apache.org/docs/latest/streaming-programming-guide.html#design-patterns-for-using-foreachrdd
>>>
>>> On Mon, Sep 21, 2015 at 7:36 AM, Romi Kuntsman <r...@totango.com> wrote:
>>>
>>>> foreach is something that runs on the driver, not the workers.
>>>>
>>>> If you want to perform some function on each record from Cassandra, you
>>>> need to do cassandraRdd.map(func), which will run distributed on the
>>>> Spark workers.
>>>>
>>>> *Romi Kuntsman*, *Big Data Engineer*
>>>> http://www.totango.com
>>>>
>>>> On Mon, Sep 21, 2015 at 3:29 PM, Priya Ch <learnings.chitt...@gmail.com> wrote:
>>>>
>>>>> Yes, but I need to read from the Cassandra DB within a Spark
>>>>> transformation, something like:
>>>>>
>>>>> dstream.foreachRDD {
>>>>>   rdd => rdd.foreach {
>>>>>     message =>
>>>>>       sc.cassandraTable()
>>>>>       .
>>>>>       .
>>>>>       .
>>>>>   }
>>>>> }
>>>>>
>>>>> Since rdd.foreach gets executed on the workers, how can I make the
>>>>> SparkContext available on the workers?
>>>>>
>>>>> Regards,
>>>>> Padma Ch
>>>>>
>>>>> On Mon, Sep 21, 2015 at 5:10 PM, Ted Yu <yuzhih...@gmail.com> wrote:
>>>>>
>>>>>> You can use a broadcast variable for passing connection information.
>>>>>>
>>>>>> Cheers
>>>>>>
>>>>>> On Sep 21, 2015, at 4:27 AM, Priya Ch <learnings.chitt...@gmail.com> wrote:
>>>>>>
>>>>>> Can I use this SparkContext on executors?
>>>>>> In my application, I have a scenario of reading certain records from
>>>>>> the DB into an RDD. Hence I need the SparkContext to read from the DB
>>>>>> (Cassandra in our case).
>>>>>>
>>>>>> If the SparkContext can't be sent to executors, what is the
>>>>>> workaround for this?
>>>>>>
>>>>>> On Mon, Sep 21, 2015 at 3:06 PM, Petr Novak <oss.mli...@gmail.com> wrote:
>>>>>>
>>>>>>> Add @transient?
>>>>>>>
>>>>>>> On Mon, Sep 21, 2015 at 11:27 AM, Priya Ch <learnings.chitt...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Hello All,
>>>>>>>>
>>>>>>>> How can I pass the SparkContext as a parameter to a method in an
>>>>>>>> object? Passing the SparkContext is giving me a TaskNotSerializable
>>>>>>>> exception.
>>>>>>>>
>>>>>>>> How can I achieve this?
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Padma Ch
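The pattern the thread converges on (broadcast only the connection string, build a lazily created connection pool on each worker, and write records per partition rather than serializing a connection or the SparkContext) can be sketched without a cluster. This is a sketch of the pattern, not the spark-cassandra-connector API: `Connection` and `ConnectionPool` are hypothetical stand-ins for a real driver and pool, and the plain `foreach` over `partitions` stands in for `rdd.foreachPartition`, as described in the streaming guide's foreachRDD design patterns.

```scala
// A stand-in for a real database connection (e.g. a Cassandra session).
class Connection(val url: String) {
  def send(record: String): Unit = () // no-op for the sketch
  def close(): Unit = ()
}

// One pool per JVM (per executor in a real cluster); connections are
// created lazily and reused, never serialized from the driver.
object ConnectionPool {
  private var created = 0
  private val pool = scala.collection.mutable.Queue.empty[Connection]

  def connectionsCreated: Int = created

  def borrow(url: String): Connection = synchronized {
    if (pool.isEmpty) { created += 1; new Connection(url) }
    else pool.dequeue()
  }

  def giveBack(c: Connection): Unit = synchronized { pool.enqueue(c) }
}

object Demo {
  def main(args: Array[String]): Unit = {
    // The driver would broadcast only this string, never a live connection.
    val connectionString = "cassandra://127.0.0.1:9042" // hypothetical

    // Three fake partitions standing in for an RDD's partitions.
    val partitions: Seq[Seq[String]] = Seq(Seq("a", "b"), Seq("c"), Seq("d", "e"))

    partitions.foreach { partition =>     // in Spark: rdd.foreachPartition { partition =>
      val conn = ConnectionPool.borrow(connectionString)
      partition.foreach(conn.send)        //   write the whole partition on one connection
      ConnectionPool.giveBack(conn)       //   return it to the pool for reuse
    }

    // Run sequentially here, a single connection serves all three partitions.
    println(ConnectionPool.connectionsCreated) // prints 1
  }
}
```

The point of the sketch is the shape, not the pool implementation: connection setup happens inside the partition-level closure on the worker, so nothing non-serializable is ever captured from the driver.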