To complete the design pattern:
http://stackoverflow.com/questions/30450763/spark-streaming-and-connection-pool-implementation
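To make the pooled-connection idea concrete, here is a minimal sketch assuming a lazily initialized singleton, so each executor JVM builds its pool once and each partition borrows from it. The names `ConnectionPool`, the stub `Connection` class, and the pool size of 4 are illustrative, not taken from the linked post.

```scala
// Illustrative connection-pool singleton: one pool per executor JVM,
// created lazily on first use rather than shipped from the driver.
object ConnectionPool {
  // Stand-in for a real pooled client (e.g. a Cassandra session).
  class Connection {
    def execute(stmt: String): Unit = println(s"executing: $stmt")
  }

  // A `lazy val` is initialized at most once per JVM, i.e. once per executor.
  lazy val pool: java.util.concurrent.LinkedBlockingQueue[Connection] = {
    val q = new java.util.concurrent.LinkedBlockingQueue[Connection]()
    (1 to 4).foreach(_ => q.put(new Connection))
    q
  }

  def borrow(): Connection = pool.take()
  def giveBack(c: Connection): Unit = pool.put(c)
}

// Inside a streaming job this would be used per partition, e.g.:
// dstream.foreachRDD { rdd =>
//   rdd.foreachPartition { records =>
//     val conn = ConnectionPool.borrow()
//     try records.foreach(r => conn.execute(s"INSERT ... $r"))
//     finally ConnectionPool.giveBack(conn)
//   }
// }
```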

Petr

On Mon, Sep 21, 2015 at 10:02 PM, Romi Kuntsman <r...@totango.com> wrote:

> Cody, that's a great reference!
> As shown there - the best way to connect to an external database from the
> workers is to create a connection pool on (each) worker.
> The driver can pass, via broadcast, the connection string, but not the
> connection object itself and not the spark context.
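A quick way to see why the connection object itself cannot be shipped to the workers: a live connection holds sockets and threads and is not Java-serializable, while plain connection information is. The class names below are illustrative stand-ins, not a real driver API.

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

// A live connection typically wraps sockets/threads; it is NOT Serializable,
// so Spark cannot ship it from the driver to the workers inside a closure.
class LiveConnection(host: String)

// Plain connection *information* is safe to broadcast or close over.
case class ConnectionInfo(host: String, port: Int, keyspace: String)
  extends Serializable

// Same mechanism Spark uses when it serializes task closures.
def serialize(obj: AnyRef): Array[Byte] = {
  val bytes = new ByteArrayOutputStream()
  val out = new ObjectOutputStream(bytes)
  out.writeObject(obj)
  out.close()
  bytes.toByteArray
}
```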
>
> On Mon, Sep 21, 2015 at 5:31 PM Cody Koeninger <c...@koeninger.org> wrote:
>
>> That isn't accurate, I think you're confused about foreach.
>>
>> Look at
>>
>>
>> http://spark.apache.org/docs/latest/streaming-programming-guide.html#design-patterns-for-using-foreachrdd
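The pattern that section of the guide describes, opening the connection on the worker once per partition instead of once per record, can be sketched as follows. The `Connection` trait, `RecordingConnection`, and `sendPartition` are placeholder names for illustration, not a real client API; the per-partition logic is factored out so it can be exercised without a cluster.

```scala
// foreachRDD design pattern: create the connection on the worker,
// once per partition, not once per record and not on the driver.
trait Connection {
  def send(record: String): Unit
  def close(): Unit
}

// Placeholder implementation; a real job would open a Cassandra/JDBC
// client here and send records over the wire.
class RecordingConnection extends Connection {
  val sent = scala.collection.mutable.Buffer[String]()
  var closed = false
  def send(record: String): Unit = sent += record
  def close(): Unit = closed = true
}

// Pure per-partition logic: always closes the connection, even on failure.
def sendPartition(records: Iterator[String], conn: Connection): Unit =
  try records.foreach(conn.send) finally conn.close()

// In the streaming job this runs on the workers:
// dstream.foreachRDD { rdd =>
//   rdd.foreachPartition { records =>
//     sendPartition(records, new RecordingConnection)  // real client here
//   }
// }
```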
>>
>>
>> On Mon, Sep 21, 2015 at 7:36 AM, Romi Kuntsman <r...@totango.com> wrote:
>>
>>> foreach is something that runs on the driver, not the workers.
>>>
>>> if you want to perform some function on each record from cassandra, you
>>> need to do cassandraRdd.map(func), which will run distributed on the spark
>>> workers
>>>
>>> *Romi Kuntsman*, *Big Data Engineer*
>>> http://www.totango.com
>>>
>>> On Mon, Sep 21, 2015 at 3:29 PM, Priya Ch <learnings.chitt...@gmail.com>
>>> wrote:
>>>
>>>> Yes, but I need to read from the cassandra db within a spark
>>>> transformation.. something like:
>>>>
>>>> dstream.foreachRDD { rdd =>
>>>>   rdd.foreach { message =>
>>>>     sc.cassandraTable()
>>>>       .
>>>>       .
>>>>       .
>>>>   }
>>>> }
>>>>
>>>> Since rdd.foreach gets executed on workers, how can I make sparkContext
>>>> available on workers?
>>>>
>>>> Regards,
>>>> Padma Ch
>>>>
>>>> On Mon, Sep 21, 2015 at 5:10 PM, Ted Yu <yuzhih...@gmail.com> wrote:
>>>>
>>>>> You can use broadcast variable for passing connection information.
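A minimal sketch of that suggestion, assuming the connection information is wrapped in a serializable case class. The names `DbConfig`, `confBc`, and `connect` are hypothetical, and the Spark calls are shown as comments since they need a live SparkContext.

```scala
// Broadcast the connection *information*, never the connection itself.
// Case classes are serializable, so this can travel to the workers.
case class DbConfig(contactPoints: String, port: Int)

// On the driver:
// val confBc = sc.broadcast(DbConfig("cass1,cass2", 9042))
//
// On the workers, inside foreachPartition:
// rdd.foreachPartition { records =>
//   val cfg = confBc.value          // cheap lookup, no re-serialization
//   val session = connect(cfg)      // open the connection locally
//   try records.foreach(...) finally session.close()
// }
```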
>>>>>
>>>>> Cheers
>>>>>
>>>>> On Sep 21, 2015, at 4:27 AM, Priya Ch <learnings.chitt...@gmail.com>
>>>>> wrote:
>>>>>
>>>>> Can I use this sparkContext on executors?
>>>>> In my application, I have a scenario of reading certain records from the
>>>>> db into an rdd. Hence I need sparkContext to read from the DB (cassandra
>>>>> in our case).
>>>>>
>>>>> If sparkContext can't be sent to executors, what is the workaround for
>>>>> this?
>>>>>
>>>>> On Mon, Sep 21, 2015 at 3:06 PM, Petr Novak <oss.mli...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> add @transient?
>>>>>>
>>>>>> On Mon, Sep 21, 2015 at 11:27 AM, Priya Ch <
>>>>>> learnings.chitt...@gmail.com> wrote:
>>>>>>
>>>>>>> Hello All,
>>>>>>>
>>>>>>>     How can I pass sparkContext as a parameter to a method in an
>>>>>>> object? Passing sparkContext is giving me a TaskNotSerializable
>>>>>>> exception.
>>>>>>>
>>>>>>> How can I achieve this?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Padma Ch
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>