There are several database APIs that use a thread-local or singleton reference to a connection pool (we currently use ScalikeJDBC, but there are others).
You can use mapPartitions earlier in the chain to make sure the connection pool is set up on that executor, then use it inside updateStateByKey.

On Fri, Jun 12, 2015 at 10:07 AM, algermissen1971 <[email protected]> wrote:

> Hi,
>
> I have a scenario with spark streaming, where I need to write to a
> database from within updateStateByKey[1].
>
> That means that inside my update function I need a connection.
>
> I have so far understood that I should create a new (lazy) connection for
> every partition. But since I am not working in foreachRDD I wonder where I
> can iterate over the partitions.
>
> Should I use mapPartitions() somewhere up the chain?
>
> Jan
>
> [1] The use case being saving 'done' sessions during web tracking.
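
A rough sketch of the mapPartitions suggestion, in case it helps. This is not code from our job: the JDBC URL, credentials, the done_sessions table, the Session state type, and the "no new events means done" rule are all made-up placeholders, and it assumes Spark 1.3+ with ScalikeJDBC on the executor classpath. Since updateStateByKey shuffles by key, the sketch also calls the (idempotent) setup inside the update function as a defensive measure; that part is my assumption, not something the approach strictly requires.

```scala
import org.apache.spark.streaming.dstream.DStream
import scalikejdbc._

object SessionTracking {

  // Hypothetical state kept per session key.
  case class Session(hits: Long)

  // One pool per executor JVM; safe to call repeatedly.
  object PoolSetup {
    def ensure(): Unit = synchronized {
      if (!ConnectionPool.isInitialized()) {
        // Placeholder connection settings (assumptions).
        ConnectionPool.singleton("jdbc:postgresql://db-host/tracking", "spark", "secret")
      }
    }
  }

  // Pure helper: fold the new hit counts into the previous state.
  def addHits(values: Seq[Long], state: Option[Session]): Session =
    Session(state.map(_.hits).getOrElse(0L) + values.sum)

  def track(events: DStream[(String, Long)]): DStream[(String, Session)] = {
    // Touch the pool once per partition; records pass through unchanged.
    val withPool = events.mapPartitions[(String, Long)]({ iter =>
      PoolSetup.ensure()
      iter
    }, preservePartitioning = true)

    withPool.updateStateByKey[Session] { (values: Seq[Long], state: Option[Session]) =>
      val session = addHits(values, state)
      // Assumption: a batch with no new events means the session is done.
      if (values.isEmpty && state.isDefined) {
        PoolSetup.ensure() // defensive: no-op if the pool already exists here
        DB.autoCommit { implicit s =>
          sql"insert into done_sessions (hits) values (${session.hits})".update.apply()
        }
        None // returning None drops the state for this key
      } else {
        Some(session)
      }
    }
  }
}
```

The point is only that the pool lives in a singleton on each executor, so nothing connection-related has to be serialized into the update function's closure.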
