Re: Spark Streaming - Shared hashmaps

2014-03-26 Thread Bryan Bryan
Thanks for your support. This is my idea for the project; I'm a newbie, so please forgive any misunderstandings. Spark Streaming will collect requests, for example: create a table, append records to a table, erase a table (it's just an example). With Spark Streaming I can filter the messages by key

Re: Spark Streaming - Shared hashmaps

2014-03-26 Thread Tathagata Das
When you say "launch long-running tasks", do you mean long-running Spark jobs/tasks, or long-running tasks in another system? If the rate of requests from Kafka is not low (in terms of records per second), you could collect the records in the driver, and maintain the "shared bag" in the driver. A
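The driver-side approach suggested in this reply could look roughly like the sketch below. It is only an illustration under assumptions: the class `SharedBag`, the function `process_batch`, and the `(op, table, payload)` request tuples are hypothetical names, and the request types are borrowed from the create/append/erase example earlier in the thread. The sketch is plain Python with no Spark dependency; in a real streaming job, `process_batch` would presumably be called on the driver with the result of `rdd.collect()` inside a `foreachRDD` callback.

```python
class SharedBag:
    """Hypothetical driver-side registry; it lives only in the driver process."""

    def __init__(self):
        # Maps table name -> list of records appended so far.
        self.tables = {}

    def apply(self, request):
        # Each request is assumed to be an (op, table, payload) tuple.
        op, table, payload = request
        if op == "create":
            self.tables.setdefault(table, [])
        elif op == "append":
            # Assumption: appending to an unknown table creates it.
            self.tables.setdefault(table, []).extend(payload)
        elif op == "erase":
            # Erasing a table that does not exist is treated as a no-op.
            self.tables.pop(table, None)
        else:
            raise ValueError("unknown operation: %r" % (op,))


def process_batch(bag, records):
    """Apply one micro-batch of collected requests, in order, on the driver."""
    for request in records:
        bag.apply(request)


if __name__ == "__main__":
    bag = SharedBag()
    # Stand-in for the records collected from one batch.
    process_batch(bag, [("create", "t1", None), ("append", "t1", [1, 2])])
    print(bag.tables)
```

Because every mutation happens in the single driver process, no cross-executor synchronization is needed; the trade-off is that all requests must fit through the driver, which is why the request rate matters.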

Spark Streaming - Shared hashmaps

2014-03-26 Thread Bryan Bryan
Hi there, I have read about the two fundamental shared features in Spark (broadcast variables and accumulators), but this is what I need. I'm using Spark Streaming to get requests from Kafka. These requests may launch long-running tasks, and I need to control them: 1) Keep them in a