Thanks for your support. This is my idea of the project; I'm a newbie, so
please forgive my misunderstandings:
Spark Streaming will collect requests, for example: create a table, append
records to a table, erase a table (these are just examples).
With Spark Streaming I can filter the messages by key.
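Filtering by key could be expressed as a plain predicate that is then passed to the stream. This is a minimal sketch under assumptions: the request keys ("create", "append", "erase") and the (key, value) message shape are illustrative only, not part of any real schema.

```python
# Hypothetical set of request keys we want to dispatch on.
HANDLED_KEYS = {"create", "append", "erase"}

def is_handled(message):
    """Return True if this (key, value) pair is a request we care about."""
    key, _value = message
    return key in HANDLED_KEYS

# Inside a Spark Streaming job the same predicate would be used as, e.g.:
#   requests = stream.filter(is_handled)
# where `stream` is a DStream of (key, value) pairs read from Kafka.

if __name__ == "__main__":
    messages = [("create", "table_a"), ("ping", ""), ("erase", "table_b")]
    print([m for m in messages if is_handled(m)])
```

Keeping the predicate as a standalone function makes it easy to unit-test without a running Spark cluster.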
When you say "launch long-running tasks", do you mean long-running Spark
jobs/tasks, or long-running tasks in another system?
If the rate of requests from Kafka is not low (in terms of records per
second), you could collect the records in the driver and maintain the
"shared bag" in the driver.
Hi there,
I have read about the two fundamental shared variable types in Spark
(broadcast variables and accumulators), but this is not what I need.
I'm using Spark Streaming to get requests from Kafka. These requests may
launch long-running tasks, and I need to control them:
1) Keep them in a