Sorry, I don't have a diagram to share. Your understanding of how I am
using the Spark application is right.
 
It's a Kafka topic with 6 partitions, so Spark is able to create 6 parallel
consumers/executors.
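
In case it is useful context, here is a minimal sketch in Scala of how I am
reading the topic (the broker address, app name, and topic name below are
placeholders, not my real values):

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("kafka-parallel-read") // placeholder app name
  .getOrCreate()

// The Kafka source creates one consumer task per topic partition, so a
// topic with 6 partitions yields up to 6 parallel tasks on the executors.
val events = spark.readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "localhost:9092") // placeholder broker
  .option("subscribe", "events")                        // placeholder topic
  .load()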

The thought of using Airflow is interesting. I will explore this option more.

The other idea, using a ProcessingTime trigger (every 60 seconds) to run a
query that reloads the data from the S3 file and feeding its results into
the ContinuousTrigger query, is one I will also try.
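
Roughly what I plan to try, sketched in Scala with placeholder broker
address, topic names, paths, and message schema. One caveat I still need to
verify: continuous processing only supports map-like operations, so in this
sketch the join against the S3 data happens inside the 60-second micro-batch
query, and the ContinuousTrigger query just forwards the enriched records:

import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.streaming.Trigger

val spark = SparkSession.builder().appName("two-trigger-sketch").getOrCreate()

def kafkaStream(topic: String): DataFrame =
  spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "localhost:9092") // placeholder broker
    .option("subscribe", topic)
    .load()

// Runs once per 60-second micro-batch: reload the S3 file fresh, join it
// with the incoming batch, and publish the result to an intermediate topic.
val enrich: (DataFrame, Long) => Unit = (batch, _) => {
  val ref = spark.read.json("s3a://my-bucket/reference.json") // placeholder
  batch
    .selectExpr("CAST(value AS STRING) AS key") // placeholder message schema
    .join(ref, "key") // assumes the S3 data also carries a "key" column
    .selectExpr("to_json(struct(*)) AS value")
    .write
    .format("kafka")
    .option("kafka.bootstrap.servers", "localhost:9092")
    .option("topic", "events-enriched") // placeholder intermediate topic
    .save()
}

// Query 1: micro-batch query triggered every 60 seconds.
val enricher = kafkaStream("events") // placeholder input topic
  .writeStream
  .trigger(Trigger.ProcessingTime("60 seconds"))
  .option("checkpointLocation", "/tmp/enricher-ckpt") // placeholder path
  .foreachBatch(enrich)
  .start()

// Query 2: the low-latency continuous query consumes the enriched topic
// using only map-like transformations, which continuous mode supports.
val continuous = kafkaStream("events-enriched")
  .selectExpr("CAST(value AS STRING) AS value")
  .writeStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "localhost:9092")
  .option("topic", "events-out") // placeholder output topic
  .option("checkpointLocation", "/tmp/continuous-ckpt") // placeholder path
  .trigger(Trigger.Continuous("1 second"))
  .start()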

Thanks again!