[
https://issues.apache.org/jira/browse/SPARK-14230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sean Owen resolved SPARK-14230.
-------------------------------
Resolution: Won't Fix
> Config the start time (jitter) for streaming jobs
> -------------------------------------------------
>
> Key: SPARK-14230
> URL: https://issues.apache.org/jira/browse/SPARK-14230
> Project: Spark
> Issue Type: Improvement
> Components: Streaming
> Reporter: Liyin Tang
>
> Currently, RecurringTimer normalizes the start time to a multiple of the
> period. For instance, if the batch duration is 1 min, every job starts
> exactly on a 1 min boundary. This adds load to the streaming source. Assuming
> the source is Kafka and there are several streaming jobs with a 1 min batch
> duration, high network traffic is observed in Kafka during the first few
> seconds of each minute. This makes Kafka capacity planning tricky.
> It would be great to have an option in the streaming context to set the job
> start time. That way, users can add a jitter to the start time of each job
> and spread Kafka fetch_request traffic more smoothly across the duration
> window.
> {code}
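> class RecurringTimer {
>   // Proposed change: take the next period boundary after the current time
>   // and shift it by a configurable jitter (a new, not yet existing setting).
>   def getStartTime(): Long = {
>     (math.floor(clock.currentTime.toDouble / period) + 1).toLong * period +
>       jitter
>   }
> }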
> {code}
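For illustration only, the proposed calculation can be sketched as a standalone Scala object. This is a rough sketch under stated assumptions, not Spark code: the names JitteredStart, startTime and jitterFor are hypothetical, and deriving the jitter from a hash of the job name is just one possible way to give each job a stable offset.

```scala
// Hypothetical helper, not part of Spark: it mirrors the arithmetic proposed
// for RecurringTimer.getStartTime with an added per-job jitter.
object JitteredStart {

  /** Next period boundary strictly after nowMs, shifted by jitterMs. */
  def startTime(nowMs: Long, periodMs: Long, jitterMs: Long): Long = {
    require(periodMs > 0, "periodMs must be positive")
    require(jitterMs >= 0 && jitterMs < periodMs,
      "jitterMs must be in [0, periodMs)")
    // floorDiv matches math.floor(now.toDouble / period) for in-range values
    // without going through Double.
    (Math.floorDiv(nowMs, periodMs) + 1) * periodMs + jitterMs
  }

  /** Assumption: a deterministic jitter derived from the job name, so a
    * restarted job keeps the same offset within the batch window. */
  def jitterFor(jobName: String, periodMs: Long): Long =
    Math.floorMod(jobName.hashCode.toLong, periodMs)
}
```

With a 1 min period (60000 ms), a current time of 61000 ms and a jitter of 5000 ms, startTime returns 125000, that is, 5 seconds after the 2 min boundary. With a jitter of 0 it reduces to the existing boundary-aligned behavior.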