pegasas commented on code in PR #23135:
URL: https://github.com/apache/flink/pull/23135#discussion_r1285223853

##########
docs/content/docs/concepts/stateful-stream-processing.md:
##########
@@ -342,26 +342,3 @@ give *exactly once* guarantees even in *at least once* mode.
 {{< /hint >}}
 
 {{< top >}}
-
-## State and Fault Tolerance in Batch Programs
-
-Flink executes [batch programs]({{< ref "docs/dev/dataset/overview" >}}) as a special case of
-streaming programs, where the streams are bounded (finite number of elements).
-A *DataSet* is treated internally as a stream of data. The concepts above thus
-apply to batch programs in the same way as well as they apply to streaming
-programs, with minor exceptions:
-
-  - [Fault tolerance for batch programs]({{< ref "docs/ops/state/task_failure_recovery" >}})
-    does not use checkpointing. Recovery happens by fully replaying the
-    streams. That is possible, because inputs are bounded. This pushes the
-    cost more towards the recovery, but makes the regular processing cheaper,
-    because it avoids checkpoints.
-
-  - Stateful operations in the DataSet API use simplified in-memory/out-of-core
-    data structures, rather than key/value indexes.
-
-  - The DataSet API introduces special synchronized (superstep-based)
-    iterations, which are only possible on bounded streams. For details, check
-    out the [iteration docs]({{< ref "docs/dev/dataset/iterations" >}}).

Review Comment:
   Sure, this is a good point. I will check the other places to see whether the same pattern appears anywhere else that could be executed in BATCH mode.

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org