Hello,

When using Flink on YARN (with HDFS) and running a long-lived Flink session-mode cluster to which a Flink client submits jobs, HDFS may be configured with a replication factor greater than 1 (for example, 3).
Given that, I would like to know when and how any of the data (such as event data or batch data) or code (such as job JARs) belonging to a Flink job is saved to HDFS and replicated across the nodes of the YARN cluster. For example, in a streaming application, does all of the event data stay only in memory (RAM) until it reaches the DAG's sink, and only then get written to HDFS?
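To make the streaming case concrete, here is a minimal sketch of the kind of job I have in mind (the host name, port, and hdfs:// output path are just placeholders, not my real setup). In this sketch the file sink is the only place where the job explicitly writes event data to HDFS, which is the behavior I am asking about:

import org.apache.flink.api.common.serialization.SimpleStringEncoder;
import org.apache.flink.connector.file.sink.FileSink;
import org.apache.flink.core.fs.Path;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class HdfsSinkSketch {

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();

        // Event data enters here and, as I understand it, flows through the
        // DAG's operators in TaskManager memory while being processed.
        DataStream<String> events = env.socketTextStream("source-host", 9999);

        // The only explicit write to HDFS in this job: a file sink pointing
        // at an hdfs:// path, where the cluster's replication factor
        // (e.g. 3) would apply to the files it produces.
        FileSink<String> sink = FileSink
                .forRowFormat(new Path("hdfs:///data/output"),
                              new SimpleStringEncoder<String>("UTF-8"))
                .build();

        events.sinkTo(sink);

        env.execute("hdfs-sink-sketch");
    }
}

Thank you,
Piper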