Hello Flink Users! I'm a Flink newbie at the early stages of deploying our first Flink cluster into production and I have a few questions about wiring up Flink with S3:

* We are going to use the HA configuration[1] from day one (we already have ZooKeeper infrastructure). Can S3 be used as a state backend for the Job Manager? The documentation talks about using S3 as a state backend for the TM[2] (in particular for streaming), but I'm wondering whether it's a suitable backend for the JM as well.

* How do I configure S3 for Flink when I don't have an existing Hadoop cluster? The documentation references the Hadoop configuration manifest[3], which implies to me that I must already be running Hadoop (or at least have a properly configured Hadoop cluster). Is there an example somewhere of using S3 as a storage backend for a standalone cluster?

* Bonus: I'm writing a Puppet module for installing, configuring, and managing Flink in standalone mode with an existing ZooKeeper cluster. Are there any existing modules for this (I didn't find anything in the Forge)? Would others in the community be interested if we added our module to the Forge once it is complete?

Thanks so much for your time and consideration. We look forward to using Flink in production!

Cheers,
Michael-Keith

[1]: https://ci.apache.org/projects/flink/flink-docs-master/setup/jobmanager_high_availability.html#standalone-cluster-high-availability
[2]: https://ci.apache.org/projects/flink/flink-docs-master/setup/aws.html#s3-simple-storage-service
[3]: https://ci.apache.org/projects/flink/flink-docs-master/setup/aws.html#set-s3-filesystem