Hi,
I'm using Flink's StreamingFileSink, running in one AWS account (A) and
writing to another (B). I'm also leveraging a SecurityConfiguration in the
CFN to assume a role in account B, so that when I write there the files are
owned by account B, which in turn allows account B to delegate to other AWS account
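
For reference, a minimal sketch of the kind of EMR SecurityConfiguration described above, using an EMRFS role mapping so that writes under an account-B prefix are made with a role owned by account B. The resource name, role ARN, and bucket prefix are placeholders, not the poster's actual values:

# Hypothetical CloudFormation snippet (EMRFS role mapping); all names are placeholders.
Resources:
  FlinkEmrSecurityConfiguration:
    Type: AWS::EMR::SecurityConfiguration
    Properties:
      Name: flink-cross-account-emrfs
      SecurityConfiguration:
        AuthorizationConfiguration:
          EMRFSConfiguration:
            RoleMappings:
              # S3 access under this prefix is made with the assumed role in account B,
              # so objects written by the Flink job end up owned by account B.
              - Role: arn:aws:iam::<ACCOUNT_B_ID>:role/account-b-s3-writer
                IdentifierType: Prefix
                Identifiers:
                  - s3://account-b-bucket/flink-output/
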
mark (I sent it
> downstream only for metric purposes) but want to tell Impala after each
> commit which partitions changed, regardless of the value from the watermark.
>
> Best regards
> Theo
>
> --
> *From: *"Yun Gao"
> *To: *"Rober
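
On the Impala side, one hedged sketch of what Theo describes is to issue a per-partition REFRESH over JDBC after each commit, assuming the job tracks which partitions a commit touched. The JDBC URL, table name, and dt partition column below are made-up placeholders, not anything from this thread:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

// Hypothetical helper: tell Impala that one partition of a table has new files.
public class ImpalaPartitionRefresher {

    // Placeholder endpoint; the exact URL format depends on the Impala JDBC driver in use.
    private static final String JDBC_URL = "jdbc:impala://impala-host:21050/default";

    public static void refreshPartition(String table, String dtValue) throws Exception {
        try (Connection conn = DriverManager.getConnection(JDBC_URL);
             Statement stmt = conn.createStatement()) {
            // REFRESH <table> PARTITION (...) reloads file metadata for just that partition,
            // which is cheaper than a full REFRESH or INVALIDATE METADATA.
            // (String concatenation is fine for a sketch; use validated values in real code.)
            stmt.execute("REFRESH " + table + " PARTITION (dt='" + dtValue + "')");
        }
    }
}
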
I am replacing an M/R job with a Streaming job using the StreamingFileSink
and there is a requirement to generate an empty _SUCCESS file like the old
Hadoop job. I have to implement a similar Batch job to read from backup
files in case of outages or downtime.
The Batch job question was answered he
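
For the _SUCCESS requirement: StreamingFileSink does not write such a marker on its own, so a minimal sketch, assuming something in the job (a watermark-driven operator, a checkpoint-completion hook, or an external scheduler) can decide when a bucket is complete, is to drop an empty marker file into the finished bucket directory via Flink's FileSystem API:

import org.apache.flink.core.fs.FSDataOutputStream;
import org.apache.flink.core.fs.FileSystem;
import org.apache.flink.core.fs.Path;

// Hypothetical helper that mimics the old Hadoop job by creating an empty _SUCCESS
// marker in a bucket directory once that bucket is considered complete.
public class SuccessMarkerWriter {

    public static void writeSuccessMarker(String bucketPath) throws Exception {
        Path marker = new Path(bucketPath, "_SUCCESS");
        FileSystem fs = marker.getFileSystem();
        // OVERWRITE so a retry for the same bucket does not fail on an existing marker.
        try (FSDataOutputStream out = fs.create(marker, FileSystem.WriteMode.OVERWRITE)) {
            // Intentionally empty: as with M/R, the file's existence is the signal.
        }
    }
}

The job-specific part is deciding when to call it, e.g. once the event-time watermark has passed the end of a bucket's time range.
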
How are folks here managing deployments in production?
We are deploying Flink jobs on EMR manually at the moment but would like to
move towards some form of automation before anything goes into production.
Adding additional EMR Steps to a long-running cluster to deploy or update
jobs seems like th
Hi Padarn, for what it's worth I am using DataDog metrics on EMR with Flink
1.7.1, and here is my flink-conf configuration:
- Classification: flink-conf
  ConfigurationProperties:
    metrics.reporter.dghttp.class: org.apache.flink.metrics.datadog.DatadogHttpReporter
    metrics.reporter.dghttp.ap
Hi folks,
I'm hoping to get some deeper clarification on which framework, Flink or
KStreams, to use in a given scenario. I've read over the following blog
article, which I think sets a great baseline understanding of the
differences between those frameworks, but I would like to get some outside
opin
u would have to check whether the bucket assignment and file naming
> is completely deterministic), or, before reprocessing from backup, remove the
> dirty files from the crashed job.
>
> Piotrek
>
> On 2 May 2019, at 23:10, Peter Groesbeck wrote:
>
> Hi all,
>
> I have
Hi all,
I have an application that reads from various Kafka topics and writes
Parquet files to corresponding buckets on S3 using StreamingFileSink with
DateTimeBucketAssigner. The upstream application that writes to Kafka also
writes records as gzipped JSON files to date-bucketed locations on S3 a
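
Relating this back to Piotrek's point about deterministic bucket assignment: DateTimeBucketAssigner derives the bucket from the processing time at which a record happens to be written, so a replay from the gzipped JSON backups can land the same records in different buckets. A minimal sketch of an event-time-based assigner, assuming the records carry a usable timestamp (e.g. the Kafka record timestamp) so that context.timestamp() is set:

import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

import org.apache.flink.core.io.SimpleVersionedSerializer;
import org.apache.flink.streaming.api.functions.sink.filesystem.BucketAssigner;
import org.apache.flink.streaming.api.functions.sink.filesystem.bucketassigners.SimpleVersionedStringSerializer;

// Hypothetical bucket assigner that buckets by the record's event timestamp instead of
// processing time, so reprocessing the same input yields the same bucket layout.
public class EventTimeBucketAssigner<IN> implements BucketAssigner<IN, String> {

    private static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd--HH").withZone(ZoneOffset.UTC);

    @Override
    public String getBucketId(IN element, BucketAssigner.Context context) {
        // context.timestamp() is the element's event timestamp; fall back to processing
        // time only if the record has none.
        Long eventTime = context.timestamp();
        long ts = (eventTime != null) ? eventTime : context.currentProcessingTime();
        return FORMAT.format(Instant.ofEpochMilli(ts));
    }

    @Override
    public SimpleVersionedSerializer<String> getSerializer() {
        return SimpleVersionedStringSerializer.INSTANCE;
    }
}

It can be plugged in via the sink builder's withBucketAssigner(...). Note that even then the part-file names still contain subtask and counter indices, so Piotrek's caveat about checking whether file naming is completely deterministic still applies before overwriting from backup.
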