I’ve got a Flink job that uses a HybridSource. It reads a bunch of S3 files
and then transitions to a Kafka topic, where it stays processing live data.
Everything gets written to a warehouse where users build and run reports. It
takes about 36 hours to read data from the beginning before it’s caught up.
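For context, the S3-then-Kafka handoff with a HybridSource looks roughly like this. This is a minimal sketch assuming Flink 1.15+; the bucket, broker, and topic names are placeholders, not the actual job:

    import org.apache.flink.api.common.eventtime.WatermarkStrategy;
    import org.apache.flink.api.common.serialization.SimpleStringSchema;
    import org.apache.flink.connector.base.source.hybrid.HybridSource;
    import org.apache.flink.connector.file.src.FileSource;
    import org.apache.flink.connector.file.src.reader.TextLineInputFormat;
    import org.apache.flink.connector.kafka.source.KafkaSource;
    import org.apache.flink.connector.kafka.source.enumerator.initializer.OffsetsInitializer;
    import org.apache.flink.core.fs.Path;

    // Bounded source over the historical files in S3.
    FileSource<String> backfill = FileSource
        .forRecordStreamFormat(new TextLineInputFormat(),
            new Path("s3://my-bucket/history/"))        // placeholder bucket
        .build();

    // Unbounded source over the live Kafka topic.
    KafkaSource<String> live = KafkaSource.<String>builder()
        .setBootstrapServers("broker:9092")             // placeholder broker
        .setTopics("events")                            // placeholder topic
        .setStartingOffsets(OffsetsInitializer.earliest())
        .setValueOnlyDeserializer(new SimpleStringSchema())
        .build();

    // Reads all S3 files to completion, then switches over to Kafka.
    HybridSource<String> source =
        HybridSource.builder(backfill).addSource(live).build();

    env.fromSource(source, WatermarkStrategy.noWatermarks(), "s3-then-kafka");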
I have a case where my Flink job needs to consume multiple sources. I have a
topic in Kafka where the order of consumption is important. Because the cost of
S3 storage is much lower than the cost of storage on Kafka, we have a job that
sinks to S3. The topic in Kafka can then retain just 3 days’ worth of data. My j
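The job that sinks to S3 would typically be a FileSink rolling part files into the bucket. A sketch, assuming a recent Flink; the bucket path, encoder, and roll interval are assumptions:

    import java.time.Duration;
    import org.apache.flink.api.common.serialization.SimpleStringEncoder;
    import org.apache.flink.connector.file.sink.FileSink;
    import org.apache.flink.core.fs.Path;
    import org.apache.flink.streaming.api.functions.sink.filesystem.rollingpolicies.DefaultRollingPolicy;

    // Archive the topic to S3; pending part files are finalized on checkpoints,
    // so checkpointing must be enabled for the files to become visible.
    FileSink<String> archive = FileSink
        .forRowFormat(new Path("s3://my-bucket/archive/"),    // placeholder bucket
            new SimpleStringEncoder<String>("UTF-8"))
        .withRollingPolicy(DefaultRollingPolicy.builder()
            .withRolloverInterval(Duration.ofMinutes(15))     // assumed interval
            .build())
        .build();

    // kafkaStream: the DataStream<String> read from the topic.
    kafkaStream.sinkTo(archive);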
Thank you, I figured it out. My IAM policy was missing some actions. Seems I
needed to give it “*” for it to work.
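For anyone who finds this thread later: rather than granting “*”, the S3 filesystem can usually be scoped to a policy along these lines (the bucket name is a placeholder, and the exact action list may vary by setup):

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
          "Resource": "arn:aws:s3:::my-flink-ha-bucket/*"
        },
        {
          "Effect": "Allow",
          "Action": ["s3:ListBucket", "s3:GetBucketLocation"],
          "Resource": "arn:aws:s3:::my-flink-ha-bucket"
        }
      ]
    }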
From: Tamir Sagi
Date: Wednesday, June 9, 2021 at 6:02 AM
To: Yang Wang, Kurtis Walker
Cc: user
Subject: Re: Using s3 bucket for high availability
I'
specifying the region for the bucket?
I’ve been searching the doc and haven’t found anything like that.
From: Kurtis Walker
Date: Tuesday, June 8, 2021 at 10:55 AM
To: user
Subject: Using s3 bucket for high availability
Hello,
I’m trying to set up my Flink native Kubernetes cluster with high
availability. Here’s the relevant config:
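A representative native-Kubernetes HA configuration with an S3 storage directory, shown here with placeholder values rather than the original snippet, looks like:

    kubernetes.cluster-id: my-flink-cluster              # placeholder
    high-availability: org.apache.flink.kubernetes.highavailability.KubernetesHaServicesFactory
    high-availability.storageDir: s3://my-ha-bucket/recovery   # placeholder bucket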
From: Roman Khachatryan
Date: Thursday, April 29, 2021 at 2:49 PM
To: Kurtis Walker
Cc: user@flink.apache.org
Subject: Re: Backpressure configuration
Hello Kurt,
Assuming that your sink is blocking, I would first make sure that it
is not chained with the preceding operators.
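In code, breaking the chain before a blocking sink looks like this; the sink class stands in for the custom bulk-load sink and is hypothetical:

    // Run the sink in its own task so backpressure from the blocking bulk
    // load shows up on the upstream operators’ metrics instead of being
    // hidden inside a single chained task.
    rows.addSink(new GreenplumBulkSink())   // hypothetical custom sink
        .disableChaining();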
Hello,
I’m building a POC for a Flink app that loads data from Kafka into a
Greenplum data warehouse. I’ve built a custom sink operator that will bulk
load a number of rows. I’m using a global window, triggering on record
count, to collect the rows for each sink. I’m finding that whi
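The windowing described above would look roughly like this. A sketch only: the keying, batch size, and the process/sink classes are assumptions, not the actual POC code:

    import org.apache.flink.streaming.api.windowing.assigners.GlobalWindows;
    import org.apache.flink.streaming.api.windowing.triggers.CountTrigger;
    import org.apache.flink.streaming.api.windowing.triggers.PurgingTrigger;

    // Batch rows with a global window and a purging count trigger (global
    // windows never fire on their own), then hand each completed batch to
    // the bulk-load sink.
    rows
        .keyBy(r -> r.getTargetTable())                       // assumed key
        .window(GlobalWindows.create())
        .trigger(PurgingTrigger.of(CountTrigger.of(10_000)))  // assumed batch size
        .process(new BatchRowsFunction())                     // hypothetical
        .addSink(new GreenplumBulkSink());                    // hypothetical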