Flink TaskManager binding data port to single interface while rpc port binds to all?

2018-07-27 Thread David Corley
We're seeing an issue with our Flink TMs (1.4.2) where we explicitly set the TM data and RPC ports. When the TM comes up, we see the following bindings: tcp6 0 0 :::9249 :::* LISTEN 2284/java tcp6 0 0 1
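A minimal flink-conf.yaml sketch of the relevant settings, assuming Flink 1.4-era option names (the hostname option and whether it governs the data-port binding is an assumption; verify against the docs for your version):

```yaml
# flink-conf.yaml (sketch; option names from Flink 1.4-era configuration, verify for your version)
taskmanager.rpc.port: 6122        # RPC/Akka port; this is the one observed binding to all interfaces
taskmanager.data.port: 6121       # data-exchange port
taskmanager.hostname: 10.0.0.5    # assumption: interface the TM binds/advertises for data exchange
```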

Continuous Monitoring of back-pressure

2019-01-31 Thread David Corley
I'm currently writing some code to convert the back-pressure REST API data into Prometheus-compatible output. I was just curious why back-pressure wasn't already exposed as a metric in the in-built Prometheus exporter? Is it because the thread-sampling is too intensive? Or too slow (particularly if
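A minimal sketch of the kind of conversion described above, turning a back-pressure REST payload into Prometheus text exposition format. The payload shape (`subtasks`, `subtask`, `ratio` fields) is assumed from memory of the Flink 1.x `/jobs/<id>/vertices/<id>/backpressure` endpoint and may differ in your version; the metric name is invented for illustration:

```python
def backpressure_to_prometheus(job_id, vertex_id, payload):
    """Render a back-pressure REST payload as Prometheus text-format lines.

    `payload` is assumed to look like the Flink 1.x backpressure endpoint
    response: {"subtasks": [{"subtask": 0, "ratio": 0.05}, ...]}.
    """
    lines = [
        "# HELP flink_backpressure_ratio Fraction of stack-trace samples "
        "blocked on output buffers",
        "# TYPE flink_backpressure_ratio gauge",
    ]
    for subtask in payload.get("subtasks", []):
        lines.append(
            'flink_backpressure_ratio{job_id="%s",vertex_id="%s",subtask="%d"} %s'
            % (job_id, vertex_id, subtask["subtask"], subtask["ratio"])
        )
    return "\n".join(lines)


# Example payload, hand-written to mimic the REST response shape.
sample = {
    "status": "ok",
    "backpressure-level": "low",
    "subtasks": [
        {"subtask": 0, "ratio": 0.05},
        {"subtask": 1, "ratio": 0.22},
    ],
}
print(backpressure_to_prometheus("jobA", "v1", sample))
```

In a real exporter this output would be served on an HTTP endpoint that Prometheus scrapes, refreshed by polling the REST API on an interval.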

Is group.id required in Kafka connector for offsets to be stored in checkpoint?

2019-02-16 Thread David Corley
We've got a relatively simple job that reads in from Kafka, and writes to S3. We've had a couple of job failures where the consumer lag had built up, but after the restart, the lag was wiped out because our offset positions were lost and we consumed from latest offset. The job has checkpointing en
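A sketch of the consumer properties in question, with the general behaviour hedged in comments: with checkpointing enabled, the Flink Kafka connector stores offsets in checkpoint state and restores from there; the consumer group's committed offsets are only the fallback start position when no checkpoint or savepoint is available (property names should be verified against your connector version):

```properties
# Kafka consumer properties passed to the Flink Kafka connector (sketch)
bootstrap.servers=kafka-1:9092
group.id=my-flink-job   # required by the connector; used as the start position
                        # only when there is no checkpoint/savepoint to restore
# With checkpointing enabled, restored offsets come from the checkpoint state,
# not from the consumer group -- which is why losing the checkpoint (or starting
# a fresh job without one) can fall back to latest and wipe out apparent lag.
```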

masters file only needed when using start-cluster.sh script?

2018-04-17 Thread David Corley
The HA documentation is a little confusing in that it suggests JM registration and discovery is done via Zookeeper, but it also recommends creating a static `masters` file listing all JMs. The only use I can currently see for the masters file is by the `start-cluster.sh` script. Thus, if we're not
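For context, a sketch of the ZooKeeper HA settings involved, using the property names this era of Flink documents (JM discovery goes through the quorum, which is why `conf/masters` only matters to the start script):

```yaml
# flink-conf.yaml HA settings (sketch; verify names against your Flink version)
high-availability: zookeeper
high-availability.zookeeper.quorum: zk-1:2181,zk-2:2181,zk-3:2181
high-availability.zookeeper.storageDir: hdfs:///flink/ha
```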

Re: masters file only needed when using start-cluster.sh script?

2018-04-20 Thread David Corley
Great! Thanks Gary

On 20 April 2018 at 08:22, Gary Yao wrote:
> Hi David,
>
> You are right. If you don't use start-cluster.sh, the conf/masters file is not needed.
>
> Best,
> Gary
>
> On Wed, Apr 18, 2018 at 8:25 AM, David Corley wrote:

Zookeeper DR backup needed for Flink HA mode?

2018-05-15 Thread David Corley
We're looking at DR scenarios for our Flink cluster. We already use Zookeeper for JM HA. We use a HDFS cluster that's replicated off-site, and our high-availability.zookeeper.storageDir property is configured to use HDFS. However, in the event of a site-failure, is it also essential that we have a

Will all records grouped using keyBy be allocated to a single subtask?

2023-08-03 Thread David Corley
I have a job using the keyBy function. The job parallelism is 40. My key is based on a field in the records that has 2000+ possible values. My question is: for the records for a given key, will they all be sent to the one subtask, or be distributed evenly amongst all 40 downstream operator subtasks?
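A simplified model of how keyed routing behaves: Flink hashes each key into a key group (bounded by maxParallelism) and maps that group to one operator subtask, so all records with the same key land on the same subtask, while distinct keys are spread across subtasks. The sketch below uses MD5 as a stand-in for Flink's actual murmur-based hashing, and the constants (128 as default maxParallelism, 40 as parallelism) are assumptions for illustration:

```python
import hashlib

MAX_PARALLELISM = 128   # assumed default maxParallelism
PARALLELISM = 40        # the job parallelism from the question


def key_group(key: str) -> int:
    # Stand-in for Flink's murmur hash of the key; MD5 used only so this
    # sketch is deterministic and self-contained.
    digest = int(hashlib.md5(key.encode()).hexdigest(), 16)
    return digest % MAX_PARALLELISM


def subtask_index(key: str) -> int:
    # Key group -> operator subtask, mirroring the proportional mapping
    # Flink's key-group range assignment performs.
    return key_group(key) * PARALLELISM // MAX_PARALLELISM


# 100 records over 5 distinct keys; track which subtasks each key reaches.
records = [("user-%d" % (i % 5), i) for i in range(100)]
routing = {}
for key, _ in records:
    routing.setdefault(key, set()).add(subtask_index(key))

# Every key lands on exactly one subtask; distinct keys may share a subtask.
assert all(len(subtasks) == 1 for subtasks in routing.values())
```

So with 2000+ key values and parallelism 40, each individual key goes to exactly one subtask, and the overall evenness of the spread depends on how the keys hash, not on a round-robin distribution.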