Unequal distribtion of Kafka partitions with respect to topic when reading from multiple topics using KafkaSource API with Flink 1.14.3

2023-01-18 Thread Kathula, Sandeep via user
Hi, I am using KafkaSource API read from 6 topics within Kafka. Flink version - 1.14.3. Each and every kafka topic my Flink pipeline reads from is having a different load but same number of partitions (lets say 30). For example partition 0 of topic 1 and partition 0 of topic 2 have different

Re: Beam with Flink runner - Issues when writing to S3 in Parquet Format

2021-09-15 Thread Kathula, Sandeep
nd/or heap, which could result in suboptimal performance. Due to the "connection loss" and timeout exceptions you describe I'd suppose there might be a lot of GC pressure. Jan On 9/14/21 5:20 PM, Kathula, Sandeep wrote: Hi, We have a simple Beam application which reads from Ka

Beam with Flink runner - Issues when writing to S3 in Parquet Format

2021-09-14 Thread Kathula, Sandeep
Hi, We have a simple Beam application which reads from Kafka, converts to parquet and write to S3 with Flink runner (Beam 2.29 and Flink 1.12). We have a fixed window of 5 minutes after conversion to PCollection and then writing to S3. We have around 320 columns in our data. Our intention is

Unable to read state Witten by Beam application with Flink runner using Flink's State Processor API

2021-07-27 Thread Kathula, Sandeep
Hi, We have a simple Beam application like a work count running with Flink runner (Beam 2.26 and Flink 1.9). We are using Beam’s value state. I am trying to read the state from savepoint using Flink's State Processor API but getting a NullPointerException. Converted the whole code into Pur

Beam flink runner job not keeping up with input rate after downscaling

2020-07-16 Thread Kathula, Sandeep
Hi, We started a Beam application with Flink runner with parallelism as 50. It is a stateful application which uses RocksDB as state store. We are using timers and Beam’s value state and bag state (which is same as List state of Flink). We are doing incremental checkpointing. With initial par

Flink not restoring from checkpoint when job manager fails even with HA

2020-06-06 Thread Kathula, Sandeep
Hi, We are running Flink in K8S. We used https://ci.apache.org/projects/flink/flink-docs-release-1.10/ops/jobmanager_high_availability.html to set high availability. We set max number of retries for a task to 2. After task fails twice and then the job manager fails. This is expected. But