Understanding Offsets State Storage in KafkaSource and DynamicKafkaSource

2025-03-23 Thread Chirag Dewan via user
Hi, Wanted to understand how KafkaSource stores the offsets of each partition in state.  Does it keep the offsets of each topic-partition in its checkpointed state or just commit back to Kafka on every checkpoint? And then how does DynamicKafkaSource work in this? Does it store the offsets for a

DynamicKafkaSource Offsets during Migration to Another Cluster

2025-01-20 Thread Chirag Dewan via user
Hi, I was experimenting with DynamicKafkaSource with 2 clusters. My use case is of a failover - when the active site fails, I want the Kafka Source to start reading data from the standby site.  I observed that DynamicKafkaSource resets the offsets on Cluster-2 back to -3 though it was already at

Re: Critical CVE-2024-47561 on Apache Avro

2024-11-03 Thread Chirag Dewan via user
are recommended to upgrade to version 1.11.4  or 1.12.0, which fix this issue." Cheers, Jim On Wed, Oct 30, 2024 at 1:26 AM Chirag Dewan via user wrote: Any view on this?  On Monday 28 October, 2024 at 04:16:17 pm IST, Chirag Dewan via user wrote: Hi, There is a critical CVE

Re: Critical CVE-2024-47561 on Apache Avro

2024-10-29 Thread Chirag Dewan via user
Any view on this?  On Monday 28 October, 2024 at 04:16:17 pm IST, Chirag Dewan via user wrote: Hi, There is a critical CVE on Apache Avro - NVD - CVE-2024-47561 Is there a released Flink version which has upgraded Avro to 1.11.4 or 1.12? If not, is it safe to upgrade just AVRO

Critical CVE-2024-47561 on Apache Avro

2024-10-28 Thread Chirag Dewan via user
Hi, There is a critical CVE on Apache Avro - NVD - CVE-2024-47561 Is there a released Flink version which has upgraded Avro to 1.11.4 or 1.12? If not, is it safe to upgrade just AVRO, keeping flink-avro on 1.16.3 (my current Flink version). Appreciate any inputs.  Thanks,Chirag | | | | | |

Jars REST API Disabled after web.submit was Disabled

2024-08-12 Thread Chirag Dewan via user
Hi, I have a standalone Flink cluster which I have started using the jobmanager.sh and taskmanager.sh scripts. For security reasons, I wanted to disable the submit and cancel jar features from the web ui but keep them enabled from the REST API so that my application can submit jars.  But when I

Re: Redis as a State Backend

2024-01-31 Thread Chirag Dewan via user
th Redis I guess. Best,Zakelly On Tue, Jan 30, 2024 at 2:15 PM Chirag Dewan via user wrote: Hi, I was looking at the FLIP-254: Redis Streams Connector and I was wondering if Flink ever considered Redis as a state backend? And if yes, why was it discarded compared to RocksDB?  If someone ca

Redis as a State Backend

2024-01-29 Thread Chirag Dewan via user
Hi, I was looking at the FLIP-254: Redis Streams Connector and I was wondering if Flink ever considered Redis as a state backend? And if yes, why was it discarded compared to RocksDB?  If someone can point me towards any deep dives on why RocksDB is a better fit as a state backend, it would be h

Re: Invalid Null Check in DefaultFileFilter

2023-10-26 Thread Chirag Dewan via user
of defensive programming for a public interface and the decision here is to be more lenient when facing potentially erroneous user input rather than blow up the whole application with a NullPointerException. Best,Alexander Fedulov On Thu, 26 Oct 2023 at 07:35, Chirag Dewan via user wrote: Hi

Re: Enhancing File Processing and Kafka Integration with Flink Jobs

2023-10-26 Thread Chirag Dewan via user
Hi Arjun, Flink's FileSource doesnt move or delete the files as of now. It will keep the files as is and remember the name of the file read in checkpointed state to ensure it doesnt read the same file twice. Flink's source API works in a way that single Enumerator operates on the JobManager. Th

Invalid Null Check in DefaultFileFilter

2023-10-25 Thread Chirag Dewan via user
Hi, I was looking at this check in DefaultFileFilter: public boolean test(Path path) { final String fileName = path.getName(); if (fileName == null || fileName.length() == 0) { return true; }Was wondering how can a file name be null? And if null, shouldnt this be return false? I

Re: Securing Keytab File in Flink

2023-09-26 Thread Chirag Dewan via user
* Rotate the keytab time to time* The keytab can be encrypted at rest but that's fully custom logic outside of Flink G On Fri, Sep 15, 2023 at 7:05 AM Chirag Dewan via user wrote: Hi, I am trying to implement a HDFS Source connector that can collect files from Kerberos enabled HDFS. As pe

Securing Keytab File in Flink

2023-09-14 Thread Chirag Dewan via user
Hi, I am trying to implement a HDFS Source connector that can collect files from Kerberos enabled HDFS. As per the Kerberos support, I have provided my keytab file to Job Managers and all the Task Managers. Now, I understand that keytab file is a security concern and if left unsecured can be use

Re: Keytab Setup on Kubernetes

2023-09-06 Thread Chirag Dewan via user
he.org/flink/flink-docs-master/docs/deployment/security/security-delegation-token/ G On Tue, Sep 5, 2023 at 1:31 PM Chirag Dewan via user wrote: Hi, I am trying to use the FileSource to collect files from HDFS. The HDFS cluster is secured and has Kerberos enabled. My Flink cluster runs on Ku

Keytab Setup on Kubernetes

2023-09-05 Thread Chirag Dewan via user
Hi, I am trying to use the FileSource to collect files from HDFS. The HDFS cluster is secured and has Kerberos enabled. My Flink cluster runs on Kubernetes (not using the Flink operator) with 2 Job Managers in HA and 3 Task Managers. I wanted to understand the correct way to configure the keytab

Re: Splitting in Stream Formats for File Source

2023-08-20 Thread Chirag Dewan via user
f97/flink-connectors/flink-connector-files/src/main/java/org/apache/flink/connector/file/src/reader/StreamFormat.java#L57 Best,Ron Chirag Dewan via user 于2023年8月17日周四 12:00写道: Hi,I am trying to collect files from HDFS in my DataStream job. I need to collect two types of files - CSV and Parquet.  I unders

Splitting in Stream Formats for File Source

2023-08-16 Thread Chirag Dewan via user
Hi,I am trying to collect files from HDFS in my DataStream job. I need to collect two types of files - CSV and Parquet.  I understand that Flink supports both formats, but in Streaming mode, Flink doesnt support splitting these formats. Splitting is only supported in Table API. I wanted to under

Flink Job across Data Centers

2023-04-12 Thread Chirag Dewan via user
Hi, Can anyone share any experience on running Flink jobs across data centers? I am trying to create a Multi site/Geo Replicated Kafka cluster. I want that my Flink job to be closely colocated with my Kafka multi site cluster. If the Flink job is bound to a single data center, I believe we will o

Questions on S3 File Sink Behavior

2023-03-29 Thread Chirag Dewan via user
Hi,   We are tying to use Flink's File sink to distribute files to AWS S3 storage. We are using Flink provided Hadoop s3a connector as plugin. We have some observations that we needed to clarify: 1. When using file sink for local filesystem distribution, we can see that the sink creates 3 se

Re: CSV File Sink in Streaming Use Case

2023-03-10 Thread Chirag Dewan via user
`CsvBulkWriter` and create `FileSink` by `FileSink.forBulkFormat`. You can see the detail in `DataStreamCsvITCase.testCustomBulkWriter` Best,Shammon On Tue, Mar 7, 2023 at 7:41 PM Chirag Dewan via user wrote: Hi, I am working on a Java DataStream application and need to implement a File sink with

CSV File Sink in Streaming Use Case

2023-03-07 Thread Chirag Dewan via user
Hi, I am working on a Java DataStream application and need to implement a File sink with CSV format. I see that I have two options here - Row and Bulk (https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/connectors/datastream/filesystem/#format-types-1) So for CSV file distribution wh

Avro 1.11 with Flink 1.14

2022-07-26 Thread Chirag Dewan via user
Hi, Is it possible to use Avro 1.11 with Flink 1.14? I know that Avro version is still at 1.10, but due to my job using Avro 1.11, I was planning to use it in Flink as well.  Also, I know that Avro 1.10 had some performance issues with Flink 1.12 ([FLINK-19440] Performance regression on 15.09.20