Flink filesystem connector with regex support

2024-08-14 Thread amogh joshi
Hi, I am building a pretty straightforward processing pipeline as described below, using *DataStream* *APIs* and *FileSystem connector*. *filesystem-source -> transforms -> database-sink* Everything worked well till the filesystem (source) had just a single type (JSON) of files. Recently the fil

Re: Issue with protobuf kyro serializer for UnmodifiableLazyStringList

2024-08-14 Thread Sebastian Zapata
was able to make it work with adding a serializer import de.javakaffee.kryoserializers.protobuf.ProtobufSerializer environment.config.registerTypeWithKryoSerializer(EventMetaData::class.java, ProtobufSerializer::class.java) and then modifying the proto buff schema so it takes the nonlazy implemen

Re: Flink jobs failed in Kerberos delegation token

2024-08-14 Thread Gabor Somogyi
Proxy user has normally kerberos credentials and with that it's possible to fetch new HDFS tokens. When you can keep the proxy user kerberos credentials up-to-date (which is not an easy task) then the YARN side can potentially work. G On Wed, Aug 14, 2024 at 5:42 PM dpp wrote: > Thanks for the

Re: Flink jobs failed in Kerberos delegation token

2024-08-14 Thread Gabor Somogyi
Hi, Now I see the situation. In short from Fink point of view there is no possibility to refresh AM Container tokens since it's YARN responsibility so this is a known issue. I've reported and discussed this issue with the YARN guys back in the days when we've added token framework to Spark but sin

Issue with protobuf kyro serializer for UnmodifiableLazyStringList

2024-08-14 Thread Sebastian Zapata
Hi everyone wanted to check if somebody else has found an issue like this?, and could maybe have some pointer or has encountered this before my kyro serializer is failing to serialize com.google.protobuf.UnmodifiableLazyStringList I am tyring to read a protobuf message with the data streaming API

Re: flink operator - restart a pipeline from a manually trigger savepoint

2024-08-14 Thread Rion Williams
Hi Sigalit, To restart from an explicit savepoint, you need to not only set your initialSavepointPath but also the savepointRedeployNonce which should trigger the job to restart using that savepoint. Try setting/incrementing that property and you should see the job run from your manual savepo

flink operator - restart a pipeline from a manually trigger savepoint

2024-08-14 Thread Sigalit Eliazov
hi, We are trying to restart a pipeline from a save point we triggered manually via the job manager rest api. with the following configuration in the flinkdeployment crd: savepointTriggerNonce: 1 initialSavepointPath: upgradeMode: savepoint this always fails with the following error org.apach