Using the flink CLI option --pyRequirements

2021-10-18 Thread Francis Conroy
Hi, I'm trying to install some required modules by supplying a requirements file when submitting to the cluster and the CLI just seems to stall. I've built 1.15-SNAPSHOT@7578758fa8c84314b8b3206629b3afa9ff41b636 and have run the wordcount example, everything else seems to work, I just can't submit

Re: EKs FlinkK8sOperator for 1.20

2021-10-18 Thread Dhiru
Thanks Kim,   I got solution, we need to downgrade controller-gen@v0.2.4 to make this working  But thanks a lot  On Monday, October 18, 2021, 11:46:23 PM EDT, Youngwoo Kim (김영우) wrote: Hi Dhiru, Take a look at this flink operator,  https://github.com/spotify/flink-on-k8s-operatorThe o

Re: EKs FlinkK8sOperator for 1.20

2021-10-18 Thread 김영우
Hi Dhiru, Take a look at this flink operator, https://github.com/spotify/flink-on-k8s-operator The operator is forked and even enhanced by Soptify devs and contributors. Looks like it works on k8s 0.20+ Thanks, Youngwoo On Mon, Oct 18, 2021 at 12:05 AM Dhiru wrote: > hi , >I was plannin

Re: SplitFetcherManager custom error handler

2021-10-18 Thread Qingsheng Ren
Hi Mason, It’ll be great to have your contribution! Also could you provide more specific descriptions about your use case? It looks like you are implementing a custom Kafka connector so I’m not sure if handling the exception directly in the split reader is a possible solution. -- Best Regards,

Re: Apache Flink - Using upsert JDBC sink for DataStream

2021-10-18 Thread JING ZHANG
Hi Mans, > > Is there a DataStream api for using the upsert functionality ? > You could try use `JdbcSink#sink` method, pass a upsert query as first parameter value. However, there is no standard syntax for upsert, you need to check whether the external database supports upsert or not. If yes, what

Re: Problem with Flink job and Kafka.

2021-10-18 Thread Qingsheng Ren
Hi Marco, Sorry I forgot to cc the user mailing list just now. From the exception message it looks like a versioning issue. Could you provide some additional information, such as Flink & Kafka connector version, Kafka broker version, and full exception stack? Also it will be helpful to paste pa

Re: Avro UTF-8 deserialization on integration DataStreams API -> Table API (with Confluent SchemaRegistry)

2021-10-18 Thread Caizhi Weng
Hi! You can call streamSource.processRecord to change the CharSequence to a String, then change the stream to a table. Peter Schrott 于2021年10月18日周一 下午8:40写道: > Hi there, > > I have a Kafka topic where the schema of its values is defined by the > "MyRecord" record in the following Avro IDL and r

Flink 1.14.0 reactive mode cannot rescale

2021-10-18 Thread 陳昌倬
Hi, We found that Flink 1.14.0 cannot rescale when using the following configuration: * Kubernetes per-job session mode * Reactive mode * Unaligned checkpoint * Latest checkpoint type is checkpoint, not savepoint It is, however, can rescale from savepoint. The following is redacted log when er

Problem with Flink job and Kafka.

2021-10-18 Thread Marco Villalobos
I have the simplest Flink job that simply deques off of a kafka topic and writes to another kafka topic, but with headers, and manually copying the event time into the kafka sink. It works as intended, but sometimes I am getting this error: org.apache.kafka.common.errors.UnsupportedVersionExcepti

SplitFetcherManager custom error handler

2021-10-18 Thread Mason Chen
Hi all, I am implementing a Kafka connector with some custom error handling that is aligned with our internal infrastructure. `SplitFetcherManager` has a hardcoded error handler in the constructor and I was wondering if it could be exposed by the classes that extend it. Happy to contribute if peop

Re: [External] : Timeout settings for Flink jobs?

2021-10-18 Thread Fuyao Li
I don’t know any out of the box solution for the use case you mentioned. You can add an operator to orchestrate your Flink clusters, when certain conditions are met, trigger a stop with savepoint will achieve something like you mentioned. Maybe Arvid can share more information. From: Sharon Xie

Re: Apache Flink - Using upsert JDBC sink for DataStream

2021-10-18 Thread M Singh
Hi Jing: Thanks for your response and example. Is there a DataStream api for using the upsert functionality ? Also, is there any reason for why the TableJdbcUpsertOutputFormat constructors are not public ?  Thanks again for your help. Mans On Monday, October 18, 2021, 12:30:36 AM EDT, JING ZH

Re: [External] : Re: How to enable customize logging library based on SLF4J for Flink deployment in Kubernetes

2021-10-18 Thread Fuyao Li
Hi Chesnay, Thanks for the reply. 1. The internal logging framework is built upon slf4j/log4j2 (The same one Flink uses, but it comes with an additional POM dependency). I have added such dependency in the Flink application POM file. But it seems only to work locally in IDE. When it is in th

Re: [External] : Timeout settings for Flink jobs?

2021-10-18 Thread Sharon Xie
It's promising that I can #isEndOfStream at the source. Is there a way I can terminate a job from the sink side instead? We want to terminate a job based on a few conditions (either hit the timeout limit or the output count limit). On Mon, Oct 18, 2021 at 2:22 AM Arvid Heise wrote: > Unfortunate

Re: HBase sink connector - HBaseSinkFunction vs Table API

2021-10-18 Thread Martijn Visser
Hi Anton, I'm not sure why you would prefer writing your own sink connector over using a provided one. If you really want to use the DataStream API, you can also switch from the Table API to the DataStream API and back. Best regards, Martijn (I noticed I forgot to include the user mailing list,

Re: [DISCUSS] Creating an external connector repository

2021-10-18 Thread Thomas Weise
Thanks for initiating this discussion. There are definitely a few things that are not optimal with our current management of connectors. I would not necessarily characterize it as a "mess" though. As the points raised so far show, it isn't easy to find a solution that balances competing requiremen

Re: NoClassDefFoundError if Kafka classes are assemblied in application jar

2021-10-18 Thread L. C. Hsieh
Hi Arvid, Alexander Preuß has already replied to me and I also found a discussion on https://stackoverflow.com/questions/51479657/flinkkafkaconsumer011-not-found-on-flink-cluster. So by following https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/dev/datastream/project-configuration/

HBase sink connector - HBaseSinkFunction vs Table API

2021-10-18 Thread Anton
Hello. Please suggest best method to write data to HBase (stream going from Kafka being enriched with HBase data and need to be written to HBase). There is only one connector on flink.apache.org related to Table API. At the same time there is HBaseSinkFunction in the source code and I beleive it re

Re: [DISCUSS] Creating an external connector repository

2021-10-18 Thread Chesnay Schepler
Generally, the issues are reproducibility and control. Stuffs completely broken on the Flink side for a week? Well then so are the connector repos. (As-is) You can't go back to a previous version of the snapshot. Which also means that checking out older commits can be problematic because you'd

Re: [DISCUSS] Creating an external connector repository

2021-10-18 Thread Chesnay Schepler
I think you're misinterpreting my comment. Independent from the repo split we should only keep the connectors in the Flink project that we actively maintain. The rest we might as well just drop. If some external people are interested in maintaining these connectors then there's nothing stoppin

Re: [DISCUSS] Creating an external connector repository

2021-10-18 Thread Arvid Heise
Hi folks, thanks for joining the discussion. I'd like to give some ideas on how certain concerns are going to be addressed: Ingo: > In general I think breaking up the big repo would be a good move with many > benefits (which you have outlined already). One concern would be how to > proceed with o

Re: [DISCUSS] Creating an external connector repository

2021-10-18 Thread Leonard Xu
Hi, all I understand very well that the maintainers of the community want to move the connector to an external system. Indeed, the development and maintenance of the connector requires a lot of energy, and these do not involve the Flink core framework, which can reduce the maintenance pressure

Avro UTF-8 deserialization on integration DataStreams API -> Table API (with Confluent SchemaRegistry)

2021-10-18 Thread Peter Schrott
Hi there, I have a Kafka topic where the schema of its values is defined by the "MyRecord" record in the following Avro IDL and registered to the Confluent Schema Registry: @namespace("org.example") protocol MyProtocol { record MyRecord { string text; } } The topic is consumed with a

Re: Reset of transient variables in state to default values.

2021-10-18 Thread Arvid Heise
That's what I would try out, but I'm not sure if the statebackend would pick that up. @Yun Tang do you know more? On Mon, Oct 18, 2021 at 9:37 AM Alex Drobinsky wrote: > Hi Arvid, > > It sounds like a good direction, do I need to register my state class with > KryoSerializer , similar to this ?

Re: Removing metrics

2021-10-18 Thread Arvid Heise
Hi Mason, I created FLINK-24574 [1] to track this feature request. I think it's a very valid use case. [1] https://issues.apache.org/jira/browse/FLINK-24574 On Fri, Oct 15, 2021 at 9:58 AM JING ZHANG wrote: > Hi Mason, > I'm afraid there is no way for users to actively > remove/unregister metr

Re: [DISCUSS] Creating an external connector repository

2021-10-18 Thread David Morávek
We are mostly talking about the freedom this would bring to the connector authors, but we still don't have answers for the important topics: - How exactly are we going to maintain the high quality standard of the connectors? - How would the connector release cycle to look like? Is this going to af

Re: EKs FlinkK8sOperator for 1.20

2021-10-18 Thread David Morávek
Hi Dhiru, What is the actual issue / failure that you've encountered when trying to deploy the operator into EKS cluster? In general, if you're running into any specific EKS issues with the operator, I'd say the best approach would be reaching out to its authors / community around it, as we have

Re: [DISCUSS] Creating an external connector repository

2021-10-18 Thread Qingsheng Ren
Thanks for driving this discussion Arvid! I think this will be one giant leap for Flink community. Externalizing connectors would give connector developers more freedom in developing, releasing and maintaining, which can attract more developers for contributing their connectors and expand the Fl

Re: dataStream can not use multiple classloaders

2021-10-18 Thread Arvid Heise
You also must ensure that your SourceFunction is serializable, so it's not enough to just refer to some classloader, you must ensure that you have access to it also after deserialization on the task managers. On Mon, Oct 18, 2021 at 4:24 AM Caizhi Weng wrote: > Hi! > > There is only one classloa

Re: I/O reactor status: STOPPED after moving to elasticsearch7 connector

2021-10-18 Thread Arvid Heise
Hi Oran, could you check if smaller batches improve the situation? On Sat, Oct 16, 2021 at 2:15 AM Oran Shuster wrote: > The cluster is not really overloaded and also couldn't find some ES errors > logs (atleast something that is repeating) > The job IS very busy (100% on the sink and backpress

Re: [External] : Timeout settings for Flink jobs?

2021-10-18 Thread Arvid Heise
Unfortunately, DeserializationSchema#isEndOfStream is only ever supported for KafkaConsumer. It's going to be removed entirely, once we drop the KafkaConsumer. For newer applications, you can use KafkaSource, which allows you to specify an end offset explicitly. On Fri, Oct 15, 2021 at 7:05 PM Fu

Re: [Statefun] Unable to locate the launcher jar

2021-10-18 Thread Igal Shilman
Forgot to include the user mailing list in my previous email. On Fri, Oct 15, 2021 at 12:27 PM Igal Shilman wrote: > Hello, > > Is there a specific reason you are using the 2.x branch? This is quite old > and most importantly it is not compatible with the 3.x branch. > If you are starting a new

Re: Flink 1.11 loses track of event timestamps and watermarks after process function

2021-10-18 Thread Arvid Heise
processFunction will just emit watermarks from upstream as they come. No function/operator in Flink is a black hole w.r.t. watermarks. It's just important to remember that watermark after a network shuffle is always the min of all inputs (ignoring idle inputs). So if any connected upstream part is

Re: Exception: SequenceNumber is treated as a generic type

2021-10-18 Thread Ori Popowski
Got that, thanks. I'll try On Mon, Oct 18, 2021 at 11:50 AM Arvid Heise wrote: > If you submit a fat jar to Flink, it contains the Kinesis connector. Dawid > was suggesting to also add the SequenceNumber to your src with the original > package name such that you effectively overwrite the class o

Re: NoClassDefFoundError if Kafka classes are assemblied in application jar

2021-10-18 Thread Arvid Heise
This looks very odd. How do you create the fat jar? What's your Flink version? I don't think this is a general Flink issue or else no one would be able to read from Kafka at all. On Fri, Oct 15, 2021 at 4:16 AM L. C. Hsieh wrote: > Hi, Flink developers, > > Does anyone encounter the following e

Re: How to refresh topics to ingest with KafkaSource?

2021-10-18 Thread Arvid Heise
Hi Preston, if you still need to set KafkaSubscriber explicitly, could you please create a feature request for that? For now, you probably have to resort to reflection hacks and build against a the non-public KafkaSubscriber. On Fri, Oct 15, 2021 at 4:03 AM Prasanna kumar < prasannakumarram...@gm

Re: Exception: SequenceNumber is treated as a generic type

2021-10-18 Thread Arvid Heise
If you submit a fat jar to Flink, it contains the Kinesis connector. Dawid was suggesting to also add the SequenceNumber to your src with the original package name such that you effectively overwrite the class of Kinesis while creating the fat jar (there should be warning and you should double-chec

Re: Let PubSubSource support changing subscriptions?

2021-10-18 Thread Arvid Heise
Hi Sayuan, I'm not familiar with PubSub and can't assess if that's a valid request or not. Maybe Niels can help as he worked on the last connector feature. In any case, you can create a ticket and even submit a PR if you want once the ticket is assigned to you. Best, Arvid On Thu, Oct 14, 2021

Re: Flink-1.12 Sql on Job two SQL sink control order

2021-10-18 Thread Arvid Heise
Are you running in Batch? Then you probably need to write 2 SQL jobs (or statements). In streaming, the notion of order doesn't make much sense. But maybe I misunderstood your use case. On Thu, Oct 14, 2021 at 11:37 AM Francesco Guardiani < france...@ververica.com> wrote: > I'm not aware of any

Re: How to deserialize Avro enum type in Flink SQL?

2021-10-18 Thread Arvid Heise
Just as an idea for a workaround as Flink apparently expects the enum field to be nullable. record MyEntry { MyEnumType type; <- make that nullable } Of course that is only an option if you are able to change the producer. On Thu, Oct 14, 2021 at 11:17 AM Francesco Guardiani < france...@

Re: [Statefun] Unable to locate the launcher jar

2021-10-18 Thread Arvid Heise
Do you use docker or some standalone cluster? If it's the latter, did you ensure that each cluster node has access to the jars? On Thu, Oct 14, 2021 at 8:40 AM Le Xu wrote: > Hello! > > I was trying to run the python greeter example >

Re: jdbc connector configuration

2021-10-18 Thread Arvid Heise
K8s should not restart a finished job. Are you seeing this? How did you configure the job? On Wed, Oct 13, 2021 at 7:39 AM Qihua Yang wrote: > Hi, > > If I configure batch mode, application will stop after the job is > complete, right? Then k8s will restart the pod and rerun the job. That is > n

Re: Reset of transient variables in state to default values.

2021-10-18 Thread Alex Drobinsky
Hi Arvid, It sounds like a good direction, do I need to register my state class with KryoSerializer , similar to this ? env.getConfig().registerTypeWithKryoSerializer(IPSessionOrganizer.proto.SourceOutput.class, ProtobufSerializer.class); пн, 18 окт. 2021 г. в 10:32, Arvid Heise : > Hi Alex,

Re: Reset of transient variables in state to default values.

2021-10-18 Thread Arvid Heise
Hi Alex, could you also log the identifity hashcode (or something similar) of the related instance? I suspect that it's not the field that is set to null but that you get a clone where the field is null. In that case, you need to add a specific KryoSerializer to initialize it (or just go with a la

Re: offset of TumblingEventTimeWindows

2021-10-18 Thread Arvid Heise
Hi, I'm not quite sure if I understood the question correctly. The second parameter of TumblingEventTimeWindows.of(Time size, Time offset) has an independent time scale of the first parameter. It doesn't matter if your actual windowing time is in minutes or seconds. However, if your size is muc

Re: Any issues with reinterpretAsKeyedStream when scaling partitions?

2021-10-18 Thread JING ZHANG
Hi Dan, > I'm guessing I violate the "The second operator needs to be single-input (i.e. no TwoInputOp nor union() before)" part. I think.you are right. Do you want to remove shuffle of two inputs in your case? If yes, Flink provides support for multiple input operators since 1.11 version. I think