Re: [External] : Re: How to get flink to use POJO serializer when enum is present in POJO class

2022-05-12 Thread Fuyao Li
Hi Tejas, Yes, you can write a TypeInfoFactory for an enum. But I am assuming Flink should be able to recognize enums by default… Anyways, you can do something like this: Types.ENUM(RuleType.class); This will return you a TypeInformation which can be used to construct a TypeInfoFactory. BTW, could you t
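Types.ENUM(RuleType.class) needs Flink on the classpath, but the name-based round-trip that an enum serializer relies on can be sketched in plain Java. This is an illustrative sketch only; RuleType here is a hypothetical stand-in for the enum discussed in the thread, not Flink's actual serializer.

```java
// Hypothetical stand-in for the RuleType enum from the thread.
enum RuleType { FILTER, TRANSFORM }

public class EnumRoundTrip {
    // Enum constants have stable names, which is why a name-based
    // serializer can round-trip them safely across restarts.
    static String serialize(RuleType t) {
        return t.name();
    }

    static RuleType deserialize(String s) {
        return RuleType.valueOf(s);
    }
}
```

Serialization by name (rather than ordinal) also survives reordering of the enum constants, which matters for schema evolution.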

Re: How to get flink to use POJO serializer when enum is present in POJO class

2022-05-12 Thread Tejas B
Hi Weihua, This is the error I am getting : Caused by: org.apache.flink.util.FlinkException: Could not restore operator state backend for CoBroadcastWithNonKeyedOperator_8c5504f305beefca0724b3e55af8ea26_(1/1) from any of the 1 provided restore options. at org.apache.flink.streaming.api.operators.Ba

Re: unsubscribe

2022-05-12 Thread Henry Cai
I did this, but I am still getting emails from the flink user group. On Wed, May 11, 2022 at 6:30 PM yuxia wrote: > To unsubscribe, you can send email to user-unsubscr...@flink.apache.org > with any subject. > > Best regards, > Yuxia > > -- > *From: *"Henry Cai" > *To

Re: [External] : Re: How to define TypeInformation for Flink recursive resolved POJO

2022-05-12 Thread Fuyao Li
Updated the FieldDefinition class inline to avoid confusion. I am just listing a few fields in the class (not all). It all follows the suggested POJO approach. From: Fuyao Li Date: Thursday, May 12, 2022 at 09:46 To: Weihua Hu Cc: User Subject: Re: [External] : Re: How to define TypeInformat

Re: [External] : Re: How to define TypeInformation for Flink recursive resolved POJO

2022-05-12 Thread Fuyao Li
Hi Weihua, I am following all the standards mentioned here. The code structure is listed in the previous email. @Data Class Metadata { @TypeInfo(StringFieldDefinitionMapTypeInfoFactory.class) Map fields; @TypeInfo(StringSetTypeInfoFactory.class) private Set validColumns = new HashSet<

Pulsar/Flink Error: PulsarAdminException$NotFoundException: Topic not exist

2022-05-12 Thread Jason Kania
Hi, I am attempting to upgrade from 1.12.7 to 1.15.0. One of the issues I am encountering is the following exception when attempting to submit a job from the command line: switched from INITIALIZING to FAILED with failure cause:  org.apache.pulsar.client.admin.PulsarAdminException$NotFoundExcep

Re: Community fork of Flink Scala API for Scala 2.12/2.13/3.x

2022-05-12 Thread Jeff Zhang
That's true, the Scala shell is removed from Flink. Fortunately, Apache Zeppelin has its own Scala REPL for Flink. So if Flink can support Scala 2.13, I am wondering whether it is possible to integrate it into the Scala shell so that users can run Flink Scala code in notebooks, like Spark. On Thu, May 12, 20

Re: Community fork of Flink Scala API for Scala 2.12/2.13/3.x

2022-05-12 Thread Roman Grebennikov
Hi, AFAIK scala REPL was removed completely in Flink 1.15 (https://issues.apache.org/jira/browse/FLINK-24360), so there is nothing to cross-build. Roman Grebennikov | g...@dfdx.me On Thu, May 12, 2022, at 14:55, Jeff Zhang wrote: > Great work Roman, do you think it is possible to run in scala

Re: Community fork of Flink Scala API for Scala 2.12/2.13/3.x

2022-05-12 Thread Jeff Zhang
Great work Roman, do you think it is possible to run it in the Scala shell as well? On Thu, May 12, 2022 at 10:43 PM Roman Grebennikov wrote: > Hello, > > As far as I understand the discussions on this mailing list, there are almost no > people maintaining the official Scala API in Apache Flink. Due to some

Community fork of Flink Scala API for Scala 2.12/2.13/3.x

2022-05-12 Thread Roman Grebennikov
Hello, As far as I understand the discussions on this mailing list, there are almost no people maintaining the official Scala API in Apache Flink. Due to some technical complexities it will probably be stuck for a very long time on Scala 2.12 (which is not EOL yet, but quite close to it): * Traversable

Re: How to get flink to use POJO serializer when enum is present in POJO class

2022-05-12 Thread Weihua Hu
Hi, Tejas This code works in my IDEA environment. Could you provide more error info or logs? Best, Weihua > On May 10, 2022, 1:22 PM, Tejas B wrote: > > Hi, > I am trying to get Flink schema evolution to work for me using the POJO > serializer. But I found out that if an enum is present in the POJO then

Re: How to define TypeInformation for Flink recursive resolved POJO

2022-05-12 Thread Weihua Hu
Hi, Fuyao How did you define these classes? There are some requirements for POJOs, as the Flink docs[1] say: The class must be public. It must have a public constructor without arguments (default constructor). All fields are either public or must be accessible through getter and setter functions. F
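The rules quoted above can be illustrated with a minimal sketch; the class and field names here are hypothetical, not the ones from Fuyao's actual job.

```java
// Satisfies Flink's POJO rules: the class is public, it has a public
// no-arg constructor, and every field is either public or reachable
// through a getter/setter pair.
public class SensorReading {
    private String sensorId;  // private, but exposed via getter/setter
    public double value;      // public field is also allowed

    public SensorReading() {} // required default constructor

    public String getSensorId() {
        return sensorId;
    }

    public void setSensorId(String sensorId) {
        this.sensorId = sensorId;
    }
}
```

A class meeting these rules is analyzed by Flink's POJO serializer instead of falling back to Kryo, which is what enables state schema evolution.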

Re: Batching in kinesis sink

2022-05-12 Thread Zain Haider Nemati
Thanks for your response! Much appreciated. On Thu, May 12, 2022 at 5:51 PM Teoh, Hong wrote: > Hi Zain, > > For Flink 1.13, we use the KinesisProducerLibrary. If you are using > aggregation, you can control the maximum size of aggregated records by > configuring the AggregationMaxSize in the pr

Re: Incorrect checkpoint id used when job is recovering

2022-05-12 Thread tao xiao
Forgot to mention: the Flink version is 1.13.2 and we use Kubernetes native mode. On Thu, May 12, 2022 at 9:18 PM tao xiao wrote: > Hi team, > > I met a weird issue when a job tries to recover from JM failure. The > success checkpoint before JM crashed is 41205 > > ``` > > {"log":"2022-05-10 14:5

Incorrect checkpoint id used when job is recovering

2022-05-12 Thread tao xiao
Hi team, I met a weird issue when a job tries to recover from JM failure. The success checkpoint before JM crashed is 41205 ``` {"log":"2022-05-10 14:55:40,663 INFO org.apache.flink.runtime.checkpoint.CheckpointCoordinator[] - Completed checkpoint 41205 for job 0

RE: Batching in kinesis sink

2022-05-12 Thread Teoh, Hong
Hi Zain, For Flink 1.13, we use the KinesisProducerLibrary. If you are using aggregation, you can control the maximum size of aggregated records by configuring the AggregationMaxSize in the producer config when constructing the FlinkKinesisProducer. (See [1] for more docs) producerConfig.put(
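The producer configuration Hong describes can be sketched as a plain Properties object; the 1 MB value below is an assumption chosen to match Kinesis' per-record cap, and the exact KPL defaults should be checked against the docs linked in the thread.

```java
import java.util.Properties;

public class KplConfig {
    public static Properties producerConfig() {
        Properties producerConfig = new Properties();
        // Cap each aggregated record at ~1 MB (value in bytes; illustrative).
        producerConfig.put("AggregationMaxSize", "1048576");
        // Aggregation is on by default in the KPL; shown here for clarity.
        producerConfig.put("AggregationEnabled", "true");
        return producerConfig;
    }
}
```

In a Flink 1.13 job this Properties object would be passed to the FlinkKinesisProducer constructor, which forwards it to the underlying KinesisProducerLibrary.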

Re: How to test Flink SQL UDF with open method?

2022-05-12 Thread zhouhaifengmath
I register my job parameters as Flink global parameters, and I need to get those parameters in the UDF's open method like: I know in the DataStream API there are test harnesses to test user-defined functions, as shown in the docs. So I wonder whether there is a similar way in the Table/SQL API to

Re: http stream as input data source

2022-05-12 Thread Alexander Preuß
Hi Harald, I was previously investigating this topic as well. There are some community efforts for HTTP sources, please have a look at the references below: https://getindata.com/blog/data-enrichment-flink-sql-http-connector-flink-sql-part-one/ https://github.com/getindata/flink-http-connector ht

Re: http stream as input data source

2022-05-12 Thread Xuyang
Hi, there is no HTTP source currently, and you can build a custom data source manually, just like Yuxia said. Yuxia has given you a quick way to build a custom connector with the Table API. But if you want to use the DataStream API to do that, you can refer to here[1]. You can also open an i

Re: How to test Flink SQL UDF with open method?

2022-05-12 Thread Zhanghao Chen
Hi, What kind of parameters do you want to get, Flink global job parameters, or some other parameters? Best, Zhanghao Chen From: zhouhaifengmath Sent: Thursday, May 12, 2022 14:33 To: user@flink.apache.org Subject: How to test Flink SQL UDF with open method? H

Re: Converting from table to stream, following Avro schema

2022-05-12 Thread Dhavan Vaidya
Hey Dian, Though my HTTP call's response is indeed JSON, I needed to serialize data into Avro for Kafka sink. Since you said it supports Row inside Row, I investigated deeper and found out that since Row class *sorts by key names*, I had to strictly follow the order in output_type parameter in `p
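The alphabetical field ordering Dhavan ran into can be mirrored with plain Java string sorting; the field names below are hypothetical, and this only demonstrates the ordering rule, not PyFlink's Row class itself.

```java
import java.util.Arrays;

public class FieldOrder {
    // PyFlink's Row reportedly sorts fields by name, so an output_type
    // must list fields in the same alphabetical order. Plain String
    // sorting reproduces that ordering.
    public static String[] sortedFieldNames(String... names) {
        String[] copy = names.clone();
        Arrays.sort(copy);
        return copy;
    }
}
```

Sorting the declared field names up front, then declaring the Avro/output schema in that order, avoids the silent field-to-value mismatches described above.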

Re: At-least once sinks and their behaviour in a non-failure scenario

2022-05-12 Thread Alexander Preuß
Hi Piotr, You are correct regarding the Savepoint, there should be no duplicates sent to RabbitMQ. Best regards, Alexander On Thu, May 12, 2022 at 11:28 AM Piotr Domagalski wrote: > Hi, > > I'm planning to build a pipeline that is using Kafka source, some stateful > transformation and a Rabbit

Batching in kinesis sink

2022-05-12 Thread Zain Haider Nemati
Hi, I am using a kinesis sink with flink 1.13. The amount of data is in millions and it choke the 1MB cap for kinesis data streams. Is there any way to send data to kinesis sink in batches of less than 1MB? or any other workaround

Re: [Discuss] Creating an Apache Flink slack workspace

2022-05-12 Thread Robert Metzger
+1 on setting up our own Slack instance (PMC owned) +1 for having a separate discussion about setting up a discussion forum (I like the idea of using GH discussions) Besides, we still need to investigate how > http://apache-airflow.slack-archives.org works, I think > a slack of our own can be easi

Re: Incompatible data types while using firehose sink

2022-05-12 Thread yuxia
Firehose implements Sink V2, which was introduced in Flink 1.15. But the method inputStream#sinkTo(xxx) only accepts Sink V1 in Flink 1.13. If you still want to use Firehose in Flink 1.13, I guess you may need to implement a SinkV2Adapter, or to translate Sink V2 into Sink V1 like SinkV1Adapter in

At-least once sinks and their behaviour in a non-failure scenario

2022-05-12 Thread Piotr Domagalski
Hi, I'm planning to build a pipeline that is using a Kafka source, some stateful transformation and a RabbitMQ sink. What I don't yet fully understand is how commonly I should expect the "at-least-once" scenario (i.e. seeing duplicates) on the sink side. The case when things start failing is clear to m
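One common way to tolerate at-least-once duplicates is to make the downstream consumer idempotent, keyed on a message ID. This is a minimal in-memory sketch under the assumption that each message carries a unique ID; it is not RabbitMQ-specific and a production version would need a bounded or persistent seen-set.

```java
import java.util.HashSet;
import java.util.Set;

public class DedupConsumer {
    private final Set<String> seen = new HashSet<>();

    // Returns true only the first time a given message ID is observed,
    // so replayed (at-least-once) deliveries become no-ops.
    public boolean processOnce(String messageId) {
        return seen.add(messageId);
    }
}
```

With this pattern on the consuming side, occasional duplicates emitted by the sink after a failure/restore are harmless rather than something the pipeline must prevent entirely.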

Re: [State Processor API] unable to load the state of a trigger attached to a session window

2022-05-12 Thread Dongwon Kim
Hi Juntao, Thanks a lot for taking a look at this. After a little inspection, I found that elements (window state) are stored > in namespace TimeWindow{start=1,end=11}, in your case, and trigger count > (trigger state) is stored in namespace TimeWindow{start=1,end=15}, but > WindowReaderOperator

Re: [Discuss] Creating an Apache Flink slack workspace

2022-05-12 Thread Jark Wu
Hi, I would +1 to create Apache Flink Slack for the lower barriers to entry as Jingsong mentioned. Besides, we still need to investigate how http://apache-airflow.slack-archives.org works, I think a slack of our own can be easier to set up the archive. Regarding Discourse vs Slack, I think they a

Data loss while writing to File System

2022-05-12 Thread Surendra Lalwani
Hi Team, When I am writing to S3, there are chances of data loss when, say, you have in-progress/pending files and your job fails. In that case the data which was written to in-progress files is lost, as even restoring from a savepoint does not read the data again. Thanks and Reg

Re: Incompatible data types while using firehose sink

2022-05-12 Thread Martijn Visser
Hi, As far as I know, there is no Firehose sink in Flink 1.13, only a Kinesis one [1] Best regards, Martijn Visser https://twitter.com/MartijnVisser82 https://github.com/MartijnVisser [1] https://nightlies.apache.org/flink/flink-docs-release-1.13/docs/connectors/datastream/overview/ On Thu, 1

Re: [State Processor API] unable to load the state of a trigger attached to a session window

2022-05-12 Thread Juntao Hu
Sorry to make the previous mail private. My response reposted here: " After a little inspection, I found that elements (window state) are stored in namespace TimeWindow{start=1,end=11}, in your case, and trigger count (trigger state) is stored in namespace TimeWindow{start=1,end=15}, but WindowRead

Re: Incompatible data types while using firehose sink

2022-05-12 Thread Zain Haider Nemati
Hi, Appreciate your response. My Flink version is 1.13. Is there any other way to sink data to Kinesis without having to update to 1.15? On Thu, May 12, 2022 at 12:25 PM Martijn Visser wrote: > I'm guessing this must be Flink 1.15 since Firehose was added in that > version :) > > On Thu, 12 May 2

Re: Incompatible data types while using firehose sink

2022-05-12 Thread Martijn Visser
I'm guessing this must be Flink 1.15 since Firehose was added in that version :) On Thu, 12 May 2022 at 08:41, yu'an huang wrote: > Hi, > > Your code is working fine in my computer. What is the Flink version you > are using. > > > > > On 12 May 2022, at 3:39 AM, Zain Haider Nemati > wrote: > >

Re: [Discuss] Creating an Apache Flink slack workspace

2022-05-12 Thread Martijn Visser
Hi, I would +1 setting up our own Slack. It will allow us to provide the best experience for those in the community who want to use Slack. More than happy to help with setting up community guidelines on how to use Slack. Best regards, Martijn On Thu, 12 May 2022 at 05:22, Yun Tang wrote: > Hi

Re: http stream as input data source

2022-05-12 Thread yuxia
The quick answer is no. There's no HTTP data stream on hand. You can implement one by yourself. Here[1] is a guide about how to implement user-defined sources & sinks. Btw, there's a jira for an http sink[2] but it is marked as won't fix. [1] https://nightlies.apache.org/flink/flink-docs-master/do
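The fetch step of such a user-defined HTTP source could be sketched with the JDK 11+ HTTP client. The URL is a placeholder, and this deliberately stops short of the Flink source wiring (run loop, checkpointing) covered by the guide referenced above.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.time.Duration;

public class HttpFetchSketch {
    // Builds the GET request a polling HTTP source would issue each cycle.
    // A real source would send it via HttpClient.send(...) inside its run
    // loop and emit each response body downstream as a record.
    public static HttpRequest buildRequest(String url) {
        return HttpRequest.newBuilder()
                .uri(URI.create(url))
                .timeout(Duration.ofSeconds(10))
                .GET()
                .build();
    }
}
```

Keeping request construction separate from the send/emit loop makes the polling logic easy to unit-test without a live endpoint.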