Sometimes Counter Metrics getting Stuck and not increasing

2021-05-21 Thread Prasanna kumar
Hi, We are publishing around 200 kinds of events for 15000 customers. Source: Kafka topics; sink: Amazon SNS topic. We are collecting metrics in the following combination: [Event, Consumer, PublishResult] (PublishResult can be either published or error). So the metrics count is on the order of 200*15000*
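For reference, a minimal sketch of how per-(event, consumer, result) counters are typically registered with Flink's metric system; the group values, counter names, and the RichMapFunction wrapper below are illustrative, not taken from the original post:

    import org.apache.flink.api.common.functions.RichMapFunction;
    import org.apache.flink.configuration.Configuration;
    import org.apache.flink.metrics.Counter;

    // Illustrative only: one counter per (event, consumer, result) combination,
    // registered via nested metric groups in open().
    public class PublishMetrics extends RichMapFunction<String, String> {
        private transient Counter published;
        private transient Counter errors;

        @Override
        public void open(Configuration parameters) {
            published = getRuntimeContext().getMetricGroup()
                    .addGroup("event", "order_created")   // one of the ~200 event kinds
                    .addGroup("consumer", "customer_42")  // one of the ~15000 customers
                    .counter("published");
            errors = getRuntimeContext().getMetricGroup()
                    .addGroup("event", "order_created")
                    .addGroup("consumer", "customer_42")
                    .counter("error");
        }

        @Override
        public String map(String value) {
            published.inc(); // or errors.inc() when the publish fails
            return value;
        }
    }

Note that 200 event kinds × 15000 customers × 2 publish results amounts to several million counters, which is a very high cardinality for the metric system and its reporters.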

Data loss when connecting keyed streams

2021-05-21 Thread Alexis Sarda-Espinosa
Hello everyone, I just experienced something weird and I'd like to know if anyone has any idea of what could have happened. I have a simple Flink cluster version 1.11.3 running on Kubernetes with a single worker. I was testing a pipeline that connects 2 keyed streams and processes the result w
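For context, a minimal, hypothetical sketch of the pattern described (two keyed streams connected into one KeyedCoProcessFunction); the data, keys, and logic are invented, and state/timers are omitted:

    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.streaming.api.functions.co.KeyedCoProcessFunction;
    import org.apache.flink.util.Collector;

    public class ConnectKeyedStreamsSketch {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

            DataStream<String> left = env.fromElements("k1,a", "k2,b");   // first keyed input
            DataStream<String> right = env.fromElements("k1,x", "k2,y");  // second keyed input

            left.connect(right)
                // both sides are keyed by the same field so they share keyed state
                .keyBy(v -> v.split(",")[0], v -> v.split(",")[0])
                .process(new KeyedCoProcessFunction<String, String, String, String>() {
                    @Override
                    public void processElement1(String value, Context ctx, Collector<String> out) {
                        out.collect("left: " + value);
                    }

                    @Override
                    public void processElement2(String value, Context ctx, Collector<String> out) {
                        out.collect("right: " + value);
                    }
                })
                .print();

            env.execute("connect-keyed-streams-sketch");
        }
    }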

Re: behavior differences when sinking side outputs to a custom KeyedCoProcessFunction

2021-05-21 Thread Jin Yi
(Sorry that the last sentence fragment made it into my email... it was a draft comment that I forgot to remove. My thoughts are/were complete in the first message.) I do have follow-up questions/thoughts for this thread, though. Given my current setup, it seems it's more expected to have the beha
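Since the thread is about sinking side outputs into a KeyedCoProcessFunction, here is a bare-bones, hypothetical sketch of the side-output wiring itself; the tag name, routing condition, and sample records are invented:

    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.streaming.api.functions.ProcessFunction;
    import org.apache.flink.util.Collector;
    import org.apache.flink.util.OutputTag;

    public class SideOutputSketch {
        // Must be an anonymous subclass so the element type information is retained.
        private static final OutputTag<String> SIDE = new OutputTag<String>("side") {};

        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

            SingleOutputStreamOperator<String> mainStream = env
                    .fromElements("side:a", "main:b")
                    .process(new ProcessFunction<String, String>() {
                        @Override
                        public void processElement(String value, Context ctx, Collector<String> out) {
                            if (value.startsWith("side:")) {
                                ctx.output(SIDE, value);   // routed to the side output
                            } else {
                                out.collect(value);        // routed to the main output
                            }
                        }
                    });

            // The side output can then be keyed and used as one input of a
            // connect(...) / KeyedCoProcessFunction pipeline, as discussed in this thread.
            DataStream<String> side = mainStream.getSideOutput(SIDE);
            side.print();

            env.execute("side-output-sketch");
        }
    }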

Re: Writing ARRAY type through JDBC:PostgreSQL

2021-05-21 Thread Timo Walther
Hi Federico, if ARRAY doesn't work, this is definitely a bug, either in the documentation or in the implementation. I will loop in Jingsong Li, who can help. In any case, feel free to open a JIRA ticket already. Regards, Timo On 30.04.21 14:44, fgahan wrote: Hi Timo, I'm attaching the st

Re: Timestamp type mismatch between Flink, Iceberg, and Avro

2021-05-21 Thread Xingcan Cui
Hi Timo, Thanks for the reply! The document is really helpful. I can solve my current problem with some workarounds. Will keep an eye on this topic! Best, Xingcan On Fri, May 21, 2021, 12:08 Timo Walther wrote: > Hi Xingcan, > > we had a couple of discussions around the timestamp topic in Flin

Re: Timestamp type mismatch between Flink, Iceberg, and Avro

2021-05-21 Thread Timo Walther
Hi Xingcan, we had a couple of discussions around the timestamp topic in Flink and have a clear picture nowadays. Some background: https://docs.google.com/document/d/1gNRww9mZJcHvUDCXklzjFEQGpefsuR_akCDfWsdE35Q/edit# So whenever an instant or epoch time is required, TIMESTAMP_LTZ is the way
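As a small illustration of that recommendation (a sketch, not from the original thread): in the Table API, the instant/epoch semantics correspond to TIMESTAMP WITH LOCAL TIME ZONE (TIMESTAMP_LTZ), which bridges to java.time.Instant:

    import java.time.Instant;
    import org.apache.flink.table.api.DataTypes;
    import org.apache.flink.table.types.DataType;

    public class TimestampLtzSketch {
        public static void main(String[] args) {
            // TIMESTAMP_LTZ(3): millisecond precision; values denote an instant on
            // the time line and are rendered in the local/session time zone. Its
            // default conversion class is java.time.Instant.
            DataType eventTime = DataTypes.TIMESTAMP_WITH_LOCAL_TIME_ZONE(3)
                    .bridgedTo(Instant.class);
            System.out.println(eventTime); // TIMESTAMP(3) WITH LOCAL TIME ZONE
        }
    }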

[ANNOUNCE] Apache Flink 1.12.4 released

2021-05-21 Thread Arvid Heise
Dear all, The Apache Flink community is very happy to announce the release of Apache Flink 1.12.4, which is the fourth bugfix release for the Apache Flink 1.12 series. Apache Flink® is an open-source stream processing framework for distributed, high-performing, always-available, and accurate data

Timestamp type mismatch between Flink, Iceberg, and Avro

2021-05-21 Thread Xingcan Cui
Hi all, Recently, I tried to use Flink to write some Avro data to Iceberg. However, the timestamp representations for these systems really confused me. Here are some facts: - Avro uses `java.time.Instant` for logical type `timestamp_ms`; - Flink takes `java.time.Instant` as table type `T

Re: Is it possible to leverage the sort order in DataStream Batch Execution Mode?

2021-05-21 Thread Dawid Wysakowicz
I am afraid it is not possible to leverage the sorting for business logic. The sorting is applied on the binary representation of the key, as it is not necessarily sorting per se, but rather grouping by the same keys. You can find more information in the FLIP of this feature, e.g. here[1] Best, Dawid [1

Re: Kafka dynamic topic for Sink in SQL

2021-05-21 Thread Timo Walther
Hi Ben, if I remember correctly, this topic came up a couple of times. We haven't implemented it yet, but the existing implementation can easily be adapted for that. The "target topic" would be an additional persisted metadata column in SQL terms. All you need to do is to adapt DynamicKafkaS
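Timo's suggestion concerns the SQL connector internals (DynamicKafkaSerializationSchema plus a persisted metadata column). For comparison only, a hedged sketch of the same idea at the DataStream level, where KafkaSerializationSchema already lets each record choose its target topic; the event class and field names are invented:

    import java.nio.charset.StandardCharsets;
    import org.apache.flink.streaming.connectors.kafka.KafkaSerializationSchema;
    import org.apache.kafka.clients.producer.ProducerRecord;

    // Not the SQL-level solution discussed above, just the DataStream analogue:
    // the topic is read from a field of the record being serialized.
    public class RoutingSerializationSchema
            implements KafkaSerializationSchema<RoutingSerializationSchema.MyEvent> {

        public static class MyEvent {
            public String targetTopic;   // where this particular event should go
            public String payload;
        }

        @Override
        public ProducerRecord<byte[], byte[]> serialize(MyEvent element, Long timestamp) {
            return new ProducerRecord<>(
                    element.targetTopic,
                    element.payload.getBytes(StandardCharsets.UTF_8));
        }
    }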

Re: Unable to deserialize Avro data using Pyflink

2021-05-21 Thread Dian Fu
Hi Zerah, Sorry for the late response. I agree with your analysis. Currently, to be used in the Python DataStream API, we have to provide a Java implementation which could produce Row instead of GenericRecord. As far as I know, there is currently still no built-in DeserializationSchema which could prod
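A rough sketch of the kind of Java wrapper Dian refers to, assuming a simple two-field Avro record; the class name, field names, and schema/registry arguments are placeholders, not an existing Flink class:

    import java.io.IOException;
    import org.apache.avro.Schema;
    import org.apache.avro.generic.GenericRecord;
    import org.apache.flink.api.common.serialization.DeserializationSchema;
    import org.apache.flink.api.common.typeinfo.TypeInformation;
    import org.apache.flink.api.common.typeinfo.Types;
    import org.apache.flink.formats.avro.registry.confluent.ConfluentRegistryAvroDeserializationSchema;
    import org.apache.flink.types.Row;

    // Sketch only: deserializes Confluent Avro bytes to GenericRecord and converts
    // a two-field record ("id": long, "name": string) into a Flink Row, which the
    // Python DataStream API can consume.
    public class ConfluentAvroToRowDeserializationSchema implements DeserializationSchema<Row> {

        private final ConfluentRegistryAvroDeserializationSchema<GenericRecord> inner;

        public ConfluentAvroToRowDeserializationSchema(String avroSchemaJson, String registryUrl) {
            this.inner = ConfluentRegistryAvroDeserializationSchema.forGeneric(
                    new Schema.Parser().parse(avroSchemaJson), registryUrl);
        }

        @Override
        public Row deserialize(byte[] message) throws IOException {
            GenericRecord record = inner.deserialize(message);
            return Row.of(record.get("id"), record.get("name").toString());
        }

        @Override
        public boolean isEndOfStream(Row nextElement) {
            return false;
        }

        @Override
        public TypeInformation<Row> getProducedType() {
            return Types.ROW_NAMED(new String[] {"id", "name"}, Types.LONG, Types.STRING);
        }
    }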

Re: Unable to deserialize Avro data using Pyflink

2021-05-21 Thread Zerah J
Hi Dian, By providing a Python implementation for ConfluentRegistryAvroDeserializationSchema, I could deserialize and print the Confluent Avro data using PyFlink. But since the GenericRecord returned by ConfluentRegistryAvroDeserializationSchema is not supported in PyFlink currently, I cannot per

Re: [Statefun] Truncated Messages in Python workers

2021-05-21 Thread Igal Shilman
Hi again, Something to mention in addition: it could also be the case that StateFun reaches a write timeout trying to write the accumulated batch to the remote function (when the remote functions are overloaded). The requests are retried automatically, but you may still want to bump these timeouts

Re: Fail to cancel perJob for that deregisterApplication is not called

2021-05-21 Thread 刘建刚
Thank you for the answer. I met the same problem again. I added a log in RestServerEndpoint's closeAsync method as follows: @Override public CompletableFuture<Void> closeAsync() { synchronized (lock) { log.info("State is {}. Shutting down rest endpoint.", state); if (state == State.RUNNIN

Re: [Statefun] Truncated Messages in Python workers

2021-05-21 Thread Igal Shilman
Hi Jan, I haven't stumbled upon this but I will try to reconstruct that scenario with a stress test and report back. Can you share a little bit about your environment? For example, do you use gunicorn, nginx, aiohttp, or Flask perhaps? I'd suggest maybe checking for request size limit parameters

Unsubscribe

2021-05-21 Thread 郭斌
Unsubscribe

Re: Issue with using siddhi extension function with flink

2021-05-21 Thread Salva Alcántara
Hi Dipanjan, I agree with Till. If the extensions are included in the jar for your job, it should work. I was having the same doubts some weeks ago and can confirm that as long as the jar includes those extensions, it works. One thing I needed to do was to register the different extensions. F

Is it possible to leverage the sort order in DataStream Batch Execution Mode?

2021-05-21 Thread Marco Villalobos
Hello. I am using Flink 1.12.1 in EMR. I am processing historical time-series data with the DataStream API in Batch execution mode. I must average time-series data into fifteen-minute intervals and forward-fill missing values. For example, this input: name, timestamp, value a,2019-06-23T00:07:
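For reference, a bare-bones sketch of the 15-minute averaging step in DataStream batch execution mode; the forward-fill of missing intervals is left out, and the sample data, class names, and watermark strategy are invented:

    import org.apache.flink.api.common.RuntimeExecutionMode;
    import org.apache.flink.api.common.eventtime.WatermarkStrategy;
    import org.apache.flink.api.common.functions.AggregateFunction;
    import org.apache.flink.api.java.tuple.Tuple2;
    import org.apache.flink.api.java.tuple.Tuple3;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.streaming.api.windowing.assigners.TumblingEventTimeWindows;
    import org.apache.flink.streaming.api.windowing.time.Time;

    public class FifteenMinuteAverageSketch {

        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
            env.setRuntimeMode(RuntimeExecutionMode.BATCH);   // DataStream batch execution mode

            env.fromElements(                                 // (name, epochMillis, value)
                    Tuple3.of("a", 0L, 1.0),
                    Tuple3.of("a", 5 * 60_000L, 3.0),
                    Tuple3.of("b", 2 * 60_000L, 10.0))
                .assignTimestampsAndWatermarks(
                    WatermarkStrategy.<Tuple3<String, Long, Double>>forMonotonousTimestamps()
                        .withTimestampAssigner((e, ts) -> e.f1))
                .keyBy(e -> e.f0)                             // one series per name
                .window(TumblingEventTimeWindows.of(Time.minutes(15)))
                .aggregate(new AverageAggregate())            // mean per 15-minute window
                .print();

            env.execute("fifteen-minute-average-sketch");
        }

        // Accumulator = (sum, count); result = mean of the window.
        public static class AverageAggregate
                implements AggregateFunction<Tuple3<String, Long, Double>, Tuple2<Double, Long>, Double> {
            @Override public Tuple2<Double, Long> createAccumulator() { return Tuple2.of(0.0, 0L); }
            @Override public Tuple2<Double, Long> add(Tuple3<String, Long, Double> v, Tuple2<Double, Long> acc) {
                return Tuple2.of(acc.f0 + v.f2, acc.f1 + 1);
            }
            @Override public Double getResult(Tuple2<Double, Long> acc) {
                return acc.f1 == 0 ? 0.0 : acc.f0 / acc.f1;
            }
            @Override public Tuple2<Double, Long> merge(Tuple2<Double, Long> a, Tuple2<Double, Long> b) {
                return Tuple2.of(a.f0 + b.f0, a.f1 + b.f1);
            }
        }
    }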

Re: Issue with using siddhi extension function with flink

2021-05-21 Thread Till Rohrmann
Hi Dipanjan, Please double check whether the libraries are really contained in the job jar you are submitting because if the library is contained in this jar, then it should be on the classpath and you should be able to load it. Cheers, Till On Thu, May 20, 2021 at 3:43 PM Dipanjan Mazumder wro