Fwd: Flink custom trigger use case

2021-02-23 Thread Diwakar Jha
Hello, posting again for help. I'm planning to use state TTL but would like to know if there is any other way to do it. I'm using Flink 1.11. Thanks! -- Forwarded message - From: Diwakar Jha Date: Mon, Feb 22, 2021 at 6:28 PM Subject: Flink custom trigger use case To: user He

Re: [Statefun] Dynamic behavior

2021-02-23 Thread Miguel Araújo
Hi Gordon, Igal, Thanks for your replies. PubSub would be a good addition, I have a few scenarios where that would be useful. However, after reading your answers I realized that your proposed solutions (which address the most obvious interpretation of my question) do not necessarily solve my prob

Object and Integer size in RocksDB ValueState

2021-02-23 Thread Maciej Obuchowski
Hey. We have deduplication job that has a large amount of keyed ValueState. We want to decrease state size as much as possible, so we're using ValueState as it's smallest possible Java non-primitive. However, as per https://www.baeldung.com/java-size-of-object (and my measurements) Java Integer ha

Re: WatermarkStrategy.for_bounded_out_of_orderness in table API

2021-02-23 Thread joris.vanagtmaal
Or is this only possible with the data stream api? I tried converting a table to a datastream of rows, but being a noob, finding examples of how to do this has been difficult and not sure how to provide the required RowTypeInfo. -- Sent from: http://apache-flink-user-mailing-list-archive.233605

Re: [Statefun] Dynamic behavior

2021-02-23 Thread Miguel Araújo
Another possibility I am considering is handling this in Flink using a broadcast and adding all the information needed to the event itself. I'm a little concerned about the amount of data that will be serialized and sent on every request though, as I'll need to include information about all availab

Re: WatermarkStrategy.for_bounded_out_of_orderness in table API

2021-02-23 Thread joris.vanagtmaal
I worked out the rowtype input for the conversion to datastream; type_info = Types.ROW_NAMED(['sender', 'stw', 'time'],[Types.STRING(), Types.DOUBLE(), Types.LONG()]) datastream=table_env.to_append_stream(my_table, type_info) But if i try to assign rowtime and watermarks to the datastream and con

Re: [Statefun] Dynamic behavior

2021-02-23 Thread Seth Wiesman
Hey Miguel, What you are describing is exactly what is implemented in this repo. The TransactionManager function acts as an orchestrator to work with the other functions. The repo is structured as an exercise but the full solution exists on the branch `advanced-solution`. https://github.com/verve

Re: Object and Integer size in RocksDB ValueState

2021-02-23 Thread Roman Khachatryan
Hi Maciej, If I understand correctly, you're asking whether ValueState parameterized with Object has the same size as the one with Integer (given that the actual stored objects (integers) are the same). With RocksDB, any state object is serialized first and only then it is stored in MemTable or in

Re: [Statefun] Dynamic behavior

2021-02-23 Thread Miguel Araújo
Hi Seth, Thanks for your comment. I've seen that repository in the past and it was really helpful to "validate" that this was the way to go. I think my question is not being addressed there though: how could one add dynamic behavior to your TransactionManager? In this case, state that is available

Re: BroadcastState dropped when data deleted in Kafka

2021-02-23 Thread Khachatryan Roman
Hi, Deletion of messages in Kafka shouldn't affect Flink state in general. Probably, some operator in your pipeline is re-reading the topic and overwrites the state, dropping what was deleted by Kafka. Could you share the code? Regards, Roman On Tue, Feb 23, 2021 at 7:12 AM bat man wrote: > H

Re: [Statefun] Dynamic behavior

2021-02-23 Thread Seth Wiesman
I don't think there is anything statefun specific here and I would follow Igals advice. Let's say you have a state value called `Behavior` that describes the behavior of an instance. There is a default behavior but any given instance may have a customized behavior. What I would do is the following

Re: Julia API/Interface for Flink

2021-02-23 Thread Khachatryan Roman
Hi, AFAIK there is no direct support for Julia in Flink currently. However, you may try to call Python from Julia using either Statefun Python SDK [1] or PyFlink [2]; or implement a remote Statefun module [3]. [1] https://ci.apache.org/projects/flink/flink-statefun-docs-release-2.2/sdk/python.htm

Re: Install/Run Streaming Anomaly Detection R package in Flink

2021-02-23 Thread Roman Khachatryan
Hi, I'm pulling in Wei Zhong and Xingbo Huang who know PyFlink better. Regards, Roman On Mon, Feb 22, 2021 at 3:01 PM Robert Cullen wrote: > My customer wants us to install this package in our Flink Cluster: > > https://github.com/twitter/AnomalyDetection > > One of our engineers developed a

Re: WatermarkStrategy.for_bounded_out_of_orderness in table API

2021-02-23 Thread Roman Khachatryan
Hi, You can use watermark strategy with bounded out of orderness in DDL, please refer to [1]. [1] https://ci.apache.org/projects/flink/flink-docs-release-1.12/dev/table/sql/create.html#watermark Regards, Roman On Tue, Feb 23, 2021 at 12:48 PM joris.vanagtmaal < joris.vanagtm...@wartsila.com> w

Re: Flink custom trigger use case

2021-02-23 Thread Roman Khachatryan
Hi, I've noticed that you are using an event time window, but the trigger fires based on processing time. You should also register an event time timer (for the window end). So that trigger.onEventTime() will be called. And it's safer to check if the state (firstSeen) value is true, not just exists

Re: Flink custom trigger use case

2021-02-23 Thread Diwakar Jha
Hi Roman, Thanks for your reply! That was a typo, i'm using TumblingProcessingTimeWindows My problem is that i want to stop the first event trigger (per key) except for the first window. right now, my first event is getting triggered in every window. Will setting "*state (firstSeen) value is true

Flink jobs organization and maintainability

2021-02-23 Thread Sweta Kalakuntla
Hi, I am going to have to implement many similar jobs. I need guidance and examples that you may have for organizing them in the Git repository without having to have one repo per job. Thanks, SK --

Does the Kafka source perform retractions on Key?

2021-02-23 Thread Rex Fenley
Hi, I'm concerned about the impacts of Kafka's compactions when sending data between running flink jobs. For example, one job produces retract stream records in sequence of (false, (user_id: 1, state: "california") -- retract (true, (user_id: 1, state: "ohio")) -- append Which is consumed by Kafk

Re: Does the Kafka source perform retractions on Key?

2021-02-23 Thread Rex Fenley
Apologies, forgot to finish. If the Kafka source performs its own retractions of old data on key (user_id) for every append it receives, it should resolve this discrepancy I think. Again, is this true? Anything else I'm missing? Thanks! On Tue, Feb 23, 2021 at 6:12 PM Rex Fenley wrote: > Hi,

Re: Install/Run Streaming Anomaly Detection R package in Flink

2021-02-23 Thread Wei Zhong
Hi Robert, If you do not want to install the library on every machine of the cluster, the Python dependency management API can be used to upload and use the required dependencies to cluster. For this case, I recommend building a portable python environment that contains all the required depen

Window Process function is not getting trigger

2021-02-23 Thread sagar
I have simple flink stream program, where I am using socket as my continuous source I have window size of 2 seconds. Somehow my window process function is not triggering and even if I pass events in any order, flink is not ignoring I can see the output only when I kill my socket , please find the

Re: Window Process function is not getting trigger

2021-02-23 Thread Kezhu Wang
I saw one potential issue. Your timestamp assigner returns timestamp in second resolution while Flink requires millisecond resolution. Best, Kezhu Wang On February 24, 2021 at 11:49:59, sagar (sagarban...@gmail.com) wrote: I have simple flink stream program, where I am using socket as my contin

Re: Window Process function is not getting trigger

2021-02-23 Thread sagar
HI Corrected with below code, but still getting same issue Instant instant = p.getAsOfDateTime().atZone(ZoneId.systemDefault()).toInstant(); long timeInMillis = instant.toEpochMilli(); System.out.println(timeInMillis); return timeInMillis; On Wed, Feb 24, 2021 at 10:34 AM Kezhu Wang wrote: >

Re: Flink custom trigger use case

2021-02-23 Thread Khachatryan Roman
Hi Diwakar, I'm not sure I fully understand your question. If event handling in one window depends on some other windows than TriggerContext.getPartitionedState can not be used. Triggers don't have access to the global state (only to key-window scoped state). If that's what you want then please co

Re: Flink 1.11.2 test cases fail with Scala 2.12.12

2021-02-23 Thread soumoks
Thank you for the response but this error continues to happen with Scala 2.12.7. The app itself continues to compile without errors but the test cases fail with the same error. Seems to be related to https://issues.apache.org/jira/browse/FLINK-12461 I have set the Scala version in pom.xml file a

Re: BroadcastState dropped when data deleted in Kafka

2021-02-23 Thread bat man
Hi, This is my code below - As mentioned earlier the rulesStream us again used in later processing. Below you can see the rulesStream is again connected with the result stream of the first process stream. Do you think this is the reason rules operators state getting overridden when the data in kaf

Re: Datastream Lag Windowing function

2021-02-23 Thread Roman Khachatryan
Hi, I can't see neither wrong nor expected output in your message, can you re-attach it? Could you provide the code of your pipeline including the view creation? Are you using Blink planner (can be chosen by useBlinkPlanner [1])? [1] https://ci.apache.org/projects/flink/flink-docs-release-1.12/d

Re: WatermarkStrategy.for_bounded_out_of_orderness in table API

2021-02-23 Thread Roman Khachatryan
The watermark resolution in Flink is one millisecond [1], so the 1st form essentially doesn't allow out-of-orderness (though the elements with the same timestamp are not considered late in this case). [1] https://ci.apache.org/projects/flink/flink-docs-stable/dev/event_timestamps_watermarks.html

Re: Window Process function is not getting trigger

2021-02-23 Thread sagar
It is fairly simple requirement, if I changed it to PRocessing time it works fine , but not working with event time..help appreciated! On Wed, Feb 24, 2021 at 10:51 AM sagar wrote: > HI > > Corrected with below code, but still getting same issue > > Instant instant = > p.getAsOfDateTime().atZon

Re: Flink 1.11.2 test cases fail with Scala 2.12.12

2021-02-23 Thread Chesnay Schepler
Coud you check your dependency tree for the version of scala-library? On 2/24/2021 7:28 AM, soumoks wrote: Thank you for the response but this error continues to happen with Scala 2.12.7. The app itself continues to compile without errors but the test cases fail with the same error. Seems to be