Re: Registering UDAF in blink batch app

2020-04-14 Thread Timo Walther
Hi Dmytro, table function will be supported in Flink 1.11 with the new type system. Hopefully, we can also support aggregate functions until then. Regards, Timo On 14.04.20 15:33, godfrey he wrote: Hi Dmytro, Currently, TableEnvironment does not support register AggregationFunction and Tab

Re: Checkpoints for kafka source sometimes get 55 GB size (instead of 2 MB) and flink job fails during restoring from such checkpoint

2020-04-14 Thread Timo Walther
Hi Oleg, this sounds indeed like abnormal behavior. Are you sure that these large checkpoints are related to the Kafka consumer only? Are there other operators in the pipeline? Because internally the state kept in a Kafka consumer is pretty minimal and only related to Kafka partition and offs

Re: Flink job didn't restart when a task failed

2020-04-14 Thread Zhu Zhu
Sorry I made a mistake. Even if it's the case I had guessed, you will not get a log "Task {} is already in state FAILED." because that task was already unregistered before trying to update the state to JM. Unfortunately currently we have no log which can be used to prove it. Just to confirm that th

Re: Flink sql Session window

2020-04-14 Thread Timo Walther
Hi, currently we don't provide more flexible windowing semantics in SQL. For this, a programmatic API like the DataStream API is a better fit with custom triggers and other more advanced features. Regards, Timo On 14.04.20 13:31, snack white wrote: Hi, In flink sql session window, is ther

Re: Flink

2020-04-14 Thread Timo Walther
Hi Navneeth, it might be also worth to look into Ververica Plaform for this. The community edition was published recently is free of charge. It provides first class K8s support [1]. There is also a tutorial how to deploy it on EKS [2] (not the most recent one through). Regards, Timo [1]

Re: Setting different idleStateRetentionTime for different queries executed in the same TableEnvironment in Flink 1.10

2020-04-14 Thread Jiahui Jiang
Good to know! Thank you so much for all the responses again :) From: Jark Wu Sent: Tuesday, April 14, 2020 10:51 PM To: godfrey he Cc: Jiahui Jiang ; user Subject: Re: Setting different idleStateRetentionTime for different queries executed in the same TableEnvir

Re: Setting different idleStateRetentionTime for different queries executed in the same TableEnvironment in Flink 1.10

2020-04-14 Thread Jark Wu
Hi Jiahui, Thanks for the inputs. It's a very common scenario to set specific configuration on some dedicate operators (e.g. parallelism, join strategy). And supporting query hints is definitely on our roadmap, but may happen in 1.12. Support state ttl in query hints sounds reasonable to me. Best

Re: sub

2020-04-14 Thread Sivaprasanna
Hi, To subscribe, you have to send a mail to user-subscr...@flink.apache.org On Wed, 15 Apr 2020 at 7:33 AM, lamber-ken wrote: > user@flink.apache.org >

sub

2020-04-14 Thread lamber-ken
user@flink.apache.org

Flink

2020-04-14 Thread Navneeth Krishnan
Hi All, I'm very new to EKS and trying to deploy a flink job in cluster mode. Are there any good documentations on what are the steps to deploy on EKS? >From my understanding, with flink 1.10 running it on EKS will automatically scale up and down with kubernetes integration based on the load. Is

Re: Setting different idleStateRetentionTime for different queries executed in the same TableEnvironment in Flink 1.10

2020-04-14 Thread godfrey he
Hi Jiahui, Thanks for your suggestions. I think we may need more detailed explanation about the behavior change. Regarding to "supporting query configuration using Hints", I think it's a one kind of approach, but we need more discussion. Best, Godfrey Jiahui Jiang 于2020年4月14日周二 下午7:46写道: > Yep

Checkpoints for kafka source sometimes get 55 GB size (instead of 2 MB) and flink job fails during restoring from such checkpoint

2020-04-14 Thread Oleg Vysotsky
Hello, Sometime our flink job starts creating large checkpoints which include 55 Gb (instead of 2 MB) related to kafka source. After the flink job creates first “abnormal” checkpoint all next checkpoints are “abnormal” as well. Flink job can’t be restored from such checkpoint. Restoring from th

Re: Flink job didn't restart when a task failed

2020-04-14 Thread Hanson, Bruce
Hi Zhu Zhu (and Till), Thanks for your thoughts on this problem. I do not see a message like the one you mention "Task {} is already in state FAILED." I have attached a file with all the task manager logs that we received at the time this happened. As you see, there aren’t many. We turned on de

Re: Question about Writing Incremental Graph Algorithms using Apache Flink Gelly

2020-04-14 Thread Kaan Sancak
Thanks for the useful information! It seems like a good and fun idea to experiment. I will definitely give it a try. I have a very close upcoming deadline and I have already implemented the Scatter-Gather iteration algorithm. I have another question on whether we can chain Scatter-Gather or Ve

Processing Message after emitting to Sink

2020-04-14 Thread KristoffSC
Hi all, I have a special use case that I'm not sure how I can fulfill. The use case is: I have my main business processing pipe line that has a MQ source, processFunction1, processFunction2 and MQ sink PocessFunction1 apart from processing the main business message is also emitting some side eff

Re: Upgrading Flink

2020-04-14 Thread Chesnay Schepler
The only guarantee that Flink provides is that any /jar/ working against Public API's will continue to work without recompilation. There are no compatibility guarantees between clients<->server of different versions. On 14/04/2020 20:02, David Anderson wrote: @Chesnay Flink doesn't seem to gu

Re: [Stateful Functions] Using Flink CEP

2020-04-14 Thread Oytun Tez
Hi Igal, I have use cases such as "when a translator translates 10 words within 30 seconds". Typically, it is beautiful to express these with CEP. Yet, these are exploration questions where I try to replicate our Flink application in Statefun. Rephrasing problems better might be what's needed to

Re: [Stateful Functions] Using Flink CEP

2020-04-14 Thread Oytun Tez
Hi Gordon, Getting a little closer to Flink API could be helpful here with integration. DataStreams as ingress/egress would be AMAZING. Deploying regular Flink API code and statefun together as a single job is also something I will explore soon. With CEP, I simply want to keep a Function-specific

Re: Upgrading Flink

2020-04-14 Thread David Anderson
@Chesnay Flink doesn't seem to guarantee client-jobmanager compability, even for bug-fix releases. For example, some jobs compiled with 1.9.0 don't work with a cluster running 1.9.2. See https://github.com/ververica/sql-training/issues/8#issuecomment-590966210 for an example of a case when recompil

post-checkpoint watermark out of sync with event stream?

2020-04-14 Thread Cliff Resnick
We have an event-time pipeline that uses a ProcessFunction to accept events with an allowed lateness of a number of days. We a BoundedOutOfOrdernessTimestampExtractor and our event stream has a long tail that occasionally exceeds our allowed lateness, in which case we drop the events. The logic is

Re: [1.10.0] flink-dist source jar is empty

2020-04-14 Thread Steven Wu
Chesnay, sorry it was my mistake. yes, we did have a local change of for the shade plugin that I missed when porting local changes from 1.9 to 1.10. true On Tue, Apr 14, 2020 at 6:29 AM Chesnay Schepler wrote: > I just built the 1.8 and 1.9 flink-dist jars and neither contain the > sources of a

Re: Registering UDAF in blink batch app

2020-04-14 Thread godfrey he
Hi Dmytro, Currently, TableEnvironment does not support register AggregationFunction and TableFunction, because type extractor has not been unified for Java and Scala. One approach is we can use "TableEnvironment#createFunction" which will register UDF to catalog. I find "createTemporarySystemFun

Re: [1.10.0] flink-dist source jar is empty

2020-04-14 Thread Chesnay Schepler
I just built the 1.8 and 1.9 flink-dist jars and neither contain the sources of any bundled modules. How were you building the jars, and were you making any modifications to the Flink source? On 14/04/2020 15:07, Steven Wu wrote: flink-dist is a uber/shadow jar. before 1.10, its source jar co

Re: [1.10.0] flink-dist source jar is empty

2020-04-14 Thread Steven Wu
flink-dist is a uber/shadow jar. before 1.10, its source jar contains the source files for the flink modules that it bundles. On Tue, Apr 14, 2020 at 1:34 AM Chesnay Schepler wrote: > That should not be a problem since the flink-dist module does not > contain any java sources > > On 14/04/2020 0

Re: Question about Writing Incremental Graph Algorithms using Apache Flink Gelly

2020-04-14 Thread Flavio Pompermaier
>From what I see Gelly is not really maintained or used anymore..do you think it could make sense to deprecate it and write a guide (on the documentation) about how to rewrite a Gelly app into a Statefun one? On Tue, Apr 14, 2020 at 5:16 AM Tzu-Li (Gordon) Tai wrote: > Hi, > > As you mentioned,

Re: Upgrading Flink

2020-04-14 Thread Sivaprasanna
Ideally if the underlying cluster where the job is being deployed changes (1.8.x to 1.10.x ), it is better to update your project dependencies to the new version (1.10.x), and hence you need to recompile the jobs. On Tue, Apr 14, 2020 at 3:29 PM Chesnay Schepler wrote: > @Robert Why would he hav

Flink sql Session window

2020-04-14 Thread snack white
Hi, In flink sql session window, is there a way to finish a session window except of session gap ? ex. Session window size reach a limit. Thanks, white

Objects with fields that are not serializable

2020-04-14 Thread Dominik Wosiński
Hey, I have a question about using classes with fields that are not serializable in DataStream. Basically, I would like to use the Java's Optional in DataStream. So Say I have a class *Data *that has several optional fields and I would like to have *DataStream*. I don't think this should cause any

Re: Registering UDAF in blink batch app

2020-04-14 Thread Zhenghua Gao
`StreamTableEnvironment.create()` yields a `StreamTableEnvironmentImpl` object, which has several `registerFunction` interface for ScalarFunction/TableFunction/AggregateFunction/TableAggregateFunction. `TableEnvironment.create()` yields a `TableEnvironmentImpl` object, which is a unify entry point

Re: Upgrading Flink

2020-04-14 Thread Chesnay Schepler
@Robert Why would he have to recompile the jobs? Shouldn't he be fine soo long as he isn't using any API for which we broke binary-compatibility? On 09/04/2020 09:55, Robert Metzger wrote: Hey Stephen, 1. You should be able to migrate from 1.8 to 1.10: https://ci.apache.org/projects/flink/fli

Registering UDAF in blink batch app

2020-04-14 Thread Dmytro Dragan
Hi All, Could you please tell how to register custom Aggregation function in blink batch app? In case of streaming mode: We create EnvironmentSettings bsSettings = EnvironmentSettings.newInstance().useBlinkPlanner().inStreamingMode().build(); StreamTableEnvironment tableEnv = StreamTableEnviron

Re: Possible memory leak in JobManager (Flink 1.10.0)?

2020-04-14 Thread Marc LEGER
Hello, Actually, I agree I do not need to have such an aggressive checkpoint period for my jobs, so I increased the checkpoint period from 1 to 10s and JobManager memory consumption is now quite stable for 3 days in my Flink 1.10.0 cluster. Thanks a lot for your help. Best regards, Marc Le ven.

Re: [Stateful Functions] Using Flink CEP

2020-04-14 Thread Igal Shilman
Hi, I'm not familiar with the other library that you have mentioned, and indeed using Flink CEP from within a stateful function is not possible within a single Flink job, as Gordon mentioned. I'm wondering what aspects of CEP are you interested in? Because essentially a stateful function can be

Re: Can I use Apache-Flink for Android API-Level < 26?

2020-04-14 Thread Chesnay Schepler
I agree with your conclusion that you cannot use Flink on an API Level below 26. I do not know whether it will work even with Level 26 though, as I'm not aware of anyone having tried it. On 14/04/2020 11:03, Alexander Borgschulze wrote: I am trying to use Apache-Flink in my Android-Project wi

Re: [Stateful Functions] Using statefun for E2E testing

2020-04-14 Thread Igal Shilman
Hi, I'm glad to hear that your PoC with StateFun functions has turned out to be successful, even if it is for verifying external systems are integrating with each other correctly. I hope that eventually StateFun would replace the 3 external systems :-) Good luck, Igal. On Fri, Apr 10, 2020 at 3

Re: Javadocs Broken?

2020-04-14 Thread Chesnay Schepler
I'm looking into it. On 10/04/2020 11:27, tison wrote: Hi guys, Right now when I click "JavaDocs" in out docsite[1] it jumps to a page[2] I think is definitely not out api documentation. Any thoughts? Best, tison. [1] https://ci.apache.org/projects/flink/flink-docs-master/ [2] https://ci.ap

Can I use Apache-Flink for Android API-Level < 26?

2020-04-14 Thread Alexander Borgschulze
I am trying to use Apache-Flink in my Android-Project with "minSdkVersion 24". Unfortunately, the following code causes an error:           val env: StreamExecutionEnvironment = LocalStreamEnvironment.getExecutionEnvironment()     env.streamTimeCharacteristic = TimeCharacteristic.ProcessingTime

Re: [1.10.0] flink-dist source jar is empty

2020-04-14 Thread Chesnay Schepler
That should not be a problem since the flink-dist module does not contain any java sources On 14/04/2020 06:42, Steven Wu wrote: We build and publish flink-dist locally. But the source jar turns out empty. Other source jars (like flink-core) are good. Anyone else experienced similar problem?

Re: Setting different idleStateRetentionTime for different queries executed in the same TableEnvironment in Flink 1.10

2020-04-14 Thread godfrey he
Hi Jiahui, I think this is the problem of multiple sinks optimization. If we optimize each sink eager (that means we optimize the query when we call `writeToSink` or `insertInto`), `TableConfig#setIdleStateRetentionTime` is functionally equivalent to QueryConfig. which require we need call `Table