Re: Built-in functions to manipulate MULTISET type

2021-09-19 Thread Kai Fu
>>>> AFAIK, there is no built-in function to extract the keys in MULTISET >>>>> <https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/dev/table/types/> >>>>> to >>>>> be an ARRAY. Define a UTF is a good solution. >>>

Built-in functions to manipulate MULTISET type

2021-09-17 Thread Kai Fu
Hi team, We want to know if there is any built-in function to extract the keys in MULTISET to be an ARRAY. There is no such function as far as we can find, except to define a simple wrapper UDF for that, please ad

Inspecting SST state of rocksdb

2021-08-08 Thread Kai Fu
Hi team, I'm trying to inspect SST files of flink's state with sst related tools like sst_dump, ldb in wiki . But it seems I'm getting meaningless results as shown below. The tools I'm using are from RocksDB's trunk and

Re: Regarding state access in UDF

2021-06-30 Thread Kai Fu
it it did. > > [1] > https://ci.apache.org/projects/flink/flink-docs-master/docs/dev/table/functions/udfs/ > [2] > https://ci.apache.org/projects/flink/flink-docs-master/docs/dev/datastream/user_defined_functions/ > > > Ingo > > On Thu, Jul 1, 2021 at 5:17 AM Kai Fu

Re: Regarding state access in UDF

2021-06-30 Thread Kai Fu
derstand it. Thanks! > > > Best > Ingo > > On Wed, Jun 30, 2021 at 3:37 PM Kai Fu wrote: > >> Hi team, >> >> We've a use case that needs to create/access state in UDF, while per the >> documentation >> <https://ci.apache.org/projects

Regarding state access in UDF

2021-06-30 Thread Kai Fu
Hi team, We've a use case that needs to create/access state in UDF, while per the documentation and UDF interface

Re: Questions about UPDATE_BEFORE row kind in ElasticSearch sink

2021-06-29 Thread Kai Fu
( > SELECT word, count(*) as cnt > FROM T > GROUP BY word > ) WHERE cnt < 3; > > There is more discussion in this issue: > https://issues.apache.org/jira/browse/FLINK-9528 > > Best, > Jark > > On Mon, 28 Jun 2021 at 13:52, Kai Fu wrote: > >

Questions about UPDATE_BEFORE row kind in ElasticSearch sink

2021-06-27 Thread Kai Fu
Hi team, We notice the UPDATE_BEFORE row kind is not dropped and treated as DELETE as in code

Re: Elasticsearch sink connector timeout

2021-06-04 Thread Kai Fu
1] 6041 at java.lang.Thread.run(Thread.java:748) [?:1.8.0_272]* On Sat, Jun 5, 2021 at 12:13 PM Kai Fu wrote: > Hi team, > > We encountered an issue about ES sink connector timeout quite frequently. > As checked the ES cluster is far from being loaded(~40% CPU utilization, no > query, index rate

Elasticsearch sink connector timeout

2021-06-04 Thread Kai Fu
Hi team, We encountered an issue about ES sink connector timeout quite frequently. As checked the ES cluster is far from being loaded(~40% CPU utilization, no query, index rate is also low). We're using ES-7 connector, with 12 data nodes and parallelism of 32. The error log is as below, we want t

Re: Flink exported metrics scope configuration

2021-06-04 Thread Kai Fu
>semi-colon (;) separate list of variables that should be ignored by >tag-based reporters (e.g., Prometheus, InfluxDB). > > > > https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/deployment/metric_reporters/#reporter > > Best, > Mason > > On Jun

Flink exported metrics scope configuration

2021-06-03 Thread Kai Fu
Hi team, We noticed that Prometheus metrics exporter exports all of the metrics at the most fine-grained level, which is tremendous for the prometheus server especially when the parallelism is high. The metrics volume crawled from a single host(parallelism 8) is around 40MB for us currently. This

Re: Error about "Rejected TaskExecutor registration at the ResourceManger"

2021-06-01 Thread Kai Fu
. > Would you like provide complete jobmanager.log and taskmanager.log. Maybe > we could find some hints there. > > Best regards, > JING ZHANG > > Kai Fu 于2021年6月2日周三 上午7:23写道: > >> HI Till, >> >> Thank you for your response, per my observation that the pr

Re: Error about "Rejected TaskExecutor registration at the ResourceManger"

2021-06-01 Thread Kai Fu
lot to worry > about. > > Cheers, > Till > > On Sun, May 30, 2021 at 7:36 AM Kai Fu wrote: > >> Hi team, >> >> We encountered an issue during recovery from checkpoint. It's recovering >> because the downstream Kafka sink is full for a while and the

Re: Dynamic configuration of Flink checkpoint interval

2021-05-30 Thread Kai Fu
o encounter the same problem. Thanks very > much. > > Best regards, > JING ZHANG > > Kai Fu 于2021年5月30日周日 下午11:19写道: > >> Hi Jing, >> >> Yup, what you're describing is what I want. I also tried the approach you >> suggested and it works. I'm goi

Re: Dynamic configuration of Flink checkpoint interval

2021-05-30 Thread Kai Fu
e/FLINK>) . > At present, you may would like to use the following temporary solution: > 1. set a bigger value as checkpoint interval, start your job > 2. do a savepoint after cold start is completed > 3. set a normal value as checkpoint interval, restart the job from > savepoint

Dynamic configuration of Flink checkpoint interval

2021-05-30 Thread Kai Fu
Hi team, We want to know if Flink has some dynamic configuration of the checkpoint interval. Our use case has a cold start phase where the entire dataset is replayed from the beginning until the most recent ones. In the cold start phase, the resources are fully utilized and the backpressure is hi

Error about "Rejected TaskExecutor registration at the ResourceManger"

2021-05-29 Thread Kai Fu
Hi team, We encountered an issue during recovery from checkpoint. It's recovering because the downstream Kafka sink is full for a while and the job is failed and keeps trying to recover(The downstream is full for about 4 hours). The job cannot recover from checkpoint successfully even if after we

Re: UniqueKey constraint is lost with multiple sources join in SQL

2021-04-08 Thread Kai Fu
As identified with the community, it's bug and more information in issue https://issues.apache.org/jira/browse/FLINK-22113 On Sat, Apr 3, 2021 at 8:43 PM Kai Fu wrote: > Hi team, > > We have a use case to join multiple data sources to generate a > continuous updated view. We de

Re: Re: Meaning of checkpointStartDelayNanos

2021-04-04 Thread Kai Fu
> Best, > Yun > > > --Original Mail ------ > *Sender:*Kai Fu > *Send Date:*Mon Apr 5 09:31:58 2021 > *Recipients:*user > *Subject:*Re: Meaning of checkpointStartDelayNanos > >> I found its meaning i

Re: Questions about checkpointAlignmentTime in unaligned checkpoint

2021-04-04 Thread Kai Fu
snapshot would compose of > both the operator > snapshots and the snapshots of the skipped buffers. > > Therefore, the *checkpointAlignmentTime* metric still exists. > > Best, > Yun > > > ------Original Mail -- > *Sender:*Kai Fu >

Re: Meaning of checkpointStartDelayNanos

2021-04-04 Thread Kai Fu
to the current operator since it's intiated in the source. On Mon, Apr 5, 2021 at 9:21 AM Kai Fu wrote: > Hi team, > > I'm a little confused by the meaning of *checkpointStartDelayNanos*, I do > not understand what time it exactly means, but it seems it's a quite >

Meaning of checkpointStartDelayNanos

2021-04-04 Thread Kai Fu
Hi team, I'm a little confused by the meaning of *checkpointStartDelayNanos*, I do not understand what time it exactly means, but it seems it's a quite important indicator for checkpoint/backpresure. The explanation of it on metrics page

UniqueKey constraint is lost with multiple sources join in SQL

2021-04-03 Thread Kai Fu
Hi team, We have a use case to join multiple data sources to generate a continuous updated view. We defined primary key constraint on all the input sources and all the keys are the subsets in the join condition. All joins are left join. In our case, the first two inputs can produce *JoinKeyContai