TumblingProcessingTimeWindow emits extra results for a same window

2018-07-12 Thread Yuan,Youjun
Hi community, I have a job which counts the number of events every 2 minutes, using a TumblingWindow in ProcessingTime. However, it occasionally produces extra, duplicated records. For instance, for timestamp 153136848 below, it emits a normal result (cnt=1641161), followed by a few more recor
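
For readers unfamiliar with the setup described above, here is a minimal sketch of how a 2-minute tumbling window is expected to bucket events: each timestamp is rounded down to a multiple of the window size, so every window start should yield exactly one count. This is not Flink's code; it assumes epoch timestamps in milliseconds and a zero window offset, and the names `window_start` and `count_per_window` are hypothetical.

```python
from collections import Counter

WINDOW_MS = 2 * 60 * 1000  # assumed window size: 2 minutes, in milliseconds


def window_start(ts_ms: int, size_ms: int = WINDOW_MS) -> int:
    """Round a millisecond timestamp down to the start of its tumbling window."""
    return ts_ms - (ts_ms % size_ms)


def count_per_window(timestamps):
    """Count events per window start; each start appears once as a key."""
    return Counter(window_start(ts) for ts in timestamps)
```

Under these assumptions, seeing several results for the same window start means something other than the bucketing arithmetic is producing the extra records.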

Re: TumblingProcessingTimeWindow emits extra results for a same window

2018-07-12 Thread Yuan,Youjun
'2' MINUTE), userId thanks Youjun From: Timo Walther Sent: Thursday, July 12, 2018 5:02 PM To: user@flink.apache.org Subject: Re: TumblingProcessingTimeWindow emits extra results for a same window Hi Yuan, this sounds indeed weird. The SQL API uses regular DataStream API wind

Re: Re: TumblingProcessingTimeWindow emits extra results for a same window

2018-07-12 Thread Yuan,Youjun
:"user01","min_ts":1531447919981,"max_ts":1531448159975} {"timestamp":1531448160000,"cnt":3278178,"userId":"user01","min_ts":1531448159098,"max_ts":1531448399977} {"timestamp":153144816,"cnt"

Re: Re: Re: TumblingProcessingTimeWindow emits extra results for a same window

2018-07-15 Thread Yuan,Youjun
: Yuan,Youjun Cc: Timo Walther ; user@flink.apache.org Subject: Re: Re: Re: TumblingProcessingTimeWindow emits extra results for a same window Hi Youjun, The rowtime value in the UDF EXTRACT(EPOCH FROM rowtime) is different from the rowtime value of the window. The SQL will be parsed and translated into some nodes

Best way to find the current alive jobmanager with HA mode zookeeper

2018-07-24 Thread Yuan,Youjun
Hi all, I have a standalone cluster with 3 jobmanagers, and set high-availability to zookeeper. Our client submits jobs via the REST API (POST /jars/:jarid/run), which means we need to know the host of any of the currently alive jobmanagers. The problem is: how can we know which job manager is

Re: Best way to find the current alive jobmanager with HA mode zookeeper

2018-07-25 Thread Yuan,Youjun
ent@v1.5> on my client side, to retrieve the leader JM of Flink v1.4 Cluster. Thanks Youjun From: vino yang Sent: Wednesday, July 25, 2018 7:11 PM To: Martin Eden Cc: Yuan,Youjun ; user@flink.apache.org Subject: Re: Best way to find the current alive jobmanager with HA mode zookeeper Hi Martin,

jobmanager holds too many CLOSE_WAIT connection to datanode

2018-08-23 Thread Yuan,Youjun
Hi, After running for a while, my job manager holds thousands of CLOSE_WAIT TCP connections to the HDFS datanode; the number is growing slowly and will likely hit the max open file limit. My jobs checkpoint to HDFS every minute. If I run lsof -i -a -p $JMPID, I get tons of the following o
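
To track whether the number of CLOSE_WAIT connections is really growing, one option is to count them in the captured output of the `lsof -i -a -p $JMPID` command mentioned above. The sketch below is only an illustration: it assumes the usual `lsof -i` layout in which a TCP line ends with the connection state in parentheses, and the helper name `count_close_wait` is hypothetical.

```python
def count_close_wait(lsof_output: str) -> int:
    """Count lines of `lsof -i` output whose TCP state is CLOSE_WAIT."""
    return sum(
        1
        for line in lsof_output.splitlines()
        # lsof prints the TCP state last, e.g. "... TCP a:1->b:2 (CLOSE_WAIT)"
        if line.rstrip().endswith("(CLOSE_WAIT)")
    )
```

Running it on output captured at intervals, for example once per checkpoint, would show how fast the count rises.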

Re: jobmanager holds too many CLOSE_WAIT connection to datanode

2018-08-24 Thread Yuan,Youjun
One more safer approach is to execute cancel with savepoint on all jobs first >> this sounds great! Thanks Youjun From: vino yang Sent: Friday, August 24, 2018 2:43 PM To: Yuan,Youjun ; user Subject: Re: jobmanager holds too many CLOSE_WAIT connection to datanode Hi Youjun, You can see if

Conversion to relational algebra failed to preserve datatypes

2018-09-14 Thread Yuan,Youjun
Hi, I am getting the following error while submitting a job to a cluster. It seems to fail when comparing 2 RelDataTypes, though they seem identical (from the error message), and everything is OK if I run it locally. I guess Calcite failed to compare the first field named ts, of type TIMESTAMP(3),

Re: Conversion to relational algebra failed to preserve datatypes

2018-09-14 Thread Yuan,Youjun
[1] https://ci.apache.org/projects/flink/flink-docs-master/dev/table/streaming.html#time-attributes On 14.09.18 at 10:49, Yuan,Youjun wrote: Hi, I am getting the following error while submitting a job to a cluster. It seems to fail when comparing 2 RelDataTypes, though they seem identical (from