Re:Re: Re: Re: Re: Re: Flink SQL job failed to submit with enableCheckpointing while SQL contains UDF

2020-03-02 Thread sunfulin
message, it seems that this issue [1] is similar to yours, but the current compile util does not seem to have this issue. BTW, are you using 1.10? [1] https://issues.apache.org/jira/browse/FLINK-7490 sunfulin wrote on Mon, Mar 2, 2020 at 11:17 AM:

Error handler strategy in Flink Kafka connector with json format

2020-03-08 Thread sunfulin
Hi, community, I am wondering whether there are config params for an error-handling strategy, as [1] refers to, when defining a Kafka stream table using Flink SQL DDL. For example, the following `json.parser.failure.strategy` could be set to `silently skip`, so that malformed dirty data is skipped during processing

Re:Re: Error handler strategy in Flink Kafka connector with json format

2020-03-08 Thread sunfulin
yep. Glad to see the progress. Best At 2020-03-09 12:44:05, "Jingsong Li" wrote: Hi Sunfulin, I think this is very important too. There is an issue to fix this [1]. Does that meet your requirement? [1] https://issues.apache.org/jira/browse/FLINK-15396 Best, Jingsong Lee
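For what it's worth, once FLINK-15396 landed (Flink 1.11+), the JSON format gained an ignore-parse-errors switch. A minimal sketch, assuming the 1.11+ Kafka connector options; the table name, topic, servers, and columns are placeholders, not from this thread:

    import org.apache.flink.table.api.EnvironmentSettings;
    import org.apache.flink.table.api.TableEnvironment;

    public class SkipDirtyJson {
        public static void main(String[] args) {
            TableEnvironment tEnv = TableEnvironment.create(
                    EnvironmentSettings.newInstance().inStreamingMode().build());
            tEnv.executeSql(
                "CREATE TABLE kafka_source (" +
                "  user_id BIGINT," +
                "  recv_time BIGINT" +
                ") WITH (" +
                "  'connector' = 'kafka'," +
                "  'topic' = 'my_topic'," +
                "  'properties.bootstrap.servers' = 'localhost:9092'," +
                "  'format' = 'json'," +
                // malformed records are silently skipped instead of failing the job
                "  'json.ignore-parse-errors' = 'true'" +
                ")");
        }
    }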

Flink SQL Count Distinct performance optimization

2020-01-07 Thread sunfulin
Hi, community, I'm using Apache Flink SQL to build some of my realtime streaming apps. In one scenario I'm trying to count(distinct deviceID) over a roughly 100 GB data set in realtime and sink the aggregated results to an Elasticsearch index. I met a severe performance issue when running my Flink job

Re:Re: Flink SQL Count Distinct performance optimization

2020-01-08 Thread sunfulin
For now I am not able to use the Blink planner in my apps because the current prod environment is not planned or ready to upgrade to Flink 1.9+. At 2020-01-08 15:52:28, "贺小令" wrote: hi sunfulin, you can try with the blink planner (since 1.9+), which optimizes distinct aggregation. you can also try to enable table.optimizer.distinct-agg.split.enabled if the data is skewed.

Re:Re: Flink SQL Count Distinct performance optimization

2020-01-08 Thread sunfulin
2020 at 3:53 PM 贺小令 wrote: hi sunfulin, you can try with the blink planner (since 1.9+), which optimizes distinct aggregation. you can also try to enable table.optimizer.distinct-agg.split.enabled if the data is skewed. best, godfreyhe sunfulin wrote on Wed, Jan 8, 2020 at 3:39 PM: Hi, community, I'm
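For reference, the suggested setup as a minimal sketch on Flink 1.9+/1.10, i.e. the Blink planner plus the split-distinct option named above:

    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.table.api.EnvironmentSettings;
    import org.apache.flink.table.api.java.StreamTableEnvironment;

    public class DistinctAggSetup {
        public static void main(String[] args) {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
            EnvironmentSettings settings = EnvironmentSettings.newInstance()
                    .useBlinkPlanner()
                    .inStreamingMode()
                    .build();
            StreamTableEnvironment tEnv = StreamTableEnvironment.create(env, settings);
            // rewrite COUNT(DISTINCT ...) into a two-level aggregation to spread skewed keys
            tEnv.getConfig().getConfiguration()
                    .setString("table.optimizer.distinct-agg.split.enabled", "true");
        }
    }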

Re:Re: Re: Flink SQL Count Distinct performance optimization

2020-01-08 Thread sunfulin
wrote: hi sunfulin, As Kurt pointed out, if you use the RocksDB state backend, maybe slow disk IO is bounding your job. You can check WindowOperator's latency metric to see how long it takes to process an element. Hope this helps. sunfulin wrote on Wed, Jan 8, 2020 at 4:04 PM:
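A sketch of how those two checks might be wired up; the checkpoint URI is a placeholder, and the RocksDB backend needs the flink-statebackend-rocksdb dependency:

    import org.apache.flink.contrib.streaming.state.RocksDBStateBackend;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class LatencyProbe {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
            // emit latency markers every 5s so per-operator latency metrics are populated
            env.getConfig().setLatencyTrackingInterval(5000L);
            // with RocksDB, state lives on local disk, so slow disk IO shows up
            // directly in the operator latency metric
            env.setStateBackend(new RocksDBStateBackend("hdfs:///flink/checkpoints"));
        }
    }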

Null result cannot be used for atomic types

2020-01-09 Thread sunfulin
Hi, I am running a Flink app that reads Kafka records in JSON format. The connect code is like the following:

    tableEnv.connect(
        new Kafka()
            .version(kafkaInstance.getVersion())
            .topic(chooseKafkaTopic(initPack.clusterMode))
            .pro
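The preview cuts the descriptor chain off; a completed sketch of that shape on 1.10 follows. The topic, servers, and field types are assumptions, not the original code:

    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.table.api.DataTypes;
    import org.apache.flink.table.api.EnvironmentSettings;
    import org.apache.flink.table.api.java.StreamTableEnvironment;
    import org.apache.flink.table.descriptors.Json;
    import org.apache.flink.table.descriptors.Kafka;
    import org.apache.flink.table.descriptors.Schema;

    public class KafkaJsonSource {
        public static void main(String[] args) {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
            StreamTableEnvironment tableEnv = StreamTableEnvironment.create(env,
                    EnvironmentSettings.newInstance().useBlinkPlanner().inStreamingMode().build());
            tableEnv.connect(
                    new Kafka()
                        .version("universal")
                        .topic("my_topic")
                        .property("bootstrap.servers", "localhost:9092"))
                .withFormat(new Json().failOnMissingField(false))
                .withSchema(new Schema()
                    .field("user_id", DataTypes.BIGINT())
                    .field("recv_time", DataTypes.BIGINT()))
                .createTemporaryTable("kafka_source");
        }
    }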

Re:Re: Null result cannot be used for atomic types

2020-01-12 Thread sunfulin
Hi, Thanks for the reply. Turns out that I was using table-to-DataStream conversion and tableEnv.sqlUpdate at the same time, and thus the exception was thrown. My mistake. At 2020-01-10 17:11:02, "Jingsong Li" wrote: Hi sunfulin, Looks like the error happened in the sink instead of the source

Flink built-in functions

2020-02-03 Thread sunfulin
As far as I can see, the latest Flink version does not yet fully support the Blink built-in functions. Many date functions and string functions cannot be used in Flink. I want to know when we will be able to use Flink in the same way as Blink.

Re:Re: Flink built-in functions

2020-02-05 Thread sunfulin
support this kind of function. At 2020-02-04 12:35:12, "Jingsong Li" wrote: Hi Sunfulin, Did you use the blink-planner? What functions are missing? Best, Jingsong Lee On Tue, Feb 4, 2020 at 12:23 PM Wyatt Chun wrote: They are two different systems for differentiated usage. Fo
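Until the gap closes, one workaround is registering a scalar UDF for the missing function. A minimal sketch; the function below is hypothetical, not one of the Blink built-ins:

    import java.text.SimpleDateFormat;
    import java.util.Date;
    import org.apache.flink.table.functions.ScalarFunction;

    // stand-in for a missing date-formatting built-in
    public class MyDateFormat extends ScalarFunction {
        public String eval(Long epochMillis, String pattern) {
            if (epochMillis == null) {
                return null;
            }
            return new SimpleDateFormat(pattern).format(new Date(epochMillis));
        }
    }

    // registration on a 1.10 table environment:
    //   tableEnv.registerFunction("my_date_format", new MyDateFormat());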

Flink DataTypes json parse exception

2020-02-06 Thread sunfulin
Hi, guys When upgrading to Flink 1.10 rc0, I am confused by the new DataTypes schema definition. I am reading and consuming records from kafka with a json schema like {"user_id":1866071598998,"recv_time":1547011832729}, and the code snippet is: .withSchema( new Schema() // eve
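In 1.10 the JSON format derives its parser from the declared schema, and TIMESTAMP(3) expects an ISO-8601 string rather than the epoch millis shown in the sample record. One way around it, sketched under that assumption (a fragment for the .withSchema(...) call, using org.apache.flink.table.descriptors.Schema and org.apache.flink.table.api.DataTypes):

    new Schema()
        .field("user_id", DataTypes.BIGINT())
        // epoch millis arrive as a JSON number, so BIGINT parses cleanly and can be
        // converted downstream, while TIMESTAMP(3) would expect an ISO-8601 string
        .field("recv_time", DataTypes.BIGINT());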

Flink connect hive with hadoop HA

2020-02-10 Thread sunfulin
Hi, guys I am using Flink 1.10 and testing functional cases with Hive integration: Hive 1.1.0-cdh5.3.0, with Hadoop HA enabled. Running the Flink job I can see a successful connection with the Hive metastore, but cannot read table data, with the exception: java.lang.IllegalArgumentException: java.net.Un

Re:Re: Flink connect hive with hadoop HA

2020-02-10 Thread sunfulin
Hi, guys Thanks for the kind reply. Actually I want to know how to change the client-side Hadoop conf while using the Table API within my program. Hoping for some useful suggestions. At 2020-02-11 02:42:31, "Bowen Li" wrote: Hi sunfulin, Sounds like you didn't configure the Hadoop HA correctly
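As far as I can tell, the Hive side is pointed at its conf dir through the HiveCatalog constructor, while the Hadoop HA conf is resolved from the environment rather than the Table API. A sketch, assuming tableEnv is the job's table environment; the paths and Hive version are assumptions for a CDH 5.3 setup:

    import org.apache.flink.table.catalog.hive.HiveCatalog;

    // hive conf dir and hive version are assumptions for this cluster
    HiveCatalog hive = new HiveCatalog("myhive", "default", "/etc/hive/conf", "1.1.0");
    tableEnv.registerCatalog("myhive", hive);
    tableEnv.useCatalog("myhive");
    // the HA nameservice in hdfs-site.xml/core-site.xml is picked up from the
    // client environment, e.g.  export HADOOP_CONF_DIR=/etc/hadoop/conf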

Flink 1.10 es sink exception

2020-02-13 Thread sunfulin
Hi, guys When running a Flink SQL job like the following, I met an exception like "org.apache.flink.table.api.TableException: UpsertStreamTableSink requires that Table has a full primary keys if it is updated". I am using the latest Flink 1.10 release with the Blink planner enabled. Because the same

Re:Flink 1.10 es sink exception

2020-02-13 Thread sunfulin
Can anyone share a little advice on the reason for this exception? When I switch to the old planner, the same SQL runs well. At 2020-02-13 16:07:18, "sunfulin" wrote: Hi, guys When running the same Flink SQL like the following, I met an exception

Re:Re: Flink 1.10 es sink exception

2020-02-14 Thread sunfulin
eventId in ('exposure', 'click') ) as t1 group by aggId, pageId, ts_min I simply run StreamTableEnvironment.sqlUpdate(the above SQL content) and execute the task. Not sure what the root cause is. At 2020-02-14 23:19:14, "Jark Wu" wrote: Hi
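Reconstructing from the fragments in this thread, the query shape looks roughly like the sketch below; identifiers are invented wherever the preview is cut off, and the constant aggId grouping key is what matters for the follow-up message:

    // hypothetical reconstruction, assuming tEnv is the StreamTableEnvironment
    tEnv.sqlUpdate(
        "INSERT INTO es_sink " +
        "SELECT aggId, pageId, ts_min, COUNT(DISTINCT userId) AS uv " +
        "FROM ( " +
        "  SELECT 'ALL' AS aggId, pageId, userId, SUBSTRING(ts, 1, 16) AS ts_min " +
        "  FROM events WHERE eventId IN ('exposure', 'click') " +
        ") AS t1 " +
        "GROUP BY aggId, pageId, ts_min");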

Re:Re: Flink 1.10 es sink exception

2020-02-16 Thread sunfulin
Hi, Wow, really thankful for the tracking and debugging of this problem. I can see the constant-key issue now. I appreciate your kind help :) At 2020-02-15 21:06:58, "Leonard Xu" wrote: Hi, sunfulin I reproduced your case; this should be a bug in extracting the unique key from p

Flink SQL job failed to submit with enableCheckpointing while SQL contains UDF

2020-03-01 Thread sunfulin
Hi, guys I am running a Flink 1.10 SQL job with an UpsertStreamSink to ElasticSearch. In my SQL logic, I am using a UDF like ts2Date to handle date-format stream fields. However, when I add `env.enableCheckpointing(time)`, my job fails to submit and throws an exception like the following. This is really

Re:Re: Flink SQL job failed to submit with enableCheckpointing while SQL contains UDF

2020-03-01 Thread sunfulin
checkpointing? what if you enable checkpointing and do not use your udf? and disable checkpointing and use the udf? sunfulin wrote on Sun, Mar 1, 2020 at 5:41 PM: Hi, guys I am running a Flink 1.10 SQL job with an UpsertStreamSink to ElasticSearch. In my SQL logic, I am using a UDF like ts2Date to handle date-format stream

Re:Re: Re: Flink SQL job failed to submit with enableCheckpointing while SQL contains UDF

2020-03-01 Thread sunfulin
return sf.format(date); } } } At 2020-03-01 18:14:30, "Benchao Li" wrote: Could you show how your UDF `ts2Date` is implemented? sunfulin wrote on Sun, Mar 1, 2020 at 6:05 PM: Hi, Benchao, Thanks for the reply. Could you provide us with more information? 1. what planner are you using?
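A minimal sketch consistent with the `return sf.format(date);` fragment quoted above; the signature and date pattern are assumptions, not the original UDF:

    import java.text.SimpleDateFormat;
    import java.util.Date;
    import org.apache.flink.table.functions.ScalarFunction;

    public class Ts2Date extends ScalarFunction {
        public String eval(long ts) {
            SimpleDateFormat sf = new SimpleDateFormat("yyyy-MM-dd");
            Date date = new Date(ts);
            return sf.format(date);
        }
    }

    // registered as: tableEnv.registerFunction("ts2Date", new Ts2Date());
    // per the thread, the submit failure appeared only after env.enableCheckpointing(...) was added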

Re:Re: Re: Re: Flink SQL job failed to submit with enableCheckpointing while SQL contains UDF

2020-03-01 Thread sunfulin
'n/a', 'connector.bulk-flush.interval' = '1000', 'format.type' = 'json' ) At 2020-03-01 21:08:08, "Benchao Li" wrote: The UDF looks good. Could you also paste your DDL? Then we can reproduce your bug easily.

Re:Re: Re: Re: Re: Flink SQL job failed to submit with enableCheckpointing while SQL contains UDF

2020-03-01 Thread sunfulin
'connector.username' = 'admin', 'connector.password' = 'Windows7', 'connector.lookup.cache.max-rows' = '8000', 'connector.lookup.cache.ttl' = '30min', 'connector.lookup.max-retries' = '3' ) At 202
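Stitching the quoted properties together, the DDL appears to be a 1.10-style JDBC lookup table, roughly like the sketch below; the columns, URL, and credentials are placeholders:

    // hedged reconstruction, assuming tEnv is the StreamTableEnvironment
    tEnv.sqlUpdate(
        "CREATE TABLE dim_table (" +
        "  id BIGINT," +
        "  name STRING" +
        ") WITH (" +
        "  'connector.type' = 'jdbc'," +
        "  'connector.url' = 'jdbc:mysql://localhost:3306/mydb'," +
        "  'connector.table' = 'dim_table'," +
        "  'connector.username' = 'admin'," +
        "  'connector.password' = '******'," +
        "  'connector.lookup.cache.max-rows' = '8000'," +
        "  'connector.lookup.cache.ttl' = '30min'," +
        "  'connector.lookup.max-retries' = '3'" +
        ")");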