Re: Get Tumbling Window Top-K using SQL

2020-03-01 Thread Jark Wu
Hi Weizheng, You are right. You can use the TopN feature in the blink planner. But note that it doesn't support tumbling-window TopN; it is a TopN without windowing and event time. But you can achieve it by PARTITION BY , the column could be a preprocessed column which represents which window does
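The workaround described above can be sketched as follows, assuming a preprocessed column `window_start` that maps each row's event time to its tumbling window (table and column names here are hypothetical, not from the thread):

```sql
SELECT a, b, c, window_start
FROM (
  SELECT a, b, c, window_start,
         ROW_NUMBER() OVER (PARTITION BY window_start ORDER BY c DESC) AS rownum
  FROM preprocessed_table
)
WHERE rownum <= 3
```

Partitioning by the derived window column makes the TopN restart per window, which approximates a tumbling-window Top-K.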

Question about runtime filter

2020-03-01 Thread faaron zheng
Hi, everyone These days, I am trying to implement runtime filter in flink-1.10 with flink-sql-benchmark, following blink's approach. I mainly change three parts of the Flink code: adding the runtime filter rule; modifying the code gen and the bloom filter; and adding some AggregatedAccumulator methods modeled on Accumulator. Now,
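A runtime filter typically builds a Bloom filter over the build side's join keys and uses it to prune probe-side rows early. A minimal sketch of the underlying structure in plain Python (not the Flink/blink implementation; sizes and hash choices are purely illustrative):

```python
import hashlib

class BloomFilter:
    """Minimal Bloom filter: k hash probes into a bit array of m bits."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, key):
        # Derive k bit positions from a single strong hash of the key.
        digest = hashlib.sha256(str(key).encode()).digest()
        for i in range(self.num_hashes):
            chunk = digest[i * 4:(i + 1) * 4]  # 4 bytes of entropy per probe
            yield int.from_bytes(chunk, "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        # False means definitely absent; True may be a false positive.
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(key))

# Build side: collect join keys, then prune the probe side with the filter.
bf = BloomFilter()
for key in ["a", "b", "c"]:
    bf.add(key)

probe = ["a", "x", "c", "y"]
survivors = [k for k in probe if bf.might_contain(k)]
```

Rows that fail `might_contain` can be dropped before the join; false positives only cost extra work, never correctness.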

Re: Question about runtime filter

2020-03-01 Thread JingsongLee
Hi, Does the runtime filter probe side wait for the runtime filter to be built? Can you check the start times of the build side and the probe side? Best, Jingsong Lee -- From: faaron zheng Send Time: Monday, March 2, 2020 14:55 To: user Subject: Question abou

Re: How JobManager and TaskManager find each other?

2020-03-01 Thread KristoffSC
Thanks for the clarification about NAT. Moving the NAT issue aside for a moment: is the process of sending the "task deployment descriptor" that you mentioned in "Feb 26, 2020; 4:18pm" (specifically, the process of notifying a TaskManager about the IPs of the TaskManagers participating in a job) described somewhere? I'm fam

Re: Flink remote batch execution in dynamic cluster

2020-03-01 Thread Antonio Martínez Carratalá
Thank you Piotrek, I will check those options. I only have a standalone cluster, so any option would need a setup. On Fri, Feb 28, 2020 at 2:12 PM Piotr Nowojski wrote: > Hi, > > I guess it depends on what you already have available in your cluster; > try to use that. Running Flink in existin

Re: [Question] enable end2end Kafka exactly once processing

2020-03-01 Thread Arvid Heise
Hi Eleanore, the Flink runner is maintained by the Beam developers, so it's best to ask on their user list. The documentation is, however, very clear: "Flink runner is one of the runners whose checkpoint semantics are not compatible with current implementation (hope to provide a solution in near

Schema registry deserialization: Kryo NPE error

2020-03-01 Thread Nitish Pant
Hi all, I am trying to work with Flink to get Avro data from Kafka, for which the schemas are stored in the Kafka schema registry. Since the producer for Kafka is a totally different service (an MQTT consumer sinking to Kafka), I can't have the schema with me at the consumer end. I read around and di

Re: Do I need to use Table API or DataStream API to construct sources?

2020-03-01 Thread Benchao Li
Hi kant, The CSV format is an independent module; you need to add it as a dependency: org.apache.flink:flink-csv:${flink.version}. kant kodali wrote on Sunday, March 1, 2020 at 3:43 PM: > > The program finished with the following exception: >
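In the Gradle setup used elsewhere in this thread, that would be a line like the following (`flinkShadowJar` is the configuration name from the thread's build.gradle; the coordinate is the standard flink-csv artifact):

```groovy
dependencies {
    // The CSV format factory lives in its own module and must be added explicitly
    flinkShadowJar "org.apache.flink:flink-csv:${flinkVersion}"
}
```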

Re: Do I need to use Table API or DataStream API to construct sources?

2020-03-01 Thread godfrey he
hi kant, > Also why do I need to convert to DataStream to print the rows of a table? Why not have a print method in the Table itself? Flink 1.10 introduces a utility class named TableUtils to convert a Table to a List; this utility class is mainly used for demonstration or testing and is only applic

Re: Do I need to use Table API or DataStream API to construct sources?

2020-03-01 Thread kant kodali
The dependency was already there. Below is my build.gradle. Also I checked the kafka version and looks like the jar flinkShadowJar "org.apache.flink:flink-connector-kafka_2.11:${flinkVersion}" downloads kafka-clients version 2.2.0. So I changed my code to version 2.2.0 and same problem persists.

Re: Do I need to use Table API or DataStream API to construct sources?

2020-03-01 Thread godfrey he
I think you should use `flink-sql-connector-kafka-0.11_2.11` instead of `flink-connector-kafka_2.11`. Bests, Godfrey kant kodali wrote on Sunday, March 1, 2020 at 5:15 PM: > The dependency was already there. Below is my build.gradle. Also I checked > the kafka version and looks like the jar > > flinkShadowJar "o

Re: Do I need to use Table API or DataStream API to construct sources?

2020-03-01 Thread Benchao Li
I don't know how Gradle works, but in Maven, packaging dependencies into one fat jar requires specifying how SPI property files should be dealt with. Could you check that your final jar contains the correct resource file? godfrey he wrote on Sunday, March 1, 2020 at 5:25 PM: > I think you should use `flink-s
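With the Gradle Shadow plugin, the rough equivalent of Maven's ServicesResourceTransformer is `mergeServiceFiles()`, which merges META-INF/services entries instead of letting one jar's file overwrite another's. A sketch, assuming the shadow plugin is already applied in the build:

```groovy
shadowJar {
    // Merge META-INF/services files so all SPI entries (e.g. TableFactory
    // registrations) from every dependency survive in the fat jar
    mergeServiceFiles()
}
```

Without this, only one connector/format's service file tends to win, which makes factories "disappear" at runtime.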

Flink SQL job failed to submit with enableCheckpointing while SQL contains UDF

2020-03-01 Thread sunfulin
Hi, guys I am running a Flink 1.10 SQL job with an UpsertStreamSink to ElasticSearch. In my SQL logic, I am using a UDF like ts2Date to handle date-format stream fields. However, when I add `env.enableCheckpointing(time)`, my job fails to submit and throws an exception like the following. This is reall

Get Tumbling Window Top-K using SQL

2020-03-01 Thread Lu Weizheng
Hi, I found a question on StackOverflow (https://stackoverflow.com/questions/49191326/flink-stream-sql-order-by) about how to get the Top-K using Flink SQL; it was answered by Fabian back in 2018. The main idea is using a RANK to get the Top K of field 'a': SELECT a, b, c FROM ( SELECT

Re: Flink SQL job failed to submit with enableCheckpointing while SQL contains UDF

2020-03-01 Thread Benchao Li
Hi fulin, It seems like a bug in the code generation. Could you provide us more information? 1. what planner are you using? blink or legacy planner? 2. how do you register your UDF? 3. does this have a relation to checkpointing? what if you enable checkpointing and do not use your udf? and disable

Re:Re: Flink SQL job failed to submit with enableCheckpointing while SQL contains UDF

2020-03-01 Thread sunfulin
Hi, Benchao, Thanks for the reply. Could you provide us more information? 1. what planner are you using? blink or legacy planner? I am using the Blink planner. I have not tested with the legacy planner because my program depends on a lot of new features of the blink planner. 2. how do you register your UDF? Just u

Re: Re: Flink SQL job failed to submit with enableCheckpointing while SQL contains UDF

2020-03-01 Thread Benchao Li
Could you show how your UDF `ts2Date` is implemented? sunfulin wrote on Sunday, March 1, 2020 at 6:05 PM: > Hi, Benchao, > Thanks for the reply. > > Could you provide us more information? > 1. what planner are you using? blink or legacy planner? > I am using the Blink planner. I have not tested with the legacy planner because my pr

Re:Re: Re: Flink SQL job failed to submit with enableCheckpointing while SQL contains UDF

2020-03-01 Thread sunfulin
Below is the code. The function transforms the origin field timeStr "2020-03-01 12:01:00.234" to the target timeStr according to dayTag. public class ts2Date extends ScalarFunction { public ts2Date() { } public String eval(String timeStr, boolean dayTag) { if (timeStr == null) {

Re: Do I need to use Table API or DataStream API to construct sources?

2020-03-01 Thread kant kodali
Hi Benchao, That worked! Pasting the build.gradle file here. However, this only works for 0.11, and it needs zookeeper.connect(), which shouldn't be required. Not sure why it is required in the Flink Kafka connector? If I change the version to 2.2 in the code and specify this jar flinkShadowJar "org.ap

Writing a DataSet to ElasticSearch

2020-03-01 Thread Niels Basjes
Hi, I have a job in Flink 1.10.0 which creates data that I need to write to ElasticSearch. Because it really is a batch (and doing it as a stream keeps giving OOM problems: big + unordered + groupby), I'm trying to do it as a real batch. To write a DataSet to some output (that is not a file), an Ou

Re: Giving useful names to the SQL steps/operators.

2020-03-01 Thread Niels Basjes
Thanks. On Sat, Feb 29, 2020 at 4:20 PM Yuval Itzchakov wrote: > > Unfortunately, it isn't possible. You can't set names to steps like > ordinary Java/Scala functions. > > On Sat, 29 Feb 2020, 17:11 Niels Basjes, wrote: > >> Hi, >> >> I'm playing around with the streaming SQL engine in combinat

Re: Re: Re: Flink SQL job failed to submit with enableCheckpointing while SQL contains UDF

2020-03-01 Thread Benchao Li
The UDF looks good. Could you also paste your DDL? Then we can reproduce your bug easily. sunfulin wrote on Sunday, March 1, 2020 at 6:39 PM: > Below is the code. The function transforms the origin field timeStr "2020-03-01 > 12:01:00.234" to the target timeStr according to dayTag. > > public class ts2Date extends ScalarFunct

Re:Re: Re: Re: Flink SQL job failed to submit with enableCheckpointing while SQL contains UDF

2020-03-01 Thread sunfulin
CREATE TABLE realtime_product_sell ( sor_pty_id varchar, entrust_date varchar, entrust_time varchar, product_code varchar, business_type varchar, balance double, cust_name varchar, open_comp_name varchar, open_comp_id varchar, org_name varchar, org_id varchar, com

Kubernetes Error Code

2020-03-01 Thread Samir Tusharbhai Chauhan
Hi, Does anyone know what the below error code is? Our Flink pod got restarted, and we see the below when we do edit pod. [image attachment: image001.png] Warm Regards, Samir Chauhan

Re: Single stream, two sinks

2020-03-01 Thread miki haiat
So you have a RabbitMQ source and an HTTP sink? If so, you can use a side output in order to dump your data to the DB. https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/side_output.html On Sat, Feb 29, 2020, 23:01 Gadi Katsovich wrote: > Hi, > I'm new to flink and am evaluating it to replace
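The side-output idea can be sketched in plain Python (this is not the Flink DataStream API; the two lists merely stand in for the HTTP and DB sinks): every record goes to the main output, and records matching a predicate are additionally routed to a side output.

```python
def process_with_side_output(stream, side_predicate):
    """Route every record to the main output; records matching the
    predicate also go to a side output (cf. Flink's OutputTag)."""
    main_out, side_out = [], []
    for record in stream:
        main_out.append(record)        # primary sink (e.g. HTTP)
        if side_predicate(record):
            side_out.append(record)    # side sink (e.g. database dump)
    return main_out, side_out

# Every record reaches the main sink; even records also reach the side sink.
main, side = process_with_side_output([1, 2, 3, 4], lambda r: r % 2 == 0)
```

In Flink proper, the same fan-out is done with `ctx.output(outputTag, record)` inside a ProcessFunction, and `getSideOutput(outputTag)` on the resulting stream.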

Re: Exceptions in Web UI do not appear in logs

2020-03-01 Thread orips
Hi, It's version 1.5.2. I actually found the place in the code responsible for it. In the "catch" block, it doesn't log the error and it lets it propagate. https://github.com/apache/flink/blob/62839e88e15b338a8af9afcef698c38a194c592f/flink-connectors/flink-connector-kinesis/src/main/java/org/apa

[ANNOUNCE] Weekly Community Update 2020/09

2020-03-01 Thread Konstantin Knauf
Dear community, happy to share this week's community update. It was a relatively quiet week on the dev@ mailing list (mostly votes on previously covered FLIPs), but there is always something to share. Additionally, I have decided to also feature flink-packages.org in

Re: Do I need to use Table API or DataStream API to construct sources?

2020-03-01 Thread kant kodali
* What went wrong: Could not determine the dependencies of task ':shadowJar'. > Could not resolve all dependencies for configuration ':flinkShadowJar'. > Could not find org.apache.flink:flink-sql-connector-kafka_2.11:universal. Searched in the following locations: - https://repo.mave

[Question] enable end2end Kafka exactly once processing

2020-03-01 Thread Jin Yi
Hi experts, My application is using Apache Beam and with Flink to be the runner. My source and sink are kafka topics, and I am using KafkaIO connector provided by Apache Beam to consume and publish. I am reading through Beam's java doc: https://beam.apache.org/releases/javadoc/2.16.0/org/apache/b

Re: Operator latency metric not working in 1.9.1

2020-03-01 Thread Rafi Aroch
Hi Ori, Make sure that latency tracking is enabled; it's disabled by default. See also that the scope is set properly. https://ci.apache.org/projects/flink/flink-docs-stable/monitoring/metrics.html#latency-tracking https://ci.apache.org/projects/flink/flink-docs-release-1.10/ops/config.html#metric
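Enabling latency tracking boils down to setting a non-zero interval in flink-conf.yaml; the values below are illustrative (the default interval is 0, i.e. disabled):

```yaml
metrics.latency.interval: 30000        # emit latency markers every 30s
metrics.latency.granularity: operator  # one of: single | operator | subtask
```

Note that latency markers add overhead, so a large interval is usually preferable in production.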

Re: Re: Re: Re: Flink SQL job failed to submit with enableCheckpointing while SQL contains UDF

2020-03-01 Thread Benchao Li
Could you also provide us the DDL for lscsp_sc_order_all and dim_app_cust_info? sunfulin wrote on Sunday, March 1, 2020 at 9:22 PM: > > CREATE TABLE realtime_product_sell ( > sor_pty_id varchar, > entrust_date varchar, > entrust_time varchar, > product_code varchar, > business_type varchar

Re: Kubernetes Error Code

2020-03-01 Thread Yang Wang
Hi Samir, I assume that you are running a Flink standalone session/per-job cluster on Kubernetes. Since both of them start the JM/TM process in the foreground, you could use `kubectl logs {pod_name}` to get the logs. The exit code alone is not enough to find the root cause. Best, Yang Samir Tus

Re: [Native Kubernetes] [Ceph S3] Unable to load credentials from service endpoint.

2020-03-01 Thread Yang Wang
Hi Niels, You are right. The S3-related configurations you have set in your `main()` are only applicable on the client side, since the filesystem is initialized in the entrypoint of the JM/TM only once. AFAIK, we could not provide different credentials for each job in the same session cluster. Bes

Re: Timeout error in ZooKeeper

2020-03-01 Thread Yang Wang
Hi Samir. It seems that your zookeeper connection timeout is set to 3000ms, and it did not connect to the server for 14305ms, maybe due to a full GC or a network problem. When it reconnected, the "ConnectionLossException" was thrown. So have you ever changed the zookeeper client related timeout confi
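The Flink options that govern these ZooKeeper client timeouts can be raised in flink-conf.yaml; the values shown are the documented defaults (adjust as needed for your environment):

```yaml
high-availability.zookeeper.client.session-timeout: 60000     # ms
high-availability.zookeeper.client.connection-timeout: 15000  # ms
```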

Re: Get Tumbling Window Top-K using SQL

2020-03-01 Thread Lu Weizheng
Sorry guys, I found the solution on the wiki about Top-N using the Blink planner.

SELECT [column_list]
FROM (
  SELECT [column_list],
    ROW_NUMBER() OVER ([PARTITION BY col1[, col2...]]
      ORDER BY col1 [asc|desc][, col2 [asc|desc]...]) AS rownum
  FROM table_name)
WHERE rownum <= N [AND conditions]

Re: Providing hdfs name node IP for streaming file sink

2020-03-01 Thread Yang Wang
Hi Nick, Certainly you could directly use "namenode:port" as the schema of your HDFS path. Then the hadoop configs (e.g. core-site.xml, hdfs-site.xml) will not be necessary. However, that also means you could not benefit from HDFS high-availability[1]. If your HDFS cluster is HA configured, I stron
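For reference, a minimal hdfs-site.xml HA sketch, using a hypothetical nameservice `ns1` and hostnames (not from this thread); with this in place paths become hdfs://ns1/... and the client fails over between NameNodes automatically:

```xml
<configuration>
  <property><name>dfs.nameservices</name><value>ns1</value></property>
  <property><name>dfs.ha.namenodes.ns1</name><value>nn1,nn2</value></property>
  <property><name>dfs.namenode.rpc-address.ns1.nn1</name><value>namenode1:8020</value></property>
  <property><name>dfs.namenode.rpc-address.ns1.nn2</name><value>namenode2:8020</value></property>
  <property>
    <name>dfs.client.failover.proxy.provider.ns1</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property>
</configuration>
```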

Re:Re: Re: Re: Re: Flink SQL job failed to submit with enableCheckpointing while SQL contains UDF

2020-03-01 Thread sunfulin
create table lscsp_sc_order_all ( amount varchar, argType varchar, balance varchar, branchNo varchar, businessType varchar, channelType varchar, counterOrderNo varchar, counterRegisteredDate varchar, custAsset varchar, customerNumber varchar, customerType varc

Re: Is CSV format supported for Kafka in Flink 1.10?

2020-03-01 Thread Jark Wu
Hi Kant, CSV is supported for Kafka, but you should download the flink-csv SQL jar and load it into the SQL CLI using `--library`, because the CSV format factory is implemented in a separate module and is not bundled by default. [1]: https://repo1.maven.org/maven2/org/apache/flink/flink-csv/1.10.0/flink-csv-1.