Re: Move files read by flink

2018-03-09 Thread flinkuser101
Is there any way to do that? I have been searching for way to do that but in vain. -- Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Re: Flink 1.4 SQL API Streaming TableException

2018-03-09 Thread Pavel Ciorba
Bill Lee, Man, you saved me from headbanging :) Thank you! 2018-03-10 0:25 GMT+02:00 杨力 : > To use a field in a table as timestamp, it must be declared as a rowtime > attribute for the table. > > 1) Call env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime). > 2) Call withRowtimeAttribut

Re: Extremely large job serialization produced by union operator

2018-03-09 Thread 杨力
Thank you for your response. It occurs both in a standalone cluster anda a yarn-cluster. I am trying to remove business code and reproduce it with a minimal demo. On Sat, Mar 10, 2018 at 2:27 AM Piotr Nowojski wrote: > Hi, > > Could you provide more details about your queries and setup? Logs cou

Re: Flink 1.4 SQL API Streaming TableException

2018-03-09 Thread 杨力
To use a field in a table as timestamp, it must be declared as a rowtime attribute for the table. 1) Call env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime). 2) Call withRowtimeAttribute on KafkaJsonTableSourceBuilder. Reference: 1. https://ci.apache.org/projects/flink/flink-docs-relea

Re: [DISCUSS] Inverted (child-first) class loading

2018-03-09 Thread Stephan Ewen
@Ken very interesting thought. One for have three options: - forbid duplicate classes - parent first conflict resolution - child first conflict resolution Having number one as the default and let the error message suggest options two and three as options would definitely make users aware of

Implement a sort inside the WindowFunction

2018-03-09 Thread Felipe Gutierrez
Hi all, I have a word count using flink stream and mey reduce transformations is applying a WindowFunction. I would like that this WindowFunction sort the output of the reduce. Is that possible? So I will sort by key the data set inside the window. Thanks for your ideas! Here is my code:

Flink 1.4 SQL API Streaming TableException

2018-03-09 Thread Pavel Ciorba
Hi everyone! I decided to try the Time-windowed join functionality of Flink 1.4+. My SQL query is an exact copy of the example in the documentation, and the program reads and writes from Kafka. I used the example from here: https://ci.apache.org/projects/flink/flink-docs-release-1.4/dev/table/sq

POJO default constructor - how is it used by Flink?

2018-03-09 Thread NEKRASSOV, ALEXEI
Hi, I added a default constructor to the class that represents messages flowing through my Flink job graph - to satisfy Flink POJO requirements. Although I don't call that default constructor explicitly, the logs show that it is called anyway. Why is this happening? In my test, for 77 incoming

Re: [DISCUSS] Inverted (child-first) class loading

2018-03-09 Thread Ken Krugler
I can’t believe I’m suggesting this, but perhaps the Elasticsearch “Hammer of Thor” (aka “jar hell”) approach would be appropriate here. Basically they prevent a program from running if there are duplicate classes on the classpath. This causes headaches when you really need a different version

Re: Extremely large job serialization produced by union operator

2018-03-09 Thread Piotr Nowojski
Hi, Could you provide more details about your queries and setup? Logs could be helpful as well. Piotrek > On 9 Mar 2018, at 11:00, 杨力 wrote: > > I wrote a flink-sql app with following topography. > > KafkaJsonTableSource -> SQL -> toAppendStream -> Map -> JDBCAppendTableSink > KafkaJsonTable

PartitionNotFoundException when restarting from checkpoint

2018-03-09 Thread Seth Wiesman
Hi, We are running Flink 1.4.0 with a yarn deployment on ec2 instances, rocks dB and incremental checkpointing, last night a job failed and became stuck in a restart cycle with a PartitionNotFound. We tried restarting the checkpoint on a fresh Flink session with no luck. Looking through the log

Re: Failing to recover once checkpoint fails

2018-03-09 Thread Vishal Santoshi
Yes. We have not hit the snag in 1.4.0 ( our current version ). Again though this occurs under sustained down time on hadoop and it has been more stable lately :) On Wed, Mar 7, 2018 at 4:09 PM, Stephan Ewen wrote: > The assumption in your previous mail is correct. > > Just to double check: >

Re: "Close()" aborts last transaction in TwoPhaseCommitSinkFunction

2018-03-09 Thread Niels van Kaam
Thank you! I already have a custom source function so adding the hacky solution would not be too much effort. Looking forward to the "proper" solution! Niels On Fri, Mar 9, 2018, 16:00 Piotr Nowojski wrote: > Hi, > > Short answer is: no, at the moment clean shutdown is not implemented for > th

Re: flink sql timed-window join throw "mismatched type" AssertionError on rowtime column

2018-03-09 Thread Hequn Cheng
I think it would be easier if we cast eventTs&r_eventTs as TIMESTAMP and do non-window join. Something like: val sql1 = "select distinct id, cast(eventTs as timestamp) as eventTs, > " + > "count(*) over (partition by id order by eventTs rows" + > " between 100 preceding and current

Re: "Close()" aborts last transaction in TwoPhaseCommitSinkFunction

2018-03-09 Thread Piotr Nowojski
Hi, Short answer is: no, at the moment clean shutdown is not implemented for the streaming, but it’s on our to do list for the future. Hacky answer: you could implement some custom code, that would wait for at least one completed checkpoint after the last input data. But that would require mod

Re: flink sql timed-window join throw "mismatched type" AssertionError on rowtime column

2018-03-09 Thread Timo Walther
Another workaround would be to split the query into two Table API parts. You could do the join, convert into a data stream, and convert into table again. The optimizer does not optimize over DataStream API calls. What also should work is to cast your eventTs to TIMESTAMP as early as possible

Re: flink sql timed-window join throw "mismatched type" AssertionError on rowtime column

2018-03-09 Thread Xingcan Cui
Hi Yan, I think you could try that as a workaround. Don’t forget to follow the DataStreamWindowJoin

Re: Dynamic CEP https://issues.apache.org/jira/browse/FLINK-7129?subTaskView=all

2018-03-09 Thread Vishal Santoshi
No worries, thanks for the update! This will be huge for user facing products and seems eminently doable.. On Fri, Mar 9, 2018, 2:09 AM Dawid Wysakowicz wrote: > Hi, > > Kostas is right, unfortunately I had to stop the work, cause we were > missing BroadcastState. I hope I will get back to this

"Close()" aborts last transaction in TwoPhaseCommitSinkFunction

2018-03-09 Thread Niels van Kaam
Hi, I'm working on a custom implementation of a sink which I would like to use with exactly once semantics. Therefore I have implemented the TwoPhaseCommitSinkFunction class as mentioned in this recent post: https://flink.apache.org/features/2018/03/01/end-to-end-exactly-once-apache-flink.html I

Submiting jobs via UI/Rest API

2018-03-09 Thread eSKa
Hi guys, We were trying to use UI's "Submit new job" functionality (and later REST endpoints for that). There were few problems we found: 1. When we ran job that had additional code done after env execution (or any sink) the code was not executed. E.g. our job was calculating some data, writing it

Re:[DISCUSS] Inverted (child-first) class loading

2018-03-09 Thread mingleizhang
Hi, Stephan It is a great change, keep it and polish it. nice nice nice I think the more fewer NoClassDefFoundError or ClassNotFoundException the user will encounter in the future. But I would like to two question about this functionality. If Im am wrong, please helps me out. Thank you in adv

[DISCUSS] Inverted (child-first) class loading

2018-03-09 Thread Stephan Ewen
Hi all! Flink 1.4 introduces child-first classloading by default, for the application libraries. We added that, because it allows applications to use different versions of many libraries, compared to what Flink uses in its core, or compared to what other dependencies (like Hadoop) pull into the c

Extremely large job serialization produced by union operator

2018-03-09 Thread 杨力
I wrote a flink-sql app with following topography. KafkaJsonTableSource -> SQL -> toAppendStream -> Map -> JDBCAppendTableSink KafkaJsonTableSource -> SQL -> toAppendStream -> Map -> JDBCAppendTableSink ... KafkaJsonTableSource -> SQL -> toAppendStream -> Map -> JDBCAppendTableSink I have a dozen

Re: checkpoint stuck with rocksdb statebackend and s3 filesystem

2018-03-09 Thread Tony Wei
Hi Stefan, We prepared to run it on local SSDs yesterday. However, as I said, the problem just disappeared. Of course we will replace it to local SSDs, but I'm afraid that I might be able to reproduce the same situation for both environments to compare the difference. Best Regards, Tony Wei. 20

Re: checkpoint stuck with rocksdb statebackend and s3 filesystem

2018-03-09 Thread Stefan Richter
Hi, if processing and checkpointing are stuck in RocksDB, this could indeed hint to a problem with your IO system. The comment about using EBS could be important, as it might be a bad idea from a performance point of view to run RocksDB on EBS; did you ever compare against running it on local S

Re: Event time join

2018-03-09 Thread Gytis Žilinskas
Thanks for the answers and discussion both of you. The FLIP mentions that the cases where one stream is much faster than the other one, will not be handled for now either, so I guess it would still not solve our problems. As for the join semantics itself, I think we achieve the same thing with CoP

Re: Emulate Tumbling window in Event Time Space

2018-03-09 Thread Piotr Nowojski
Hi, As Xingcan responded, you could use already built in operator for that. If you really want to implement something on your own (need custom feature? For fun?), you would have to implement some variation of a InternalTimerService from Flink (you can browse the code for an inspiration). On ea