Re: CoProcess() VS union.Process()

2018-02-09 Thread Xingcan Cui
Hi Max, if I understood correctly, instead of joining three streams, you actually performed two separate joins, say S1 JOIN S3 and S2 JOIN S3, right? Your plan "(S1 UNION S2) JOIN S3” seems to be identical with “(S1 JOIN S3) UNION (S2 JOIN S3)” and if that’s what you need, your pipeline should

CoProcess() VS union.Process()

2018-02-09 Thread m@xi
Hello Flinkers, I would like to discuss with you about something that bothers me. So, I have two streams that I want to join along with a third stream which I want to consult its data from time to time and triggers decisions. Essentially, this boils down to coProcessing 3 streams together instead

Share Spring Application context among operators

2018-02-09 Thread Swapnil Dharane
Hello, Is there any way with which I can pass my spring ApplicationContext object as parameter to flink operators? I understand I need to serialize this object.Is there any existing serialization mechanism that I can use? Thanks in advance.

Optimizing multiple aggregate queries on a CEP using Flink

2018-02-09 Thread Sahil Arora
Hi there, We have been working on a project with the title "Optimizing Multiple Aggregate Queries over a Complex Event Processing Engine". The aim is to optimize a group of queries. Take such as* "how many cars passed the post in the past 1 minute" *and* "how many cars passed the post in the past 2

Re: Regarding BucketingSink

2018-02-09 Thread Vishal Santoshi
without --allowNonRestoredState, on a suspend/resume we do see the length file along with the finalized file ( finalized during resume ) -rw-r--r-- 3 root hadoop 10 2018-02-09 13:57 /vishal/sessionscid/dt=2018-02-09/_part-0-28.valid-length that does makes much more sense. I guess we sh

Re: Regarding BucketingSink

2018-02-09 Thread Vishal Santoshi
This is 1.4 BTW. I am not sure that I am reading this correctly but the lifecycle of cancel/resume is 2 steps 1. Cancel job with SP closeCurrentPartFile https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-filesystem/src/main/java/org/apache/flink/streaming/connectors

Regarding BucketingSink

2018-02-09 Thread Vishal Santoshi
What should be the behavior of BucketingSink vis a vis state ( pending , inprogess and finalization ) when we suspend and resume ? So I did this * I had a pipe writing to hdfs suspend and resume using --allowNonRestoredState as in I had added a harmless MapOperator ( stateless ). * I see that

Re: dataset sort

2018-02-09 Thread Fabian Hueske
The reason why this isn't working in Flink are that * a file can only be written by a single process * Flink does not support merging of sorted network partitions but reads round-robin from incoming network channels. I think if you sort the historic data in parallel (without range partitioning, i

Re: [EXTERNAL] Re: Flink REST API

2018-02-09 Thread Raja . Aravapalli
I am issuing a GET call to list running jobs on Flink Session. Another quick question, is there a way to check the port on which my Flink YARN Session is exposing REST API ? Because, I could figure out on UI either in YARN Resource Manager / Flink Web UI of YARN Session the port number I receiv

Re: [EXTERNAL] Re: Flink REST API

2018-02-09 Thread Gary Yao
Hi Raja, Can you tell me the API call that you are trying to issue? If it is not a GET request, it could be that you are suffering from this bug: https://issues.apache.org/jira/browse/YARN-2031 In my case the tracking url shown on the resource manager ui is indeed one that targets the YARN pro

Re: dataset sort

2018-02-09 Thread david westwood
Thanks. I have to stream in the historical data and its out-of-boundedness >> real-time data. I thought there was some elegant way using mapPartition that I wasn't seeing. On Fri, Feb 9, 2018 at 5:10 AM, Fabian Hueske wrote: > You can also partition by range and sort and write each partition. O

Re: [EXTERNAL] Re: Flink REST API

2018-02-09 Thread Raja . Aravapalli
Hi Gary, Thanks a lot. I am able to use REST API now. As you informed, I am able to query REST API, by capturing the tracking-url, I get by using the command “yarn application -list” But, however as I observe in the YARN Resource manager UI, I am not able to query using the tracking url I am

RE: CEP for time series in csv-file

2018-02-09 Thread Esa Heikkinen
Hi Thanks for the hints, but I am still very interested about simple working example with combination: sbt-project, scala, csv-file reading and cep processing. I have did not exactly find something like that. It would help me a lot. It takes lot of time to learn and test many possible code com

Re: Classloader leak with Kafka Mbeans / JMX Reporter

2018-02-09 Thread Edward
I applied the change in the pull request associated with that Kafka bug, and unfortunately it didn't resolve anything. It doesn't unregister any additional MBeans which are created by Kafka's JmxRepository -- it is just a fix to the remove mbeans from Kafka's cache of mbeans (i.e. it is doing clean

Re: Strata San Jose

2018-02-09 Thread ashish pok
Awesome, I will send a note from my work email :)  -- Ashish On Fri, Feb 9, 2018 at 5:12 AM, Fabian Hueske wrote: Hi Ashish, I'll be at Strata San Jose and give two talks. Just ping me and we can meet there :-) Cheers, Fabian 2018-02-09 0:53 GMT+01:00 ashish pok : Wondering if any of

Re: No Job is getting Submitted through Flink 4.0 UI

2018-02-09 Thread Puneet Kinra
I am unable to submit the job in flink from UI any specific port opening is required. On Fri, Feb 9, 2018 at 5:10 PM, Puneet Kinra < puneet.ki...@customercentria.com> wrote: > I am unable to submit the job in flink from UI > > -- > *Cheers * > > *Puneet Kinra* > > *Mobile:+918800167808 | Skype

No Job is getting Submitted through Flink 4.0 UI

2018-02-09 Thread Puneet Kinra
I am unable to submit the job in flink from UI -- *Cheers * *Puneet Kinra* *Mobile:+918800167808 | Skype : puneet.ki...@customercentria.com * *e-mail :puneet.ki...@customercentria.com *

Re: Strata San Jose

2018-02-09 Thread Fabian Hueske
Hi Ashish, I'll be at Strata San Jose and give two talks. Just ping me and we can meet there :-) Cheers, Fabian 2018-02-09 0:53 GMT+01:00 ashish pok : > Wondering if any of the core Flink team members are planning to be at the > conference? It would be great to meet in peson. > > Thanks, > > -

Re: dataset sort

2018-02-09 Thread Fabian Hueske
You can also partition by range and sort and write each partition. Once all partitions have been written to files, you can concatenate the files. As Till said it is not possible to sort in parallel and write in order to a single file. Best, Fabian 2018-02-09 10:35 GMT+01:00 Till Rohrmann : > Hi

Re: Batch Cascade application

2018-02-09 Thread Puneet Kinra
Hi Till I have 2 kind of data a) read the data from database put into the memory and nosql database so have 1 source & custom sink operator Job1 -->Source--->NoSQL Sink-->status b) once the data is updated into the memory i need to run the second job so i am checking the status return by th

Re: Batch Cascade application

2018-02-09 Thread Till Rohrmann
Hi Puneet, without more information about the job you're running (ideally code), it's hard to help you. Cheers, Till On Fri, Feb 9, 2018 at 10:43 AM, Puneet Kinra < puneet.ki...@customercentria.com> wrote: > Hi > > I am working on batch application i which once the data is get loaded into > the

Batch Cascade application

2018-02-09 Thread Puneet Kinra
Hi I am working on batch application i which once the data is get loaded into the Memory second job should only run once first job is finished. boolean contactHistoryLoading=bonusPointBatch.contactHistoryLoading(jsonFileReader,cache); if(contactHistoryLoading) { bonusPointBatch.transcationLoadi

Re: dataset sort

2018-02-09 Thread Till Rohrmann
Hi David, Flink only supports sorting within partitions. Thus, if you want to write out a globally sorted dataset you should set the parallelism to 1 which effectively results in a single partition. Decreasing the parallelism of an operator will cause the individual partitions to lose its sort ord

Re: Developing and running Flink applications in Linux through Windows editors or IDE's ?

2018-02-09 Thread Till Rohrmann
Hi Esa, welcome to the community :-). For the development of Flink it does not really matter how you code. In general, contributors pick what suits their needs best and so should you. Here is a link for general remarks for setting up IntelliJ and Eclipse [1]. [1] https://ci.apache.org/projects/fl

Re: Two issues when deploying Flink on DC/OS

2018-02-09 Thread Till Rohrmann
Hi, "java.io.IOException: Connection reset by peer" is usually thrown if the remote peer terminates the connection. So the interesting bit would be who's requesting static files from Flink. So far we serve the web frontend and the log and stdout files via the StaticFileServerHandler. Maybe it's DC