Re: Flink vs Spark streaming benchmark

2017-12-17 Thread G.S.Vijay Raajaa
s://data-artisans.com/blog/curious-case-broken-benchmark-revisiting-apache-flink-vs-databricks-runtime > > 2017-11-13 11:44 GMT+01:00 G.S.Vijay Raajaa : > >> Hi Guys, >> >> I have been using Flink for quite sometime now and recently I hit upon a >> benchmark resul

Flink vs Spark streaming benchmark

2017-11-13 Thread G.S.Vijay Raajaa
Hi Guys, I have been using Flink for quite some time now, and recently I came across a benchmark result published by Databricks. Would love to hear your thoughts - https://databricks.com/blog/2017/10/11/benchmarking-structured-streaming-on-databricks-runtime-against-state-of-the-art-streamin

Custom sliding window

2017-10-04 Thread G.S.Vijay Raajaa
Hi, I would like to implement a custom time-based sliding window. The idea is that the window is 1 hour in size and slides every 5 seconds. I would like to run the window function once 10 records have accumulated in the window, until the window reaches 1 hour; after that, since it slides every 5 seconds, the w
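
To make the sizes in this question concrete: with a 1 hour window sliding every 5 seconds, each record falls into 3600 / 5 = 720 overlapping windows. The sketch below is plain Java arithmetic only and uses no Flink classes; the class and method names are hypothetical, and it assumes windows are aligned to multiples of the slide with no offset.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: plain arithmetic, not part of any Flink API.
class SlidingWindowStarts {

    /** Start times (ms) of every sliding window [start, start + size) that contains the timestamp. */
    static List<Long> windowStarts(long timestampMs, long sizeMs, long slideMs) {
        List<Long> starts = new ArrayList<>();
        // Latest window start at or before the timestamp, aligned to a multiple of the slide.
        long lastStart = timestampMs - Math.floorMod(timestampMs, slideMs);
        // Walk backwards while the window still covers the timestamp.
        for (long start = lastStart; start > timestampMs - sizeMs; start -= slideMs) {
            starts.add(start);
        }
        return starts;
    }

    public static void main(String[] args) {
        long size = 60 * 60 * 1000L; // 1 hour
        long slide = 5 * 1000L;      // 5 seconds
        List<Long> starts = windowStarts(7_201_234L, size, slide);
        System.out.println("windows containing the record: " + starts.size());
        System.out.println("most recent window start: " + starts.get(0));
    }
}
```

Because every record belongs to this many windows, any per-window work (such as a count-based check for 10 records) is repeated for each of them, which is worth keeping in mind when choosing the slide.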

Re: AW: Is watermark used by joining two streams

2017-07-31 Thread G.S.Vijay Raajaa
r will be the minimum time of both streams (time of the >>> "slower" watermark). >>> >>> @Vijay: I did not understand what your requirements are. Do you want to >>> join or merge streams? Those are two different things. This thread >>> discusse
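
The rule quoted in this reply (the combined watermark is the minimum over the input streams, i.e. the "slower" one) can be sketched in plain Java. This is only an illustration of the rule, not Flink's internal code; the class name and the use of Long.MIN_VALUE for "no watermark seen yet" are assumptions.

```java
import java.util.Arrays;

// Hypothetical illustration of the "minimum of all inputs" watermark rule.
class CombinedWatermark {
    private final long[] inputWatermarks;

    CombinedWatermark(int numInputs) {
        inputWatermarks = new long[numInputs];
        // Assumption: an input that has not sent a watermark yet holds everything back.
        Arrays.fill(inputWatermarks, Long.MIN_VALUE);
    }

    /** Records a watermark from one input and returns the resulting combined watermark. */
    long onWatermark(int input, long watermark) {
        // A single input's watermark never moves backwards.
        inputWatermarks[input] = Math.max(inputWatermarks[input], watermark);
        // The combined watermark follows the slowest input.
        return Arrays.stream(inputWatermarks).min().getAsLong();
    }

    public static void main(String[] args) {
        CombinedWatermark combined = new CombinedWatermark(2);
        combined.onWatermark(0, 100L);
        System.out.println(combined.onWatermark(1, 40L));
        System.out.println(combined.onWatermark(1, 250L));
    }
}
```

In this sketch a delayed input keeps the combined watermark low, which is why one slow stream delays time-based results for the whole union or join.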

Re: AW: Is watermark used by joining two streams

2017-07-31 Thread G.S.Vijay Raajaa
read > discusses joins not merging. > > Best, > Fabian > > 2017-07-31 4:24 GMT+02:00 G.S.Vijay Raajaa : > >> Hi Fabian, >> >> How do I order by the merge time. Let's say I merge the stream at T1. I >> wanted to drop T2 merge if T2 < T1. Now depending o

Re: AW: Is watermark used by joining two streams

2017-07-30 Thread G.S.Vijay Raajaa
Hi Fabian, How do I order by the merge time? Let's say I merge the streams at T1. I want to drop a merge at T2 if T2 < T1. Depending on when data arrives from the individual streams and when the merge happens, the results become out of order. Any thoughts would be really appreciated. Regards, V

Watermarking and Timestamp on Kafka stream union

2017-07-26 Thread G.S.Vijay Raajaa
Hi, I have a union of 3 Kafka topic streams, and I am joining them on a timestamp field. I would like to order the join output by timestamp. How do I assign watermarks and extract timestamps from a union stream? Regards, Vijay Raajaa GS

Re: Purging Late stream data

2017-07-26 Thread G.S.Vijay Raajaa
; > Regards, > > Kien > > > > On 7/26/2017 9:37 AM, G.S.Vijay Raajaa wrote: > >> Hi, >> >> I am having 3 streams which is being merged from a union of kafka topics >> on a given timestamp. The problem I am facing is that, if there is a delay >>

Purging Late stream data

2017-07-25 Thread G.S.Vijay Raajaa
Hi, I have 3 streams that are merged from a union of Kafka topics on a given timestamp. The problem I am facing is that if one of the streams is delayed and its data arrives at a later point in time, the merge also happens late. The wa

Re: Flink Jobs disappers

2017-07-07 Thread G.S.Vijay Raajaa
her HA is enabled, does this happen every time etc. . > Regards, > Chesnay > > > On 06.07.2017 21:43, G.S.Vijay Raajaa wrote: > >> HI, >> >> I am using Flink Task manager and Job Manager as docker containers. >> Strangely, I find the jobs to disappear fro

Re: Referencing Global Window across flink jobs

2017-07-07 Thread G.S.Vijay Raajaa
ing > Managing Directors: Henrik Klagges, Christoph Stock, Dr. Robert Dahlke > > G.S.Vijay Raajaa wrote > > > HI, > > I have a use case were I need to build a global window with custom > trigger. I would like to reference this window across my flink jobs. Is > there a possibility that the global window can be referenced? > > Regards, > Vijay Raajaa GS >

Flink Jobs disappers

2017-07-06 Thread G.S.Vijay Raajaa
Hi, I am running the Flink TaskManager and JobManager as Docker containers. Strangely, the jobs disappear from the web portal after some time. They don't move to the failed state either. Any pointers would be really helpful; I am not able to get a clue from the logs. Kindly let me know if I n

Referencing Global Window across flink jobs

2017-07-06 Thread G.S.Vijay Raajaa
Hi, I have a use case where I need to build a global window with a custom trigger. I would like to reference this window across my Flink jobs. Is it possible to reference the global window across jobs? Regards, Vijay Raajaa GS

Re: Window data retention - Appending to previous data and processing

2017-06-27 Thread G.S.Vijay Raajaa
s a timer for your > cleanup time, and purges when the cleanup timer fires > * a ProcessWindowFunction, so that you always get all the contents of > the window when processing a window > > Best, > Aljoscha > > > On 24. Jun 2017, at 18:37, G.S.Vijay Raajaa > wrote: >

Window data retention - Appending to previous data and processing

2017-06-24 Thread G.S.Vijay Raajaa
Hi, I am trying to implement a Flink job that requires a window which keeps appending new data to the data already in the window. The idea is that for every new record added, the downstream chain up to the sink needs to be invoked. In the next iteration the window will have old data + new dat

Default value - Time window expires with no data from source

2017-06-24 Thread G.S.Vijay Raajaa
Hi, I am trying to implement a Flink job that uses Twitter as the source and collects tweets for a list of hashtags. The job aggregates the volume of tweets per hashtag in a given time frame. I have implemented this successfully, but if there is no tweet across all the

Re: SingleOutputStreamOperator addsink Error

2017-06-08 Thread G.S.Vijay Raajaa
That's right. Jackson and Gson have similar naming conventions. Thanks for the quick catch. Regards, Vijay Raajaa GS On Fri, Jun 9, 2017 at 1:59 AM, Ted Yu wrote: > bq. new SinkFunction(){ > > Note the case in JsonObject. It should be JSONObject > > FYI > > On Thu, Jun 8,

SingleOutputStreamOperator addsink Error

2017-06-08 Thread G.S.Vijay Raajaa
Hi, I am trying to pass the SingleOutputStreamOperator to a custom sink. I am getting an error while implementing the same. Code snippet: SingleOutputStreamOperator stream = env.addSource(source) .flatMap(new ExtractHashTagsSymbols(tickers)) .keyBy(0) .time

Re: Window Function on AllWindowed Stream - Combining Kafka Topics

2017-05-05 Thread G.S.Vijay Raajaa
as Date object , tried them as String as well. Regards, Vijay Raajaa G S On Fri, May 5, 2017 at 1:59 PM, Aljoscha Krettek wrote: > What’s the KeySelector you’re using? To me, this indicates that the > timestamp field is somehow changing after the original keying or in transit. > > Bes

Re: Window Function on AllWindowed Stream - Combining Kafka Topics

2017-05-04 Thread G.S.Vijay Raajaa
org.apache.flink.runtime.taskmanager.Task.run(Task.java:655) at java.lang.Thread.run(Thread.java:745) Regards, Vijay Raajaa GS On Wed, May 3, 2017 at 3:50 PM, G.S.Vijay Raajaa wrote: > Thanks for your input, will try to incorporate them in my implementation. > > Regards, > Vijay Raajaa G S > > On Wed, May 3, 2017 at

Re: Window Function on AllWindowed Stream - Combining Kafka Topics

2017-05-03 Thread G.S.Vijay Raajaa
nd emit a result when > you get the event from the other side. You also set a cleanup timer in case > no other event arrives to make sure that state eventually goes away. > > Best, > Aljoscha > > On 3. May 2017, at 11:47, G.S.Vijay Raajaa > wrote: > > Sure. Thanks for
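
The pattern in this reply (emit a result when the event from the other side arrives, and use a cleanup timer so unmatched state eventually goes away) can be simulated in plain Java. The beginning of the reply is cut off, so buffering the first event per key is an assumption here; the class, the "left"/"right" side labels and the time-to-live parameter are all hypothetical, and no Flink state or timer API is used.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical simulation of "buffer one side, emit on match, clean up on timer".
class BufferedJoin {
    private static final class Pending {
        final String side;
        final String value;
        final long expiresAtMs;

        Pending(String side, String value, long expiresAtMs) {
            this.side = side;
            this.value = value;
            this.expiresAtMs = expiresAtMs;
        }
    }

    private final Map<String, Pending> pendingByKey = new HashMap<>();
    private final long ttlMs;

    BufferedJoin(long ttlMs) {
        this.ttlMs = ttlMs;
    }

    /** Returns "left|right" when both sides for the key have arrived, otherwise buffers the event. */
    Optional<String> onEvent(String side, String key, String value, long nowMs) {
        Pending other = pendingByKey.get(key);
        if (other != null && !other.side.equals(side)) {
            pendingByKey.remove(key); // matched: the buffered state is no longer needed
            boolean otherIsLeft = other.side.equals("left");
            return Optional.of(otherIsLeft ? other.value + "|" + value : value + "|" + other.value);
        }
        // First event for this key (or a repeat from the same side, which overwrites the old one).
        pendingByKey.put(key, new Pending(side, value, nowMs + ttlMs));
        return Optional.empty();
    }

    /** Cleanup timer: drops entries whose partner never arrived; returns how many were removed. */
    int onTimer(long nowMs) {
        int before = pendingByKey.size();
        pendingByKey.values().removeIf(p -> p.expiresAtMs <= nowMs);
        return before - pendingByKey.size();
    }

    int pendingCount() {
        return pendingByKey.size();
    }

    public static void main(String[] args) {
        BufferedJoin join = new BufferedJoin(1000L);
        System.out.println(join.onEvent("left", "k1", "a", 0L).isPresent());
        System.out.println(join.onEvent("right", "k1", "b", 10L).orElse("no match"));
    }
}
```

Note that in this sketch expiry is only enforced when onTimer is called, so the caller decides how often cleanup runs.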

Re: Window Function on AllWindowed Stream - Combining Kafka Topics

2017-05-03 Thread G.S.Vijay Raajaa
keying is “lost” again > because you apply a flatMap(). That’s why you have an all-window and not a > keyed window. > > Best, > Aljoscha > > On 2. May 2017, at 09:20, G.S.Vijay Raajaa > wrote: > > Hi, > > I am trying to combine two kafka topics using the a single

Window Function on AllWindowed Stream - Combining Kafka Topics

2017-05-02 Thread G.S.Vijay Raajaa
Hi, I am trying to combine two Kafka topics using a single Kafka consumer on a list of topics, and then convert the JSON strings in the stream to POJOs. Then I join them via keyBy (on the event-time field), and to merge them into a single fat JSON I was planning to use a window stream and apply a wind

REST API call in stream transformation

2017-04-27 Thread G.S.Vijay Raajaa
Hi, I have just started to explore Flink and have a couple of questions. I am wondering if it's possible to call a REST endpoint asynchronously and pipe the response to the next state of my transformation on the stream. The idea is that after charging my data in a predefined time window, I woul