Backoff strategies for async IO functions?

2019-03-07 Thread William Saar
Hi, Is there a way to specify an exponential backoff strategy for when async function calls fail? I have an async function that does web requests to a rate-limited API. Can you handle that with settings on the async function call? Thanks, William

Flink 1.6.4 signing key file in docker-flink repo?

2019-02-25 Thread William Saar
Trying to build a new Docker image by replacing 1.6.3 with 1.6.4 in the Dockerfile found here ( https://github.com/docker-flink/docker-flink ), but seems to require a new signing key, Is it available somewhere? Getting + wget -nv -O flink.tgz.asc https://www.apache.org/dist/flink/flink-1.6.4/fli

What async Scala HTTP client do people use with Flink async functions?

2019-02-05 Thread William Saar
essing I'm not the only one doing async functions in Scala so figured I'd ask :) Thanks, William Saar

Re: How to add caching to async function?

2019-02-05 Thread William Saar
: "Lasse Nedergaard" To:"William Saar" Cc:"Fabian Hueske" , "user" Sent:Tue, 5 Feb 2019 10:41:41 +0100 Subject:Re: How to add caching to async function? Hi William No iterations isn’t the solution as you can (will) end up in a deadlock. We conclude

Re: How to add caching to async function?

2019-02-05 Thread William Saar
Thanks! Looks like iterations is indeed the way to go for now then... - Original Message - From: "Lasse Nedergaard" To:"Fabian Hueske" Cc:"William Saar" , "user" Sent:Mon, 4 Feb 2019 20:20:30 +0100 Subject:Re: How to add caching to async f

How to add caching to async function?

2019-02-04 Thread William Saar
Hi, I am trying to implement an async function that looks up a value in a cache or, if the value doesn't exist in the cache, queries a web service, but I'm having trouble creating the cache. I've tried to create a RichAsyncFunction and add a map state as cache, but I'm getting: State is not suppor

Use s3 on Flink on kubernetes

2018-12-20 Thread William Saar
How can I easiest use s3 from a Flink job deployed in a session cluster on kubernetes? I've tried including the flink-s3-fs-hadoop dependency in the sbt file for my job, can I programmatically set the properties to point to it? Is there a ready-made docker image for a flink with s3 dependencies

Can't list logs or stdout through web console on Flink 1.7 Kubernetes

2018-12-19 Thread William Saar
I'm running Flink 1.7 in ECS, is this a known issue or should I create a jira? The web console doesn't show anything when trying to list logs or stdout for task managers and the job manager log have stack traces for the errors 2018-12-19 15:35:53,498 ERROR org.apache.flink.runtime.rest.handler.

Re: Generating processing time watermarks in idle event time kafka streams?

2018-12-14 Thread William Saar
Thanks, works great! This should be very useful for real-time dashboard that want to compute in event time, especially for multi-tenant systems or other specialized kafka topics that can have gaps in the traffic. - Original Message - From: "Aljoscha Krettek" To:"Wi

Generating processing time watermarks in idle event time kafka streams?

2018-12-14 Thread William Saar
Any standardized components to generate watermarks based on processing time in an event time stream when there is no data from a source? The docs for event time [1] indicate that people are doing this, but the only suggestion on Stack Overflow [2] is to make every window operator in stream have a

Re: Far too few watermarks getting generated with Kafka source

2018-01-18 Thread William Saar
data used to come from two different topics). William - Original Message - From: "Gary Yao" To: "William Saar" Cc: "user" Sent: Thu, 18 Jan 2018 11:11:17 +0100 Subject: Re: Far too few watermarks getting generated with Kafka source Hi William, How often

Far too few watermarks getting generated with Kafka source

2018-01-17 Thread William Saar
Hi, I have a job where we read data from either Kafka or a file (for testing), decode the entries and flat map them into events, and then add a timestamp and watermark assigner to the events in a later operation. This seems to generate periodic watermarks when running from a file, but when Kafka is

Timestamps and watermarks in CoProcessFunction function

2018-01-16 Thread William Saar
Hi, I have added the code below to the start of processElement2 in CoProcessFunction. It prints timestamps and watermarks for the first 3 elements for each new watermark. Shouldn't the timestamp always be lower than the next watermark? The 3 timestamps before the last watermark are all larger than

Re: Access to time in aggregation, or aggregation in ProcessWindowFunction?

2017-06-20 Thread William Saar
pending on the details. Nico [1] https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/ windows.html#triggers [2] https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/stream/ process_function.html On Tuesday, 20 June 2017 12:52:38 CEST William Saar wrote: > Hi, > I am

Access to time in aggregation, or aggregation in ProcessWindowFunction?

2017-06-20 Thread William Saar
Hi, I am looking to implement a window that sends out updates for each new event it receives and also when an expiration timer fires and purges the window (the expiration time can be determined from a timestamp in the first event). I can't figure out a way to do this that does not require preserv

Re: Porting batch percentile computation to streaming window

2017-05-30 Thread William Saar
Flink. - Original Message - From: "Gyula Fóra" To: "William Saar" , Cc: Sent: Tue, 30 May 2017 13:56:08 + Subject: Re: Porting batch percentile computation to streaming window I think you could actually do a window operation to get the tDigestStre

Re: Porting batch percentile computation to streaming window

2017-05-30 Thread William Saar
ark information or something on metrics objects in line 1 and emit T digests more often in line 2? Finally, how do I access the watermark/window information in my fold operation in line 1? Thanks! - Original Message - From: "Gyula Fóra" To:"William Saar" , Cc: Sent:Tue, 30

Porting batch percentile computation to streaming window

2017-05-29 Thread William Saar
I am porting a calculation from Spark batches that uses broadcast variables to compute percentiles from metrics and curious for tips on doing this with Flink streaming. I have a windowed computation where I am compute metrics for IP-addresses (a windowed stream of metrics objects grouped by IP-add

Re: ContinuousEventTimeTrigger breaks coGrouped windowed streams?

2016-11-24 Thread William Saar
the end of the window + allowedLateness you have set. On Tue, Nov 22, 2016 at 11:08 PM, William Saar wrote: > Thanks! > One difference is that my topology had 2 sources. I have updated your > example to also use 2 sources and that breaks the co-group operation in the > e

Re: ContinuousEventTimeTrigger breaks coGrouped windowed streams?

2016-11-22 Thread William Saar
6849f405 Gyula is right, the ContinuousEventTimeTrigger never purges the window but that you can circumvent that by extending this trigger and purging at the end of the window, similarly as done in the EventTimeTrigger. -Max On Mon, Nov 21, 2016 at 6:52 PM, William Saar wr

Re: ContinuousEventTimeTrigger breaks coGrouped windowed streams?

2016-11-21 Thread William Saar
d windowed streams? Hi William, I am wondering whether the ContinuousEventTimeTrigger is the best choice here (it never cleans up the state as far as I know). Have you tried the simple SlidingEventTimeWindows as your window function? Cheers, Gyula William Saar ezt írta (időpont: 2016. nov. 19.,

ContinuousEventTimeTrigger breaks coGrouped windowed streams?

2016-11-19 Thread William Saar
Hi! My topology below seems to work when I comment out all the lines with ContinuousEventTimeTrigger, but prints nothing when the line is in there. Can I coGroup two large time windows that use a different trigger time than the window size? (even if the ContinuousEventTimeTrigger doesn't

Working with data locality in streaming using groupBy?

2015-06-05 Thread William Saar
Hi, How can I maintain a local state, for instance a ConcurrentHashMap, across different steps in a streaming chain on a single machine/process? Static variable? (This doesn't seem to work well when running locally as it gets shared across multiple instances, a common "pipeline" store would be h