Re: Question about configuring Rich Functions

2017-10-13 Thread Tony Wei
Hi Steve, I think the discussion in this thread [1] could answer your questions. Best Regards, Tony Wei [1] http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/RichMapFunction-parameters-in-the-Streaming-API-td16121.html Steve Jerman 於 2017年10月14日 週六,上午12:41寫道: > This document:

Re: Flink 1.3.2 Netty Exception

2017-10-13 Thread Flavio Pompermaier
Any update on this? Do you want me to create a JIRA issue for this bug? On 11 Oct 2017 17:14, "Ufuk Celebi" wrote: @Chesnay: Recycling of network resources happens after the tasks go into state FINISHED. Since we are submitting new jobs in a local loop here it can easily happen that the new job

problem scale up Flink on YARN

2017-10-13 Thread Lei Chen
Hi, We're trying to implement some module to help autoscale our pipeline which is built with Flink on YARN. According to the document, the suggested procedure seems to be: 1. cancel job with savepoint 2. start new job with increased YARN TM number and parallelism. However, step 2 always gave er

Question about configuring Rich Functions

2017-10-13 Thread Steve Jerman
This document: https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/best_practices.html#parsing-command-line-arguments-and-passing-them-around-in-your-flink-application Apache Flink 1.3 Documentation: Best Practices

Re: Regression for dataStream.rescale method from 1.2.1 to 1.3.2

2017-10-13 Thread Till Rohrmann
Hi Antoine, this looks like a regression to me. I'll investigate how this could happen and let you know once I find something. Cheers, Till On Fri, Oct 13, 2017 at 10:16 AM, Antoine Philippot < antoine.philip...@teads.tv> wrote: > Hi, > > After migrating our project from flink 1.2.1 to flink 1.

Re: Timer coalescing necessary?

2017-10-13 Thread Aljoscha Krettek
Hi, Because of how they are triggered by the watermark, all event-time triggers with the same timestamp will be triggered in the same go, without interleaving other calls. Same is true for processing-time triggers because they "piggy back" on the one "physical" processing-time service trigger.

Re: PartitionNotFoundException when running in yarn-session.

2017-10-13 Thread Niels Basjes
Hi I did some tests and it turns out I was really overloading the cluster which caused the problems. I tried the timeout setting but that didn't help. Simply 'not overloading' the system did help. Thanks. Niels On Thu, Oct 12, 2017 at 10:42 AM, Ufuk Celebi wrote: > Hey Niels, > > Flink curre

Re: Timer coalescing necessary?

2017-10-13 Thread Kien Truong
Hi, Thanks for the explanation. Because timer callback and normal execution are not guarantee to be concurrent-safe, if we have multiple timers with the same timestamp, are all of them run before the normal execution resume or are they interleaved with normal execution? Also may I ask how o

Re: Writing an Integration test for flink-metrics

2017-10-13 Thread Martin Eden
Hi, Not merged in yet but this is an example pr that is mocking metrics and checking they are properly updated: https://github.com/apache/flink/pull/4725 On Fri, Oct 13, 2017 at 1:49 PM, Aljoscha Krettek wrote: > I think we could add this functionality to the (operator) test harnesses. > I.e. a

Re: Writing an Integration test for flink-metrics

2017-10-13 Thread Aljoscha Krettek
I think we could add this functionality to the (operator) test harnesses. I.e. add a mock MetricGroup thingy in there that you can query to check the state of metrics. > On 13. Oct 2017, at 13:50, Chesnay Schepler wrote: > > I meant that you could unit-test the behavior of the function in iso

Re: Timer coalescing necessary?

2017-10-13 Thread Aljoscha Krettek
Hi, This is slightly different for processing-time and event-time triggers. First, event-time triggers: there are two data structures, a PriorityQueue (which is implemented as a heap) of timers that is sorted by timestamp, a set of registered timers that is used for deduplication. When adding a

Re: Submitting a job via command line

2017-10-13 Thread Piotr Nowojski
Good to hear that :) > On 13 Oct 2017, at 14:40, Alexander Smirnov wrote: > > Thank you so much, it helped! > > From: Piotr Nowojski > > Date: Thursday, October 12, 2017 at 6:00 PM > To: Alexander Smirnov mailto:asmir...@five9.com>> > Cc: "user@flink.apache.org

Re: Submitting a job via command line

2017-10-13 Thread Alexander Smirnov
Thank you so much, it helped! From: Piotr Nowojski mailto:pi...@data-artisans.com>> Date: Thursday, October 12, 2017 at 6:00 PM To: Alexander Smirnov mailto:asmir...@five9.com>> Cc: "user@flink.apache.org" mailto:user@flink.apache.org>> Subject: Re: Submitting a job

Re: Timer coalescing necessary?

2017-10-13 Thread Kien Truong
Hi Aljoscha, Could you clarify how the timer system works right now ? For example, let's say I have a function F, with 3 keys that are registered to execute at processing time T. Would Flink maintain a single internal timer at time T, then run the callback on all 3 keys when it's triggered ?

Re: Writing an Integration test for flink-metrics

2017-10-13 Thread Chesnay Schepler
I meant that you could unit-test the behavior of the function in isolation. You could create a dummy metric group that verifies that the correct counters are being registered (based on names i guess), as well as provide access to them. Mock some input and observe whether the counter value is bei

Re: Timer coalescing necessary?

2017-10-13 Thread Aljoscha Krettek
Hi, If you have multiple timers per key, then coalescing can make sense to reduce the burden on the timer system. Coalescing them across different keys would not be possible right now. Best, Aljoscha > On 13. Oct 2017, at 06:37, Kien Truong wrote: > > Hi, > > We are having a streaming job w

Re: Beam Application run on cluster setup (Kafka+Flink)

2017-10-13 Thread Shankara
Hi Piotrek, I was checking in Job manager machine logs, and dashboard. But actually output string was recorded in taskmanager macine log file. I added InfluxDB and verified, Received data is writing into influxDB. Thank you very much for your support. Thanks, Shankara -- Sent from: ht

Fwd: Decouple Kafka partitions and Flink parallelism for ordered streams

2017-10-13 Thread Sanne de Roever
Hi Chesnay, /** Fowarding this to group, I mistakingly replied to you directly previously, apologies */ The side output option works in combination with setting slot sharing groups. For reference I have included a source file. The job takes three slots. One slot for input handling, and one slot f

Re: Writing an Integration test for flink-metrics

2017-10-13 Thread Piotr Nowojski
For testing Link applications in general you can read https://ci.apache.org/projects/flink/flink-docs-release-1.4/dev/stream/testing.html However as we said before, testing metrics would require using custom o

Re: Beam Application run on cluster setup (Kafka+Flink)

2017-10-13 Thread Piotr Nowojski
Hi, What version of Flink are you using. In earlier 1.3.x releases there were some bugs in Kafka Consumer code. Could you change the log level in Flink to debug? Did you check the Kafka logs for some hint maybe? I guess that metrics like bytes read/input records of this Link application are not

Regression for dataStream.rescale method from 1.2.1 to 1.3.2

2017-10-13 Thread Antoine Philippot
Hi, After migrating our project from flink 1.2.1 to flink 1.3.2, we noticed a big performance drop due to a bad vertices balancing between task manager. In our use case, we set the default parallelism to the number of task managers : val stream: DataStream[Array[Byte]] = env.addSource(new Flink