Perform processing only when watermark updates, buffer data otherwise

2020-04-01 Thread Manas Kale
Hi, I want to perform some processing on events only when the watermark is updated. Otherwise, for all other events, I want to keep buffering them till the watermark arrives. The main motivation behind doing this is that I have several operators that emit events/messages to a downstream operator. S
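[Editor's note] A minimal sketch of one way to buffer until the watermark advances, using a KeyedProcessFunction; it assumes a keyed stream of String events that already carry timestamps, and the class/state names are invented for illustration. Events are parked in keyed list state and only emitted when the event-time timer fires, i.e. once the watermark has passed their timestamp.

import org.apache.flink.api.common.state.ListState;
import org.apache.flink.api.common.state.ListStateDescriptor;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.KeyedProcessFunction;
import org.apache.flink.util.Collector;

// Buffers incoming events in keyed state and emits them only when the
// registered event-time timer fires, i.e. when the watermark has advanced.
public class BufferUntilWatermark extends KeyedProcessFunction<String, String, String> {

    private transient ListState<String> buffer;

    @Override
    public void open(Configuration parameters) {
        buffer = getRuntimeContext().getListState(
                new ListStateDescriptor<>("buffered-events", String.class));
    }

    @Override
    public void processElement(String value, Context ctx, Collector<String> out) throws Exception {
        // Park the event until the watermark catches up with its timestamp.
        buffer.add(value);
        if (ctx.timestamp() != null) {
            ctx.timerService().registerEventTimeTimer(ctx.timestamp());
        }
    }

    @Override
    public void onTimer(long timestamp, OnTimerContext ctx, Collector<String> out) throws Exception {
        // The watermark has passed 'timestamp': flush everything buffered so far.
        // (A stricter version would only emit events at or before the firing timestamp.)
        for (String event : buffer.get()) {
            out.collect(event);
        }
        buffer.clear();
    }
}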

Re: End to End Latency Tracking in flink

2020-04-01 Thread zoudan
Hi, I think we may add a latency metric for each operator, which can reflect the consumption ability of each operator. Best, Dan Zou > On 30 Mar 2020, at 18:19, Guanghui Zhang wrote: > > Hi. > At the flink source connector, you can send a $source_current_time - $event_time > metric. > In the meantime, at flink si
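[Editor's note] A minimal sketch of the kind of per-operator metric being discussed, assuming the records already carry event timestamps; the gauge name is invented for illustration. It exposes the gap between processing time and the last record's event time as a gauge.

import org.apache.flink.configuration.Configuration;
import org.apache.flink.metrics.Gauge;
import org.apache.flink.streaming.api.functions.ProcessFunction;
import org.apache.flink.util.Collector;

// Exposes (processing time - record event time) of the last seen record as a
// gauge, a rough proxy for how far this operator lags behind event time.
public class EventTimeLagTracker<T> extends ProcessFunction<T, T> {

    private volatile long lastLagMillis;

    @Override
    public void open(Configuration parameters) {
        getRuntimeContext()
                .getMetricGroup()
                .gauge("eventTimeLagMillis", (Gauge<Long>) () -> lastLagMillis);
    }

    @Override
    public void processElement(T value, Context ctx, Collector<T> out) {
        // ctx.timestamp() is the record's event timestamp (null if none is assigned).
        if (ctx.timestamp() != null) {
            lastLagMillis = System.currentTimeMillis() - ctx.timestamp();
        }
        out.collect(value);
    }
}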

Re: Savepoint Location from Flink REST API

2020-04-01 Thread Aaron Langford
All, it looks like the actual return structure from the API is:

1. Success

   {
     "status": { "id": "completed" },
     "operation": { "location": "string" }
   }

2. Failure

   {
     "status": { "id": "completed" },
     "operation": { "failure-cause": {
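[Editor's note] A minimal sketch of how a client might poll this endpoint and pull out the savepoint location or failure cause, assuming a Flink REST endpoint on localhost:8081, Java 11's HttpClient, and Jackson on the classpath; the job ID and trigger ID are passed as arguments, and the exact casing of the status id may differ from the snippet above.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;

// Polls GET /jobs/<jobId>/savepoints/<triggerId> and prints either the
// savepoint location (success) or the failure cause (failure).
public class SavepointStatusPoller {

    public static void main(String[] args) throws Exception {
        String jobId = args[0];      // job ID of the running job
        String triggerId = args[1];  // returned by POST /jobs/<jobId>/savepoints

        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:8081/jobs/" + jobId + "/savepoints/" + triggerId))
                .GET()
                .build();

        String body = client.send(request, HttpResponse.BodyHandlers.ofString()).body();
        JsonNode root = new ObjectMapper().readTree(body);

        if ("COMPLETED".equalsIgnoreCase(root.path("status").path("id").asText())) {
            JsonNode operation = root.path("operation");
            if (operation.has("location")) {
                System.out.println("Savepoint written to: " + operation.get("location").asText());
            } else {
                System.out.println("Savepoint failed: " + operation.path("failure-cause"));
            }
        } else {
            System.out.println("Savepoint still in progress");
        }
    }
}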

Re: Log file environment variable 'log.file' is not set.

2020-04-01 Thread Vitaliy Semochkin
Hello Robert, Thank you for the help. "tried to access method org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvi" was the root of the issue. I managed to fix the issue with "hadoop version is 2.4.1"; it was because the previous Flink version I was using referred to org.apache.flink:flin

Re: Latency tracking together with broadcast state can cause job failure

2020-04-01 Thread Lasse Nedergaard
Hi, I have attached a simple project with a test that reproduces the problem. The usual fault is a mixed-up string, but you can also get an EOF exception. Please let me know if you have any questions about the solution. Med venlig hilsen / Best regards, Lasse Nedergaard Telematics2-feature-flink-1.10-latency-tracki

Re: flink-shaded-hadoop2 for flink 1.10

2020-04-01 Thread Vitaliy Semochkin
Thank you very much, Chesnay! On Tue, Mar 31, 2020 at 1:32 AM Chesnay Schepler wrote: > flink-shaded-hadoop2 was released as part of Flink until 1.8 (hence why it > followed the Flink version scheme), after which it was renamed to > flink-shaded-hadoop-2 and is now being released separately from

Re: Flink in EMR configuration problem

2020-04-01 Thread Piotr Nowojski
Hi, sorry I missed that. But yes, it looks like you are running two JobManagers :) You can always check the YARN logs for more information about what is being executed. Piotrek > On 1 Apr 2020, at 16:44, Antonio Martínez Carratalá > wrote: > > Hi Piotr, > > I don't have 2 task managers, just one

Re: Flink in EMR configuration problem

2020-04-01 Thread Antonio Martínez Carratalá
Hi Piotr, I don't have 2 task managers, just one with 2 slots. That would be ok according to my calculations, but as Craig said I need one more instance for the cluster master. I was guessing the job manager was running in the master and the task manager in the slave, but both job manager and task

Re: [Third-party Tool] Flink memory calculator

2020-04-01 Thread Yangze Guo
@Marta Thanks for the tip! I'll do that. Best, Yangze Guo On Wed, Apr 1, 2020 at 8:05 PM Marta Paes Moreira wrote: > > Hey, Yangze. > > I'd like to suggest that you submit this tool to Flink Community Pages [1]. > That way it can get more exposure and it'll be easier for users to find it. > > T

Re: [Third-party Tool] Flink memory calculator

2020-04-01 Thread Marta Paes Moreira
Hey, Yangze. I'd like to suggest that you submit this tool to Flink Community Pages [1]. That way it can get more exposure and it'll be easier for users to find it. Thanks for your contribution! [1] https://flink-packages.org/ On Tue, Mar 31, 2020 at 9:09 AM Yangze Guo wrote: > Hi, there. > >

Re: State & Generics

2020-04-01 Thread Aljoscha Krettek
Hi Laurent! On 31.03.20 10:43, Laurent Exsteens wrote: Yesterday I managed to find another solution: create the type information outside of the class and pass it to the constructor. I can retrieve the type information from DataStream.getType(). This works well, and is acceptable in my ca
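[Editor's note] A minimal sketch of the pattern Laurent describes, with the class and state names (Deduplicate, "last-seen") invented for illustration: the TypeInformation is created outside the generic function and handed in through the constructor, so the state descriptor never has to derive it from the erased type parameter.

import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.KeyedProcessFunction;
import org.apache.flink.util.Collector;

// Generic function whose state type information is supplied from outside.
public class Deduplicate<K, T> extends KeyedProcessFunction<K, T, T> {

    private final TypeInformation<T> typeInfo;   // TypeInformation is Serializable
    private transient ValueState<T> lastSeen;

    public Deduplicate(TypeInformation<T> typeInfo) {
        this.typeInfo = typeInfo;
    }

    @Override
    public void open(Configuration parameters) {
        lastSeen = getRuntimeContext().getState(
                new ValueStateDescriptor<>("last-seen", typeInfo));
    }

    @Override
    public void processElement(T value, Context ctx, Collector<T> out) throws Exception {
        // Emit only if the value differs from the last one seen for this key.
        if (!value.equals(lastSeen.value())) {
            lastSeen.update(value);
            out.collect(value);
        }
    }
}

// Usage: the type information can come straight from the upstream stream,
// e.g. input.keyBy(...).process(new Deduplicate<>(input.getType()));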

Re: [DISCUSS] Change default planner to blink planner in 1.11

2020-04-01 Thread Aljoscha Krettek
+1 to making Blink the default planner, we definitely don't want to maintain two planners for much longer. Best, Aljoscha
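[Editor's note] For context, a minimal sketch of how a Flink 1.10 program opts in to the Blink planner explicitly today; if Blink becomes the default planner, this explicit selection would no longer be needed.

import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.java.StreamTableEnvironment;

public class BlinkPlannerExample {
    public static void main(String[] args) {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Explicitly select the Blink planner in streaming mode.
        EnvironmentSettings settings = EnvironmentSettings.newInstance()
                .useBlinkPlanner()
                .inStreamingMode()
                .build();

        StreamTableEnvironment tEnv = StreamTableEnvironment.create(env, settings);
        // ... register tables and run queries with tEnv ...
    }
}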

Re: some subtask taking too long

2020-04-01 Thread Piotr Nowojski
Hey, the thread you are referring to is about a DataStream API job and a long checkpointing issue, while from your message it seems like you are using the Table API (SQL) to process batch data? Or what exactly do you mean by: > i notice that there are one or two subtasks that take too long to finish

Re: Flink Kafka Consumer Throughput reduces over time

2020-04-01 Thread Piotr Nowojski
Hi, I haven't heard about a Flink-specific problem like that. Have you checked that the records are not changing over time? That they are not, for example, twice as large or twice as heavy to process? Especially since you are using a rate limiter with 12 MB/s. If your records grew to 60 KB in size, that
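[Editor's note] A rough back-of-the-envelope check of the point above, assuming the 12 MB/s rate limit and the hypothetical 60 KB record size from the message: 12 MB/s ÷ 60 KB ≈ 200 records/s, so any growth in record size directly lowers the record throughput the rate limiter allows.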

Re: Flink in EMR configuration problem

2020-04-01 Thread Piotr Nowojski
Hey, isn't the explanation of the problem in the logs that you posted? Not enough memory? You have 2 EMR nodes with 8 GB of memory each, while trying to allocate 2 TaskManagers AND 1 JobManager with a 6 GB heap size each? Piotrek > On 31 Mar 2020, at 17:01, Antonio Martínez Carratalá > wrote: > > Hi, I'm
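[Editor's note] A quick sanity check using the figures quoted in the message: 2 TaskManagers × 6 GB + 1 JobManager × 6 GB = 18 GB requested, versus 2 nodes × 8 GB = 16 GB physically available (before any YARN and OS overhead), so the requested containers cannot all be scheduled.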

Re: flink 1.10 support LONG as watermark?

2020-04-01 Thread Jingsong Li
Hi jingjing, if seconds precision is OK for you, you can try "to_timestamp(from_unixtime(your_time_seconds_long))". Best, Jingsong Lee On Wed, Apr 1, 2020 at 8:56 AM jingjing bai wrote: > Thanks a lot! > > Jark Wu wrote on Wed, Apr 1, 2020 at 1:13 AM: > >> Hi Jing, >> >> I created https://issues.apache.org/
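[Editor's note] A minimal sketch of how that expression can serve as a computed column carrying the watermark in a Flink 1.10 DDL; the table name, columns, 5-second bound, and the connector placeholder in the WITH clause are all made up for illustration.

import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.java.StreamTableEnvironment;

public class LongEventTimeExample {
    public static void main(String[] args) {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        EnvironmentSettings settings = EnvironmentSettings.newInstance()
                .useBlinkPlanner().inStreamingMode().build();
        StreamTableEnvironment tEnv = StreamTableEnvironment.create(env, settings);

        // The BIGINT epoch-seconds column is converted to TIMESTAMP(3) via a
        // computed column, which can then carry the watermark declaration.
        // NOTE: the WITH clause is a placeholder; real connector properties
        // must be filled in before this runs.
        tEnv.sqlUpdate(
                "CREATE TABLE events (" +
                "  ts_seconds BIGINT," +
                "  payload STRING," +
                "  ts AS TO_TIMESTAMP(FROM_UNIXTIME(ts_seconds))," +
                "  WATERMARK FOR ts AS ts - INTERVAL '5' SECOND" +
                ") WITH (" +
                "  'connector.type' = '...'" +
                ")");
    }
}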

Re: Latency tracking together with broadcast state can cause job failure

2020-04-01 Thread Yun Tang
Hi Lasse, I have never met this problem before, but can you share the exception stack trace so that we can take a look? A simple project to reproduce it is also a good choice. Best Yun Tang From: Lasse Nedergaard Sent: Tuesday, March 31, 2020 19:10 To: user Subject