Re: How to use Lo-level Joins API

2019-08-06 Thread Yuta Morisawa
Hi Victor > Is it possible stream1 and stream2 don't have common keys? You may verify this by logging out the key of current processed element. I misunderstood the usage. I thought stream1 and 2 have different contexts and they can access both state stores each other. But actually, processEl

Configure Prometheus Exporter

2019-08-06 Thread Chaoran Yu
Hello guys, Does anyone know if the Prometheus metrics exported via the JMX reporter or the Prometheus reporter can be configured using a YAML file similar to this one ? If there is such support in Flink, how do I

Re: How to use Lo-level Joins API

2019-08-06 Thread Victor Wong
Hi Yuta, > I made sure the 'ValueState data' has data from stream1 with the IDE debugger but in spite of that, processElement2 can't access it. Since `processElement1` and `processElement2` use the same `Context`, I think there is no state access issue. Is it possible stream1 and stream2 don't

Re: How to use Lo-level Joins API

2019-08-06 Thread Yuta Morisawa
Hi Yun Thank you for replying. >Have you set a default value for the state ? Actually, the constructor of the ValueStateDescriptor with default value is deprecated so I don't set it. The problem occurs when the stream1 comes first. I made sure the 'ValueState data' has data from stream

Re: How to use Lo-level Joins API

2019-08-06 Thread Yun Gao
Hi Yuta, Have you set a default value for the state ? If the state did not have a default value and the records from stream2 comes first for a specific key, then the state would never be set with a value, thus the return value will be null. Best, Yun

How to use Lo-level Joins API

2019-08-06 Thread Yuta Morisawa
Hi I am trying to use low-level joins. According to the doc, the way is creating a state and access it from both streams, but I can't. https://ci.apache.org/projects/flink/flink-docs-release-1.8/dev/stream/operators/process_function.html This is a snippet of my code. It seems that the process

Re: From Kafka Stream to Flink

2019-08-06 Thread Maatary Okouya
Fabian, ultimately, i just want to perform a join on the last values for each keys. On Tue, Aug 6, 2019 at 8:07 PM Maatary Okouya wrote: > Fabian, > > could you please clarify the following statement: > > However joining an append-only table with this view without adding > temporal join conditi

Re: From Kafka Stream to Flink

2019-08-06 Thread Maatary Okouya
Fabian, could you please clarify the following statement: However joining an append-only table with this view without adding temporal join condition, means that the stream is fully materialized as state. This is because previously emitted results must be updated when the view changes. It really d

Re: From Kafka Stream to Flink

2019-08-06 Thread Maatary Okouya
Thank you for the clarification. Really appreciated. Is Last_val part of the API ? On Fri, Aug 2, 2019 at 10:49 AM Fabian Hueske wrote: > Hi, > > Flink does not distinguish between streams and tables. For the Table API / > SQL, there are only tables that are changing over time, i.e., dynamic >

how to get the code produced by Flink Code Generator

2019-08-06 Thread Vincent Cai
Hi Users, In Spark, we can invoke Dataset method "queryExecution.debug.codegen()" to get the code produced by Catalyst. is there any similar api in Flink? reference link : https://medium.com/virtuslab/spark-sql-under-the-hood-part-i-26077f85ebf0 Regards Vincent Cai

Re: Will broadcast stream affect performance because of the absence of operator chaining?

2019-08-06 Thread zhijiang
Hi paul, In theory broadcast operator could not be chained for all-to-all mode, and chain is only feasible for one-to-one mode like forward. If chain, the next operator could process the raw record emitted by head operator directly. But if not, the emitted record must be serialized into buffer

Re: Will broadcast stream affect performance because of the absence of operator chaining?

2019-08-06 Thread Wong Victor
Oops, accidentally sent the email. The good news is that you don’t have to checkpoint the state of the Kafka consumers. From: Wong Victor Date: Tuesday, August 6, 2019 at 11:31 PM To: Piotr Nowojski , 黄兆鹏 Cc: user Subject: Re: Will broadcast stream affect performance because of the absence of

Re: Will broadcast stream affect performance because of the absence of operator chaining?

2019-08-06 Thread Wong Victor
Hi, If the performance impact of braking the operator chain is huge, maybe you can read the latest schema from Kafka within the operators. It’s a little complicated, you have to start a Kafka consumer in e.g. ` RichFunction#open()` and reading from (the largest offset – 1), and handle new messa

Re: [DataStream API] Best practice to handle custom session window - time gap based but handles event for ending session

2019-08-06 Thread Jungtaek Lim
Thanks Fabian on providing great input! Regarding your feedback on solution, yes you're right I realized I missed out-of-order events, and as you said we have to "split" existing window into two which current abstraction of custom window couldn't help here. (Flink would have no idea how aggregated

Re: Best way to access a Flink state entry from another Flink application

2019-08-06 Thread Mohammad Hosseinian
Hi Oytun, Thanks and good to know about your planned features. BR, Moe On 06/08/2019 16:14, Oytun Tez wrote: Hi Mohammad, Queryable State works in some cases: https://ci.apache.org/projects/flink/flink-docs-release-1.8/dev/stream/state/queryable_state.html As much as I know, this is the only

Re: Best way to access a Flink state entry from another Flink application

2019-08-06 Thread Oytun Tez
Hi Mohammad, Queryable State works in some cases: https://ci.apache.org/projects/flink/flink-docs-release-1.8/dev/stream/state/queryable_state.html As much as I know, this is the only way to access Flink's state from outside, until we have Savepoint API coming in 1.9. --- Oytun Tez *M O T A W O

Re: Best way to access a Flink state entry from another Flink application

2019-08-06 Thread Mohammad Hosseinian
Hi Alex, Thanks for your reply. The application is streaming. The issue with using messaging channels for such kind of communication is the 'race condition'. I mean, when you have parallel channels of communication (one for the main flow of your streaming application and one for bringing 'state

Re: Will broadcast stream affect performance because of the absence of operator chaining?

2019-08-06 Thread Piotr Nowojski
Hi, No, I think you are right, I forgot about the broadcasting requirement. Piotrek > On 6 Aug 2019, at 13:11, 黄兆鹏 wrote: > > Hi, Piotrek, > I previously considered your first advice(use union record type), but I found > that the schema would be only sent to one subtask of the operator(for >

Re: Restore state class not found exception in 1.8

2019-08-06 Thread Tzu-Li (Gordon) Tai
Hi Lasse, I think the diagnosis here: https://issues.apache.org/jira/browse/FLINK-13159 matches your problem. This problem should be fixed in the next bugfix version for 1.8.x. We'll also try to fix this for the upcoming 1.9.0 as well. Cheers, Gordon On Mon, Jun 3, 2019 at 1:55 PM Lasse Nedergaa

Re: Flink 1.8.1: Seeing {"errors":["Not found."]} when trying to access the Jobmanagers web interface

2019-08-06 Thread Ufuk Celebi
Hey Tobias, out of curiosity: were you using the job/application cluster (as documented here: https://ci.apache.org/projects/flink/flink-docs-release-1.8/ops/deployment/docker.html#flink-job-cluster )? – Ufuk On Tue, Aug 6, 2019 at 1:50 PM Kaymak, Tobias wrote: > I was using Apache Beam and i

Re: Flink 1.8.1: Seeing {"errors":["Not found."]} when trying to access the Jobmanagers web interface

2019-08-06 Thread Kaymak, Tobias
I was using Apache Beam and in the lib folder I had a JAR that was using Flink 1.7 in its POM. After bumping that to 1.8 it works :) On Tue, Aug 6, 2019 at 11:58 AM Kaymak, Tobias wrote: > It completely works when using the docker image tag 1.7.2 - I just bumped > back and the web interface was

[bug ?] PrometheusPushGatewayReporter register more then one JM

2019-08-06 Thread miki haiat
We have standalone cluster with PrometheusPushGatewayReporter conflagration. its seems like we cant register more then one JM to Prometheus because of naming uniqueness. WARN org.apache.flink.metrics.prometheus.PrometheusPushGatewayReporter - There was a problem registering metric numRunni

Re: Best way to access a Flink state entry from another Flink application

2019-08-06 Thread Протченко Алексей
Hi Mohammad, which types of applications do you mean? Streaming or batch ones? In terms of streaming ones queues like Kafka or RabbitMq between applications should be the best way I think.  Best regards, Alex >Вторник, 6 августа 2019, 12:21 +02:00 от Mohammad Hosseinian >: > >Hi all, > >We

Re: Will broadcast stream affect performance because of the absence of operator chaining?

2019-08-06 Thread 黄兆鹏
Hi, Piotrek, I previously considered your first advice(use union record type), but I found that the schema would be only sent to one subtask of the operator(for example, operatorA), and other subtasks of the operator are not aware of it. In this case is there anything I have missed? Thank you

Re: Will broadcast stream affect performance because of the absence of operator chaining?

2019-08-06 Thread Piotr Nowojski
Hi, Have you measured the performance impact of braking the operator chain? This is a current limitation of Flink chaining, that if an operator has two inputs, it can be chained to something else (only one input operators are chained together). There are plans for the future to address this iss

Best way to access a Flink state entry from another Flink application

2019-08-06 Thread Mohammad Hosseinian
Hi all, We have a network of Flink applications. The whole cluster receives 'state-update' messages from the outside, and there is one Flink application in our cluster that 'merges' these updates and creates the actual, most up-to-date, state of the 'data-objects' and passes it to the next pro

Re: Flink 1.8.1: Seeing {"errors":["Not found."]} when trying to access the Jobmanagers web interface

2019-08-06 Thread Kaymak, Tobias
It completely works when using the docker image tag 1.7.2 - I just bumped back and the web interface was there. On Tue, Aug 6, 2019 at 10:21 AM Kaymak, Tobias wrote: > Hello, > > after upgrading the docker image from version 1.7.2 to 1.8.1 and wiping > out zookeeper completely I see > > {"error

Re:Pramaters in eclipse with Flink

2019-08-06 Thread Haibo Sun
Hi alaa.abutaha, In fact, your problem is not related to Flink, but how to specify program parameters in Eclipse. I think the following document will help you. https://www.cs.colostate.edu/helpdocs/cmd.pdf Best, Haibo At 2019-07-26 22:02:48, "alaa" wrote: >Hallo > I run this example form

Re: getting an exception

2019-08-06 Thread Avi Levi
Yeap that was it (deploying 1.8.1 over 1.8.0 ) thanks !!! On Mon, Aug 5, 2019 at 5:53 PM Gaël Renoux wrote: > *This Message originated outside your organization.* > -- > Hi Avi and Victor, > > I just opened this ticket on JIRA: > https://issues.apache.org/jira/browse/

Re: Will broadcast stream affect performance because of the absence of operator chaining?

2019-08-06 Thread 黄兆鹏
Hi Piotrek, Thanks for your reply, my broadcast stream just listen to the changes of the schema, and it's very infrequent and very lightweight. In fact there are two ways to solve my problem, the first one is a broadcast stream that listen to the change of the schema, and broadcast to every o

Re: An ArrayIndexOutOfBoundsException after a few message with Flink 1.8.1

2019-08-06 Thread Nicolas Lalevée
Hi Yun, Indeed, that was it: a parallelism set to lower than what my custom partitioner was computing. Thanks Nicolas On Tue, Aug 6, 2019, at 4:47 AM, Yun Gao wrote: > Hi Nicolas: > > Are you using a custom partitioner? If so, you might need to check if the > Partitioners#partition has retu

Re: Will broadcast stream affect performance because of the absence of operator chaining?

2019-08-06 Thread Piotr Nowojski
Hi, Broadcasting will brake an operator chain. However my best guess is that Kafka source will be still a performance bottleneck in your job. Also Network exchanges add some measurable overhead only if your records are very lightweight and easy to process (for example if you are using RocksDB t

Flink 1.8.1: Seeing {"errors":["Not found."]} when trying to access the Jobmanagers web interface

2019-08-06 Thread Kaymak, Tobias
Hello, after upgrading the docker image from version 1.7.2 to 1.8.1 and wiping out zookeeper completely I see {"errors":["Not found."]} when trying to access the webinterface of Flink. I can launch jobs from the cmdline and I can't spot any error in the logs (so far on level INFO). I tried addi

Will broadcast stream affect performance because of the absence of operator chaining?

2019-08-06 Thread 黄兆鹏
Hi all, My flink job has dynamic schema of data, so I want to consume a schema kafka topic and try to broadcast to every operator so that each operator could know what kind of data it is handling. For example, the two streams just like this: OperatorA -> OperatorB -> OperatorC ^

Re:Re: How to write value only using flink's SequenceFileWriter?

2019-08-06 Thread Haibo Sun
Hi Liu Bo, If you haven't customize serializations through the configuration item "io.serializations", the default serializer for Writable objects is org.apache.hadoop.io.serializer.WritableSerialization.WritableSerializer. As you said, when WritableSerializer serialize the NullWritable object