Re: CEP issue

2018-02-01 Thread Vishal Santoshi
I have flink master CEP library code imported to a 1.4 build. On Thu, Feb 1, 2018 at 5:33 PM, Vishal Santoshi wrote: > A new one > > java.lang.OutOfMemoryError: Java heap space > at java.util.Arrays.copyOf(Arrays.java:3332) > at > java.lang.AbstractStringBuilder.ensureCapacityInter

Re: CEP issue

2018-02-01 Thread Vishal Santoshi
A new one java.lang.OutOfMemoryError: Java heap space at java.util.Arrays.copyOf(Arrays.java:3332) at java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:124) at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:448) a

Re: CEP issue

2018-02-01 Thread Vishal Santoshi
It happens when it looks to throw an exception and calls shardBuffer.toString. b'coz of the check int id = sharedBuffer.entryId; Preconditions.checkState(id != -1, "Could not find id for entry: " + *sharedBuffer*); On Thu, Feb 1, 2018 at 5:09 PM, Vishal Santoshi wrote: > The watermark ha

Re: CEP issue

2018-02-01 Thread Vishal Santoshi
The watermark has not moved for this pattern to succeed ( or other wise ), the issue though is that it is pretty early in the pipe ( like within a minute ). I am replaying from a kafka topic but the keyed operator has emitted no more than 1500 plus elements to SelectCEPOperator ( very visible on t

CEP issue

2018-02-01 Thread Vishal Santoshi
This is a pretty simple pattern, as in I hardly have 1500 elements ( across 600 keys at the max ) put in and though I have a pretty wide range , as in I am looking at a relaxed pattern ( like 40 true conditions in 6 hours ), I get this. I have the EventTime turned on. java.lang.OutOfMemoryErro

Flink REST API

2018-02-01 Thread Raja . Aravapalli
Hi, I have a triggered a Flink YARN Session on Hadoop yarn. While I was able to trigger applications and run them. I wish to find the URL of REST API for the Flink YARN Sesssion I launched. Can someone please help me point out on how to find the REST API Url for the Flink on YARN? Thanks a l

Flink on AWS EMR - how to use flink-log4j configuration?

2018-02-01 Thread Ishwara Varnasi
I didn't find an example of flink-log4j configuration while creating EMR cluster for running Flink. What should be passed to "flink-log4j" config? Actual log4j config or path to file? Also, how to see application logs in EMR? thanks Ishwara Varnasi

Re: Flink on K8s job submission best practices

2018-02-01 Thread Christophe Jolif
Hi Maximilian, Coming back on this as we have similar challenges. I was leaning towards 3. But then I read you and figured I might have missed something ;) We agree 3 is not idiomatic and creates a "detached job" but in a lack of a proper solution I can live with that. We also agree there is no

Get the JobID when Flink job fails

2018-02-01 Thread Vinay Patil
Hi, When the Flink job executes successfully I get the jobID, however when the Flink job fails the jobID is not returned. How do I get the jobId in this case ? Do I need to call /joboverview REST api to get the job ID by looking for the Job Name ? Regards, Vinay Patil

Extending Flink Slots when running on Yarn

2018-02-01 Thread Julio Biason
Hi, I'm wodering, is there a way to make Flink understand that I added a new worker machine on yarn to increase the tasks slots? Right now, we are just exploring how to run Flink on Yarn and, so far, we managed to create a small Hadoop/Yarn cluster (3 machines, 1 master and 2 workers) and start F

RE: S3 for state backend in Flink 1.4.0

2018-02-01 Thread Edward Rojas
Hi Hayden, It seems like a good alternative. But I see it's intended to work with spark, did you manage to get it working with Flink ? I some tests but I get several errors when trying to create a file, either for checkpointing or saving data. Thanks in advance, Regards, Edward -- Sent from:

How to handle multiple yarn sessions and choose at runtime the one to submit a ha streaming job ?

2018-02-01 Thread LINZ, Arnaud
Hello, I am using Flink 1.3.2 and I'm struggling to achieve something that should be simple. For isolation reasons, I want to start multiple long living yarn session containers (with the same user) and choose at run-time, when I start a HA streaming app, which container will hold it. I start m

Latest version of Kafka

2018-02-01 Thread Marchant, Hayden
What is the newest version of Kafka that is compatible with Flink 1.4.0? I see the last version of Kafka supported is 0.11 , from documentation, but has any testing been done with Kafka 1.0? Hayden Marchant

RE: Multiple Elasticsearch sinks not working in Flink

2018-02-01 Thread Teena Kappen // BPRISE
@Fabian: I will run the code with the Git repo source and let you know the results. @Stephan: Sorry I missed the email from you somehow. I understand from the JIRA link that you already have the answer for this. Yet I tried using two separate config map objects in my code and that resolved the

Re: How does BucketingSink generate a SUCCESS file when a directory is finished

2018-02-01 Thread Kien Truong
Hi, I did not actually test this, but I think with Flink 1.4 you can extend BucketingSink and overwrite the invoke method to access the watermark Pseudo code: invoke(IN value, SinkFunction.Context context) {    long currentWatermark = context.watermark()    long taskIndex = getRuntimeContex

Reading bounded data from Kafka in Flink job

2018-02-01 Thread Marchant, Hayden
I have 2 datasets that I need to join together in a Flink batch job. One of the datasets needs to be created dynamically by completely 'draining' a Kafka topic in an offset range (start and end), and create a file containing all messages in that range. I know that in Flink streaming I can specif

Re: Maintain heavy hitters in Flink application

2018-02-01 Thread Timo Walther
Hi, I think it would be easier to implement a custom key selector and introduce some artifical key that spreads the load more evenly. This would also allow you to use keyed state. You could use a ProcessFunction and set timers to define the "every now and then". Keyed state would also ease th

Re: How to enable “upsert mode” for dynamic tables?

2018-02-01 Thread Fabian Hueske
Hi Austin, thanks for your questions. I posted an answer on Stack Overflow. Let me know if you have further questions or comments. Thanks, Fabian 2018-02-01 6:08 GMT+01:00 Puneet Kinra : > As of now flink doesnt support this feature few days i came across the > same requirement.. > > On Thu, F

Re: Managed operator state treating state of all parallel operators as the same

2018-02-01 Thread m@xi
Hello Gyula & Stefan, Below I attach a similar situation that I am trying to resolve, [1] I am also using *managed operator state*, but I have some trouble with the flink documentation. I believe it is not that clear. So, I have the following questions: 1 -- Can I concatenate all the partial sta

Re: Maintain heavy hitters in Flink application

2018-02-01 Thread m@xi
Anyone, someone, somebody? -- Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

RE: S3 for state backend in Flink 1.4.0

2018-02-01 Thread Marchant, Hayden
Edward, We are using Object Storage for checkpointing. I'd like to point out that we were seeing performance problems using the S3 protocol. Btw, we had quite a few problems using the flink-s3-fs-hadoop jar with Object Storage and had to do some ugly hacking to get it working all over. We recen

queryable state API

2018-02-01 Thread Maciek Próchniak
Hello, Currently (1.4) to be able to use queryable state client has to know ip of (working) task manager and port. This is a bit awkward - as it forces external services to know details of flink cluster. Event more complex when we define port range for queryable state proxy and we're not sure