Re: How to set a custom JAVA_HOME when run flink on YARN?

2016-08-25 Thread Renkai
It seems that this config variant only affects local and standalone clusters, not YARN. -- View this message in context: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/How-to-set-a-custom-JAVA-HOME-when-run-flink-on-YARN-tp8676p8709.html Sent from the Apache Fl

Re: Handling Kafka DeserializationSchema() exceptions

2016-08-25 Thread Yassine Marzougui
Good to know that there is already a JIRA issue, Thanks! On Thu, Aug 25, 2016 at 8:58 PM, Robert Metzger wrote: > Hi Yassine, > > there's an ongoing discussion about the issue in this JIRA: > https://issues.apache.org/jira/browse/FLINK-3679. > Emitting null is not an option. > There are workarou

Re: Handling Kafka DeserializationSchema() exceptions

2016-08-25 Thread Robert Metzger
Hi Yassine, there's an ongoing discussion about the issue in this JIRA: https://issues.apache.org/jira/browse/FLINK-3679. Emitting null is not an option. There are workarounds to the issue, but I don't think any of them are nice. On Thu, Aug 25, 2016 at 8:05 PM, Yassine Marzougui wrote: > Hi all, >

Enabling Encryption between slaves in Flink

2016-08-25 Thread Vinay Patil
Hi, I have a requirement that all the data flowing between the task managers should be encrypted. Is there a way in Flink to do that? Can we use the configuration file to enable this as follows: http://doc.akka.io/docs/akka/snapshot/scala/remoting.html#Remoting_Sample or do we need to add the a

Re: Dealing with Multiple sinks in Flink

2016-08-25 Thread Robert Metzger
I would first try to understand the metrics system locally, for example using the VisualVM tool, which also allows you to access JMX-exported metrics. Once you've seen how it works, you can look into remote JMX access. There's a page in the Flink documentation about the metrics. On Thu, Aug 25, 20

Re: Dealing with Multiple sinks in Flink

2016-08-25 Thread vinay patil
Thanks Robert, will try that. Can you provide more details on how to integrate that? I had sysouts to check if the watermarks are getting generated, and I am getting the values, but as I said the window is not getting triggered for parallelism greater than 1. I have tried using AscendingTimeStampExtra

Re: Dealing with Multiple sinks in Flink

2016-08-25 Thread Robert Metzger
Flink 1.1.1 has a metric for exposing the low watermark of each operator. Maybe you can access a TaskManager via JMX to see the value of the WM. Watermarks are sometimes a bit tricky. On Thu, Aug 25, 2016 at 4:29 PM, vinay patil wrote: > Hi Max, > > Here is the code for Timestamp assigner and w

Re: Setting number of TaskManagers

2016-08-25 Thread Robert Metzger
Hi Craig, For the YARN session, you have to pass the number of TaskManagers using the -n argument. If you need to use an environment variable, you can create a custom script calling the yarn-session.sh script and passing the value of the env variable to the script. Regards, Robert On Wed
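The wrapper script suggested above could be sketched roughly as follows. This is a minimal sketch and not part of Flink: the environment variable name FLINK_NUM_TASKMANAGERS and the ./bin/yarn-session.sh location are assumptions made for illustration; only the -n argument is taken from the message.

```python
import os
import subprocess
import sys

# Hypothetical names, chosen for this sketch only; Flink defines neither.
ENV_VAR = "FLINK_NUM_TASKMANAGERS"
YARN_SESSION = "./bin/yarn-session.sh"


def build_command(environ, extra_args=()):
    """Build the yarn-session.sh command line from an environment mapping."""
    raw = environ.get(ENV_VAR)
    if raw is None:
        raise ValueError(f"{ENV_VAR} is not set")
    count = int(raw)  # raises ValueError on non-numeric input
    if count < 1:
        raise ValueError(f"{ENV_VAR} must be a positive integer, got {count}")
    # Pass the TaskManager count via -n and forward any other arguments unchanged.
    return [YARN_SESSION, "-n", str(count), *extra_args]


def main():
    """Start the YARN session and return the exit code of yarn-session.sh."""
    return subprocess.call(build_command(os.environ, sys.argv[1:]))
```

Calling `sys.exit(main())` at the end of the file would turn this into a runnable launcher. Keeping `build_command` separate from the process call makes the argument handling easy to check without a YARN cluster.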

Kafka and Flink's partitions

2016-08-25 Thread rss rss
Hello, I want to implement a processing schema like the one shown in the following diagram. It calculates the number of unique users per specified time period, assuming that we have > 100k events per second and > 100M unique users: I have one Kafka topic of events with a

Handling Kafka DeserializationSchema() exceptions

2016-08-25 Thread Yassine Marzougui
Hi all, Is there a way to handle Kafka deserialization exceptions, when a JSON message is malformed for example? I thought about extending the DeserializationSchema to emit a null or any other value, but that may cause an NPE when using a subsequent TimestampExtractor. The other solution would be
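One way to picture the "emit some other value" idea is to wrap every message in a result object that carries either the parsed value or the error, and to filter on that before any timestamp extraction. The following is a language-neutral illustration written in Python, not Flink's DeserializationSchema API; the names Parsed, deserialize and valid_values are hypothetical.

```python
import json
from typing import Any, NamedTuple, Optional


class Parsed(NamedTuple):
    """Outcome of deserializing one message: a parsed value or an error text."""
    value: Optional[Any]
    error: Optional[str]
    raw: bytes


def deserialize(message: bytes) -> Parsed:
    """Parse a JSON message; capture failures instead of raising or returning None."""
    try:
        return Parsed(json.loads(message.decode("utf-8")), None, message)
    except (UnicodeDecodeError, json.JSONDecodeError) as exc:
        # Keep the raw bytes so malformed input can be logged or inspected later.
        return Parsed(None, str(exc), message)


def valid_values(messages):
    """Yield only successfully parsed values, skipping malformed messages."""
    for message in messages:
        parsed = deserialize(message)
        if parsed.error is None:
            yield parsed.value
```

Because the failure is recorded in `error` rather than signalled by a null record, later steps never receive a bare null, which is the situation that would trigger the NPE described above.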

Re: Flink long-running YARN configuration

2016-08-25 Thread Stephan Ewen
Hi Craig! For YARN sessions, Flink will - (a) register the app master hostname/port/etc at Yarn, so you can get them for example from the yarn UI and tools - (b) it will create a .yarn-properties file that contains the hostname/ports info. Future calls to the command line pick up the info from

Re: Setting up zeppelin with flink

2016-08-25 Thread Trevor Grant
I'm glad you were able to work it out! Your setup is somewhat unique, and as Zeppelin is the result of multiple drive-by commits, interesting and unexpected things happen in the tail cases. Could you please report your problem and solution on the Zeppelin user list? What you've discovered may in

Re: Running multiple jobs on yarn (without yarn-session)

2016-08-25 Thread Maximilian Michels
Thanks Niels, actually I also created one :) We will fix this on the master and for the 1.1.2 release. On Thu, Aug 25, 2016 at 5:14 PM, Niels Basjes wrote: > I have this with a pretty recent version of the source version (not a > release). > > Would be great if you see a way to fix this. > I cons

Re: Running multiple jobs on yarn (without yarn-session)

2016-08-25 Thread Niels Basjes
I have this with a pretty recent build from source (not a release). Would be great if you see a way to fix this. I consider it fine if this requires an extra call to the system indicating that this is a 'multiple job' situation. I created https://issues.apache.org/jira/browse/FLINK-44

Re: Delaying starting the jobmanager in yarn?

2016-08-25 Thread Niels Basjes
Sounds good. Is there a basic example somewhere I can have a look at? Niels On Thu, Aug 25, 2016 at 2:55 PM, Maximilian Michels wrote: > Hi Niels, > > If you're using 1.1.1, then you can instantiate the > YarnClusterDescriptor and supply it with the Flink jar and > configuration and subsequentl

Flink long-running YARN configuration

2016-08-25 Thread Foster, Craig
I'm trying to understand Flink YARN configuration. The flink-conf.yaml file is supposedly the way to configure Flink, except when you launch Flink using YARN since that's determined for the AM. The following is contradictory or not completely clear: "The system will use the configuration in co

Re: Elasticsearch connector and number of shards

2016-08-25 Thread Flavio Pompermaier
I've just added a JIRA improvement ticket for this ( https://issues.apache.org/jira/browse/FLINK-4491). Best, Flavio On Wed, Jul 20, 2016 at 4:21 PM, Maximilian Michels wrote: > The connector doesn't cover this use case. Through the API you need to > use the IndicesAdminClient: > https://www.el

Re: Dealing with Multiple sinks in Flink

2016-08-25 Thread vinay patil
Hi Max, Here is the code for Timestamp assigner and watermark generation. PFA Regards, Vinay Patil On Thu, Aug 25, 2016 at 7:39 AM, Maximilian Michels [via Apache Flink User Mailing List archive.] wrote: > I'm assuming there is something wrong with your Watermark/Timestamp > assigner. Could yo

Re: Setting up zeppelin with flink

2016-08-25 Thread Frank Dekervel
Hello, Sorry for the spam, but I got it working after copying all Scala libraries from another interpreter to the interpreter/flink directory. So I think the error is that the Scala libraries are missing from the binary release in the zeppelin/interpreters/flink/ directory. For now I'm adding the copy

Re: Running multiple jobs on yarn (without yarn-session)

2016-08-25 Thread Maximilian Michels
Hi Niels, This is with 1.1.1? We could fix this in the upcoming 1.1.2 release by only using automatic shut down for detached jobs. In all other cases we should be able to shutdown from the client side after running all jobs. The only downside I see is that Flink clusters may actually never be shut

Re: Setting up zeppelin with flink

2016-08-25 Thread Frank Dekervel
Hello, For reference, below is the Dockerfile I used to build the Zeppelin image (basically just OpenJDK 8 with the latest binary release of Zeppelin). The "docker-entrypoint.sh" script just starts zeppelin.sh (a one-liner). FROM openjdk:alpine RUN apk add --no-cache bash snappy ARG ZEPPELIN_VE

Re: Setting up zeppelin with flink

2016-08-25 Thread Frank Dekervel
Hello Trevor, Thanks for your suggestion. The log does not explain a lot: on the Flink side I don't see anything at all, on the Zeppelin side I see this: Your suggestion sounds plausible, as I always start Zeppelin and then change the configuration from local to remote. However, port 6123 locall

Re: Delaying starting the jobmanager in yarn?

2016-08-25 Thread Maximilian Michels
Hi Niels, If you're using 1.1.1, then you can instantiate the YarnClusterDescriptor, supply it with the Flink jar and configuration, and subsequently call `deploy()` on it to receive a ClusterClient for Yarn, to which you can submit programs using the `run(PackagedProgram program, String args)` meth

Re: "Failed to retrieve JobManager address" in Flink 1.1.1 with JM HA

2016-08-25 Thread Hironori Ogibayashi
Max, Thank you for the fix! Regards, Hironori 2016-08-24 18:37 GMT+09:00 Maximilian Michels : > Hi Hironori, > > That's what I thought. So it won't be an issue for most users who do > not comment out the JobManager url from the config. Still, the > information printed is not correct. The issue h

Re: Dealing with Multiple sinks in Flink

2016-08-25 Thread Maximilian Michels
I'm assuming there is something wrong with your Watermark/Timestamp assigner. Could you share some of the code? On Wed, Aug 24, 2016 at 9:54 PM, vinay patil wrote: > Hi, > > Just an update, the window is not getting triggered when I change the > parallelism to more than 1. > > Can you please expl

Re: Regarding Global Configuration in Flink

2016-08-25 Thread Maximilian Michels
Hi! Are you referring to the GlobalConfiguration class? That used to be a singleton class in Flink version < 1.2.x which would load the configuration only once per VM, if it found a config file. It allowed operations that could change that config after it had been loaded. It has since then been re

Running multiple jobs on yarn (without yarn-session)

2016-08-25 Thread Niels Basjes
Hi, I created a small application that needs to run multiple (batch) jobs on Yarn and then terminate. In this case I'm exporting data from a list of HBase tables. Right now I essentially do the following: flink run -m yarn-cluster -yn 10 bla.jar ... And in my main I do foreach thing I need to

Re: How to set a custom JAVA_HOME when run flink on YARN?

2016-08-25 Thread Maximilian Michels
Preferably, you set that directly in the config using env.java.home: /path/to/java/home If unset, Flink will use the $JAVA_HOME environment variable. Cheers, Max On Thu, Aug 25, 2016 at 10:39 AM, Renkai wrote: > I think I solved myself,just add -yD yarn.taskmanager.env.JAVA_HOME=xx in > the c

Re: How to set a custom JAVA_HOME when run flink on YARN?

2016-08-25 Thread Renkai
I think I solved it myself: just add -yD yarn.taskmanager.env.JAVA_HOME=xx on the command line. The solution was a little hard to find, though. -- View this message in context: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/How-to-set-a-custom-JAVA-HOME-when-run-flink-on-YARN-tp867
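For scripted submissions, the option mentioned above could be assembled programmatically. This is a minimal sketch under stated assumptions: the surrounding `flink run -m yarn-cluster -yn ...` invocation is borrowed from another thread in this archive, placing -yD before the jar is an assumption about option ordering, and the /opt/jdk8 path and function name are hypothetical.

```python
import shlex


def flink_run_command(jar, java_home, task_managers):
    """Assemble a `flink run` invocation that sets JAVA_HOME for YARN TaskManagers."""
    return [
        "flink", "run",
        "-m", "yarn-cluster",
        "-yn", str(task_managers),
        # Dynamic property quoted in the message above.
        "-yD", f"yarn.taskmanager.env.JAVA_HOME={java_home}",
        jar,
    ]


# Render the argument list as a single shell-safe command string.
command_line = shlex.join(flink_run_command("bla.jar", "/opt/jdk8", 10))
```

Building the command as a list and joining it with `shlex.join` keeps paths containing spaces correctly quoted.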

Delaying starting the jobmanager in yarn?

2016-08-25 Thread Niels Basjes
Hi, We have a situation where we need to start a flink batch job on a yarn cluster the moment an event arrives over a queue. These events occur at a very low rate (like once or twice a week). The idea we have is to run an application that listens to the queue and executes the batch when it receiv