Re: Flink Kafka cannot find org/I0Itec/zkclient/serialize/ZkSerializer

2015-07-24 Thread Wendong
Below is the build.sbt I am using (I also include project/assembly.sbt): //- Start build.sbt --- version := "1.0" scalaVersion := "2.10.4" libraryDependencies ++= Seq("org.apache.flink" % "flink-scala" % "0.9.0", "org.apache.flink" % "flink-clients" % "0.9.0")
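
For comparison, a minimal sketch of what such a build could look like with the Kafka connector and sbt-assembly is below. The connector line, the plugin version, and the merge strategy are assumptions for illustration, not taken from the thread; the Kafka connector transitively pulls in com.101tec:zkclient, which is where org.I0Itec.zkclient.serialize.ZkSerializer lives.

//--- Start build.sbt (sketch) ---
version := "1.0"

scalaVersion := "2.10.4"

libraryDependencies ++= Seq(
  "org.apache.flink" % "flink-scala"           % "0.9.0",
  "org.apache.flink" % "flink-clients"         % "0.9.0",
  // assumed addition: pulls in the Kafka connector and, transitively, zkclient
  "org.apache.flink" % "flink-connector-kafka" % "0.9.0"
)

// resolve duplicate META-INF entries when "sbt assembly" builds the fat jar
assemblyMergeStrategy in assembly := {
  case PathList("META-INF", xs @ _*) => MergeStrategy.discard
  case _                             => MergeStrategy.first
}
//--- End build.sbt (sketch) ---

//--- Start project/assembly.sbt (sketch, plugin version assumed) ---
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.13.0")
//--- End project/assembly.sbt (sketch) ---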

Re: Yarn configuration

2015-07-24 Thread Robert Metzger
Hi Michele, configuring a YARN cluster to allocate all available resources as well as possible is sometimes tricky, that is true. We are aware of these problems, and there are actually the following two JIRAs for this: https://issues.apache.org/jira/browse/FLINK-937 (Change the YARN Client to alloc

Yarn configuration

2015-07-24 Thread Michele Bertoni
Hi everybody, I need help on how to configure a YARN cluster. I have tried a lot of configurations, but none of them was correct. We have a cluster on Amazon EMR, let's say 1 manager + 5 workers, all of them m3.2xlarge, so 8 cores and 30 GB of RAM each. What is a good configuration for such a cluster? I would l

Re: starting flink job from bash script with maven

2015-07-24 Thread Stephan Ewen
Thanks for letting us know! The problem with Java Serialization is that it often swallows exceptions and you only see a "corrupted byte stream" in the end. So far, I have found no workaround for that. Stephan On Fri, Jul 24, 2015 at 11:31 AM, Stefano Bortoli wrote: > It seems there is a prob

Re: starting flink job from bash script with maven

2015-07-24 Thread Stefano Bortoli
It seems there is a problem with the Maven class loading. I created the uberjar and then executed it with a traditional java -cp uberjar.jar args, and it worked with no problems. It could be interesting to investigate the reason, as maven exec is very convenient. However, with the uberjar the proble

Re: starting flink job from bash script with maven

2015-07-24 Thread Stefano Bortoli
Hi Stephan, I think I may have found a possible root of the problem. I do not build the fat jar; I simply execute the main with maven exec:java with the default install and compile. No uberjar is created, no shading. I will try that and report. The fact that it runs in Eclipse so easily makes it confusing so

Re: Flink Kafka cannot find org/I0Itec/zkclient/serialize/ZkSerializer

2015-07-24 Thread Robert Metzger
Can you share your full sbt build file with me? I'm trying to reproduce the issue, but I have never used sbt before. I was able to configure the assembly plugin, but the produced fat jar didn't contain the zkclient. Maybe your full sbt build file would help me to identify the issue faster. Let me

Re: Flink Kafka cannot find org/I0Itec/zkclient/serialize/ZkSerializer

2015-07-24 Thread Stephan Ewen
Wendong, sorry to hear that you are having such trouble with the example. We are using the Kafka connector with many people, building the examples with Maven, and it works without any problems. Maybe SBT is just not handling these dependencies correctly, or the SBT script defines the dependencies in

Re: starting flink job from bash script with maven

2015-07-24 Thread Stephan Ewen
Hi! There is probably something going wrong in MongoOutputFormat or MongoHadoop2OutputFormat. Something fails, but Java swallows the problem during serialization. It may be a classloading issue that does not get reported. Are the MongoOutputFormat and the MongoHadoop2OutputFormat both in the fat jar?

Re: starting flink job from bash script with maven

2015-07-24 Thread Stefano Bortoli
I have implemented this test without any exception: package org.tagcloud.persistence.batch.test; import java.io.IOException; import org.apache.commons.lang.SerializationUtils; import org.apache.hadoop.mapreduce.Job; import org.tagcloud.persistence.batch.MongoHadoop2OutputFormat; import com.mong
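
The preview cuts the test off; as a rough reconstruction of the idea (not Stefano's actual code), a standalone round-trip check over any user-code object could look like the Scala sketch below, reusing the org.apache.commons.lang.SerializationUtils that appears in the imports. The Demo object and its String payload are placeholders; in the thread, the object under test would be the MongoHadoop2OutputFormat wrapper.

import org.apache.commons.lang.SerializationUtils

object SerializationCheck {
  // Round-trips a user-code object through Java serialization (serialize + deserialize).
  // If the object cannot be serialized, this fails right away with the real cause,
  // instead of a hard-to-trace error later inside the Flink job client.
  def roundTrip[T <: java.io.Serializable](obj: T): T =
    SerializationUtils.clone(obj).asInstanceOf[T]
}

object Demo extends App {
  // placeholder payload; replace with the output format instance under test
  println(SerializationCheck.roundTrip("some user-code object"))
}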

Re: starting flink job from bash script with maven

2015-07-24 Thread Stephan Ewen
Hi! The user code object (the output format here) has a corrupt serialization routine. We use default Java Serialization for these objects. Either the MongoHadoopOutputFormat cannot be serialized and swallows an exception, or it overrides the readObject() / writeObject() methods (from Java Serial
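
To make that failure mode concrete, here is a small hypothetical Scala sketch (not the actual Mongo format): a class whose custom writeObject() swallows an exception, so the error only surfaces later as a confusing stream error during deserialization.

import java.io._

class BrokenFormat extends Serializable {
  private def writeObject(out: ObjectOutputStream): Unit = {
    try {
      out.defaultWriteObject()
      // pretend that writing some additional state fails here
      throw new IOException("could not serialize internal state")
    } catch {
      case _: IOException => // swallowed: the stream is now missing data
    }
  }

  private def readObject(in: ObjectInputStream): Unit = {
    in.defaultReadObject()
    in.readObject() // expects the extra state that was never written
  }
}

object BrokenFormatDemo extends App {
  val bos = new ByteArrayOutputStream()
  val oos = new ObjectOutputStream(bos)
  oos.writeObject(new BrokenFormat) // the real failure is silently dropped here
  oos.close()

  val ois = new ObjectInputStream(new ByteArrayInputStream(bos.toByteArray))
  ois.readObject() // fails here with a confusing stream error, far from the real cause
}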

starting flink job from bash script with maven

2015-07-24 Thread Stefano Bortoli
Hi guys! I managed to write a data maintenance job using Flink on MongoDB. The job runs smoothly if I start it from Eclipse. However, when I try to run it using a bash script invoking maven exec:java, I get a serialization exception: org.apache.flink.runtime.client.JobExecutionException: Cannot ini