Sorry for the confusion, the Flink cluster throws the following stack trace:
org.apache.flink.client.program.ProgramInvocationException: The program execution failed: Failed to submit job 29a2a25d49aa0706588ccdb8b7e81c6c (Flink Java Job at Sun Nov 08 18:50:52 UTC 2015)
    at org.apache.flink.client.program.Client.run(Client.java:413)
    at org.apache.flink.client.program.Client.run(Client.java:356)
    at org.apache.flink.client.program.Client.run(Client.java:349)
    at org.apache.flink.client.program.ContextEnvironment.execute(ContextEnvironment.java:63)
    at org.apache.flink.api.java.ExecutionEnvironment.execute(ExecutionEnvironment.java:789)
    at de.fraunhofer.iese.proopt.Template.run(Template.java:112)
    at de.fraunhofer.iese.proopt.Main.main(Main.java:8)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:437)
    at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:353)
    at org.apache.flink.client.program.Client.run(Client.java:315)
    at org.apache.flink.client.CliFrontend.executeProgram(CliFrontend.java:582)
    at org.apache.flink.client.CliFrontend.run(CliFrontend.java:288)
    at org.apache.flink.client.CliFrontend.parseParameters(CliFrontend.java:878)
    at org.apache.flink.client.CliFrontend.main(CliFrontend.java:920)
Caused by: org.apache.flink.runtime.client.JobExecutionException: Failed to submit job 29a2a25d49aa0706588ccdb8b7e81c6c (Flink Java Job at Sun Nov 08 18:50:52 UTC 2015)
    at org.apache.flink.runtime.jobmanager.JobManager.org$apache$flink$runtime$jobmanager$JobManager$$submitJob(JobManager.scala:594)
    at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$receiveWithLogMessages$1.applyOrElse(JobManager.scala:190)
    at scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)
    at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)
    at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)
    at org.apache.flink.runtime.ActorLogMessages$$anon$1.apply(ActorLogMessages.scala:36)
    at org.apache.flink.runtime.ActorLogMessages$$anon$1.apply(ActorLogMessages.scala:29)
    at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:118)
    at org.apache.flink.runtime.ActorLogMessages$$anon$1.applyOrElse(ActorLogMessages.scala:29)
    at akka.actor.Actor$class.aroundReceive(Actor.scala:465)
    at org.apache.flink.runtime.jobmanager.JobManager.aroundReceive(JobManager.scala:92)
    at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
    at akka.actor.ActorCell.invoke(ActorCell.scala:487)
    at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254)
    at akka.dispatch.Mailbox.run(Mailbox.scala:221)
    at akka.dispatch.Mailbox.exec(Mailbox.scala:231)
    at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
    at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
    at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
    at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Caused by: org.apache.flink.runtime.JobException: Creating the input splits caused an error: No file system found with scheme s3n, referenced in file URI 's3n://big-data-benchmark/pavlo/text/tiny/rankings'.
    at org.apache.flink.runtime.executiongraph.ExecutionJobVertex.<init>(ExecutionJobVertex.java:162)
    at org.apache.flink.runtime.executiongraph.ExecutionGraph.attachJobGraph(ExecutionGraph.java:469)
    at org.apache.flink.runtime.jobmanager.JobManager.org$apache$flink$runtime$jobmanager$JobManager$$submitJob(JobManager.scala:534)
    ... 19 more
Caused by: java.io.IOException: No file system found with scheme s3n, referenced in file URI 's3n://big-data-benchmark/pavlo/text/tiny/rankings'.
    at org.apache.flink.core.fs.FileSystem.get(FileSystem.java:247)
    at org.apache.flink.core.fs.Path.getFileSystem(Path.java:309)
    at org.apache.flink.api.common.io.FileInputFormat.createInputSplits(FileInputFormat.java:447)
    at org.apache.flink.api.common.io.FileInputFormat.createInputSplits(FileInputFormat.java:57)
    at org.apache.flink.runtime.executiongraph.ExecutionJobVertex.<init>(ExecutionJobVertex.java:146)
    ... 21 more

--
Best regards

Thomas Götzinger
Freelance Software Engineer

Glockenstraße 2a
D-66882 Hütschenhausen OT Spesbach
Mobile: +49 (0)176 82180714
Private: +49 (0)6371 954050
mailto:m...@simplydevelop.de
epost: thomas.goetzin...@epost.de

> On 08.11.2015, at 19:06, Thomas Götzinger <m...@simplydevelop.de> wrote:
>
> Hi Fabian,
>
> thanks for the reply. I use a Karamel recipe to install Flink on EC2. Currently I am using flink-0.9.1-bin-hadoop24.tgz <http://apache.mirrors.spacedump.net/flink/flink-0.9.1/flink-0.9.1-bin-hadoop24.tgz>.
>
> In that file the NativeS3FileSystem is included. First I tried the standard Karamel recipe on GitHub, hopshadoop/flink-chef <https://github.com/hopshadoop/flink-chef>, but it is on version 0.9.0 and does not include the S3NFileSystem.
> So I forked the GitHub project as goetzingert/flink-chef.
> Although the class file is included, the application throws a ClassNotFoundException for the class above.
> In my project I added to conf/core-site.xml:
>
> <property>
>   <name>fs.s3n.impl</name>
>   <value>org.apache.hadoop.fs.s3native.NativeS3FileSystem</value>
> </property>
> <property>
>   <name>fs.s3n.awsAccessKeyId</name>
>   <value>….</value>
> </property>
> <property>
>   <name>fs.s3n.awsSecretAccessKey</name>
>   <value>...</value>
> </property>
>
> I also tried the programmatic configuration:
>
> XMLConfiguration config = new XMLConfiguration(configPath);
>
> env = ExecutionEnvironment.getExecutionEnvironment();
> Configuration configuration = GlobalConfiguration.getConfiguration();
> configuration.setString("fs.s3.impl", "org.apache.hadoop.fs.s3native.NativeS3FileSystem");
> configuration.setString("fs.s3n.awsAccessKeyId", "..");
> configuration.setString("fs.s3n.awsSecretAccessKey", "../");
> configuration.setString("fs.hdfs.hdfssite", Template.class.getResource("/conf/core-site.xml").toString());
> GlobalConfiguration.includeConfiguration(configuration);
>
> Any idea why the class is not on the classpath? Is there another script to set up Flink on an EC2 cluster?
>
> When will Flink 0.10 be released?
>
> Regards
>
> Thomas Götzinger
>
>> On 29.10.2015, at 09:47, Fabian Hueske <fhue...@gmail.com> wrote:
>>
>> Hi Thomas,
>>
>> until recently, Flink provided its own implementation of an S3 file system, which wasn't fully tested and was buggy.
>> We removed that implementation and are now (in 0.10-SNAPSHOT) using Hadoop's S3 implementation by default.
>>
>> If you want to continue using 0.9.1, you can configure Flink to use Hadoop's implementation. See this answer on StackOverflow and the linked email thread [1].
>> If you switch to the 0.10-SNAPSHOT version (which will be released in a few days as 0.10.0), things become a bit easier and Hadoop's implementation is used by default. The documentation shows how to configure your access keys [2].
>>
>> Please don't hesitate to ask if something is unclear or not working.
>>
>> Best, Fabian
>>
>> [1] http://stackoverflow.com/questions/32959790/run-apache-flink-with-amazon-s3
>> [2] https://ci.apache.org/projects/flink/flink-docs-master/apis/example_connectors.html
>>
>> 2015-10-29 9:35 GMT+01:00 Thomas Götzinger <m...@simplydevelop.de>:
>> Hello Flink Team,
>>
>> we at Fraunhofer IESE are evaluating Flink for a project, and I'm a bit frustrated at the moment.
>>
>> I wrote a few test cases with the Flink API and want to deploy them to a Flink EC2 cluster. I set up the cluster using the Karamel recipe presented in the following video:
>>
>> http://www.youtube.com/watch?v=m_SkhyMV0to
>>
>> The setup works fine and the hello-flink app runs. But afterwards I wanted to copy some data from an S3 bucket to the local EC2 HDFS cluster.
>>
>> hadoop fs -ls s3n://... works, as do cat and similar commands.
>> But if I try to copy the data with distcp, the command freezes and does not respond until a timeout.
>> After trying a few things I gave up and tried another approach. I want to access the S3 bucket directly with Flink and import the data using a small Flink program which just reads from S3 and writes to local Hadoop. This works fine locally, but on the cluster the S3NFileSystem class is missing (ClassNotFoundException), although it is included in the jar file of the installation.
>>
>> I forked the Chef recipe and updated it to Flink 0.9.1, but hit the same issue.
>>
>> Is there another simple script to install Flink with Hadoop on an EC2 cluster and a working s3n filesystem?
>>
>> Freelancer
>> on behalf of Fraunhofer IESE Kaiserslautern
>>
>> --
>> Best regards
>>
>> Thomas Götzinger
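[Editor's note: for readers who land on this thread with the same "No file system found with scheme s3n" error on Flink 0.9.x, the configuration discussed above can be sketched as follows. This is a hedged sketch based on the StackOverflow answer Fabian links as [1], not a verified setup for this exact cluster: the directory path and the credential values are placeholders. Note also that the programmatic snippet quoted earlier in the thread sets the key "fs.s3.impl", while the failing URI uses the s3n scheme, which is configured through the "fs.s3n.*" keys shown below.]

```yaml
# conf/flink-conf.yaml -- tell Flink where to find the Hadoop configuration
# so it can delegate s3n:// URIs to Hadoop's NativeS3FileSystem.
# The path is a placeholder; point it at your cluster's Hadoop conf directory.
fs.hdfs.hadoopconf: /srv/hadoop/conf
```

```xml
<!-- /srv/hadoop/conf/core-site.xml (placeholder path) -->
<!-- Credential values are placeholders; fill in real AWS keys. -->
<configuration>
  <property>
    <name>fs.s3n.impl</name>
    <value>org.apache.hadoop.fs.s3native.NativeS3FileSystem</value>
  </property>
  <property>
    <name>fs.s3n.awsAccessKeyId</name>
    <value>YOUR_ACCESS_KEY</value>
  </property>
  <property>
    <name>fs.s3n.awsSecretAccessKey</name>
    <value>YOUR_SECRET_KEY</value>
  </property>
</configuration>
```

With this in place, job paths such as s3n://big-data-benchmark/pavlo/text/tiny/rankings should resolve through Hadoop's implementation; on 0.10.0 and later, per Fabian's reply, the Hadoop implementation is used by default and only the credential keys are needed.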