This looks suspicious, but it should actually also be a consequence of a
failure or disconnect between the TaskManager and the JobManager.
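
If the disconnects come from long GC pauses or heavy load, raising the Akka
death-watch settings in flink-conf.yaml can help as a workaround. This is
just a guess at a mitigation, not a diagnosis; the keys are the ones from
the configuration page and the values below are only placeholders:

akka.watch.heartbeat.interval: 10 s
akka.watch.heartbeat.pause: 60 s
akka.ask.timeout: 100 s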

Can you send us the whole log so we can have a closer look?

Thanks,
Stephan


On Thu, May 21, 2015 at 10:59 AM, Flavio Pompermaier <pomperma...@okkam.it>
wrote:

> Could it be this the main failure reason?
>
> 09:45:58,650 WARN  akka.remote.ReliableDeliverySupervisor
>        - Association with remote system [akka.tcp://
> flink@192.168.234.83:6123] has failed, address is now gated for [5000]
> ms. Reason is: [Disassociated].
> 09:45:58,831 WARN  Remoting
>        - Tried to associate with unreachable remote address [akka.tcp://
> flink@192.168.234.83:6123]. Address is now gated for 5000 ms, all
> messages to this address will be delivered to dead letters. Reason: The
> remote system has quarantined this system. No further associations to the
> remote system are possible until this system is restarted.
> 09:45:58,889 INFO  org.apache.flink.runtime.taskmanager.TaskManager
>        - Disconnecting from JobManager: JobManager is no longer reachable
> 09:45:58,893 INFO  org.apache.flink.runtime.taskmanager.TaskManager
>        - Cancelling all computations and discarding all cached data.
>
>
> On Thu, May 21, 2015 at 9:57 AM, Stephan Ewen <se...@apache.org> wrote:
>
>> Hi!
>>
>> Interruptions usually happen as part of cancelling. Has the job failed
>> for some other reason (and that exception is only a follow-up)?
>> Or is this the root cause of the failure?
>>
>> Stephan
>>
>>
>> On Thu, May 21, 2015 at 9:55 AM, Flavio Pompermaier <pomperma...@okkam.it
>> > wrote:
>>
>>> Now I'm able to run my job, but after a while I get this other exception:
>>>
>>> 09:43:49,383 INFO  org.apache.flink.runtime.taskmanager.TaskManager
>>>          - Unregistering task and sending final execution state FINISHED to
>>> JobManager for task CHAIN DataSource (at
>>> createInput(ExecutionEnvironment.java:490)
>>> (org.apache.flink.api.java.hadoop.mapreduce.HadoopInputFormat)) -> FlatMap
>>> (FlatMap at readTuplesFromThriftParquet(MyParquetThriftClass.java:94))
>>> (a216cedd838190aebf3849fffe7fe576)
>>> 09:45:50,205 INFO  org.apache.flink.runtime.taskmanager.TaskManager
>>>          - Discarding the results produced by task execution
>>> c088f1c46c6e823cd9cc90f0e679696c
>>>  ERROR org.apache.flink.runtime.io.network.partition.ResultPartition  -
>>> Error during release of result subpartition: Closing of asynchronous file
>>> channel was interrupted.
>>> java.io.IOException: Closing of asynchronous file channel was
>>> interrupted.
>>>         at
>>> org.apache.flink.runtime.io.disk.iomanager.AsynchronousFileIOChannel.close(AsynchronousFileIOChannel.java:130)
>>>         at
>>> org.apache.flink.runtime.io.disk.iomanager.AsynchronousFileIOChannel.closeAndDelete(AsynchronousFileIOChannel.java:158)
>>>         at
>>> org.apache.flink.runtime.io.network.partition.SpillableSubpartition.release(SpillableSubpartition.java:130)
>>>         at
>>> org.apache.flink.runtime.io.network.partition.ResultPartition.release(ResultPartition.java:288)
>>>         at
>>> org.apache.flink.runtime.io.network.partition.ResultPartitionManager.releasePartitionsProducedBy(ResultPartitionManager.java:91)
>>>         at
>>> org.apache.flink.runtime.io.network.NetworkEnvironment.unregisterTask(NetworkEnvironment.java:329)
>>>         at org.apache.flink.runtime.taskmanager.Task.run(Task.java:648)
>>>         at java.lang.Thread.run(Thread.java:745)
>>>
>>> Any ideas?
>>>
>>> On Wed, May 20, 2015 at 9:11 PM, Flavio Pompermaier <
>>> pomperma...@okkam.it> wrote:
>>>
>>>> Thank you Stephan! I'll let you know tomorrow!
>>>> On May 20, 2015 7:30 PM, "Stephan Ewen" <se...@apache.org> wrote:
>>>>
>>>>> Hi!
>>>>>
>>>>> I pushed a fix to the master that should solve this.
>>>>>
>>>>> It will probably take a bit until the snapshot repositories are synced.
>>>>>
>>>>> Let me know if it fixed your issue!
>>>>>
>>>>> Greetings,
>>>>> Stephan
>>>>>
>>>>>
>>>>> On Wed, May 20, 2015 at 1:48 PM, Flavio Pompermaier <
>>>>> pomperma...@okkam.it> wrote:
>>>>>
>>>>>> Here it is:
>>>>>>
>>>>>> java.lang.RuntimeException: Requesting the next InputSplit failed.
>>>>>> at
>>>>>> org.apache.flink.runtime.taskmanager.TaskInputSplitProvider.getNextInputSplit(TaskInputSplitProvider.java:89)
>>>>>> at
>>>>>> org.apache.flink.runtime.operators.DataSourceTask$1.hasNext(DataSourceTask.java:340)
>>>>>> at
>>>>>> org.apache.flink.runtime.operators.DataSourceTask.invoke(DataSourceTask.java:139)
>>>>>> at org.apache.flink.runtime.taskmanager.Task.run(Task.java:562)
>>>>>> at java.lang.Thread.run(Thread.java:745)
>>>>>> Caused by: java.lang.RuntimeException: Unable to create InputSplit
>>>>>> at
>>>>>> org.apache.flink.api.java.hadoop.mapreduce.wrapper.HadoopInputSplit.readObject(HadoopInputSplit.java:108)
>>>>>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>>> at
>>>>>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>>>>> at
>>>>>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>>>> at java.lang.reflect.Method.invoke(Method.java:606)
>>>>>> at
>>>>>> java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017)
>>>>>> at
>>>>>> java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1893)
>>>>>> at
>>>>>> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
>>>>>> at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>>>>>> at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
>>>>>> at
>>>>>> org.apache.flink.util.InstantiationUtil.deserializeObject(InstantiationUtil.java:302)
>>>>>> at
>>>>>> org.apache.flink.runtime.taskmanager.TaskInputSplitProvider.getNextInputSplit(TaskInputSplitProvider.java:83)
>>>>>> ... 4 more
>>>>>> Caused by: java.lang.ClassNotFoundException:
>>>>>> parquet.hadoop.ParquetInputSplit
>>>>>> at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>>>>>> at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>>>>>> at java.security.AccessController.doPrivileged(Native Method)
>>>>>> at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>>>>>> at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
>>>>>> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
>>>>>> at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
>>>>>> at java.lang.Class.forName0(Native Method)
>>>>>> at java.lang.Class.forName(Class.java:190)
>>>>>> at
>>>>>> org.apache.flink.api.java.hadoop.mapreduce.wrapper.HadoopInputSplit.readObject(HadoopInputSplit.java:104)
>>>>>> ... 15 more
>>>>>>
>>>>>>
>>>>>> On Wed, May 20, 2015 at 1:45 PM, Stephan Ewen <se...@apache.org>
>>>>>> wrote:
>>>>>>
>>>>>>> This is a bug in the HadoopInputSplit. It does not follow the
>>>>>>> general class loading rules in Flink. I think it is pretty
>>>>>>> straightforward to fix; I'll give it a quick shot...
>>>>>>>
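>>>>>>> For context, the general rule in Flink is that classes from the user
>>>>>>> jar must be resolved against the user-code class loader. A minimal
>>>>>>> sketch of the idea (illustrative only, not the actual patch; it assumes
>>>>>>> the user-code class loader is installed as the context class loader of
>>>>>>> the task thread):
>>>>>>>
>>>>>>> import org.apache.hadoop.mapreduce.InputSplit;
>>>>>>>
>>>>>>> public class SplitLoadingSketch {
>>>>>>>    // Resolve the split class via the user-code class loader rather
>>>>>>>    // than the loader that loaded the framework classes; the latter is
>>>>>>>    // what produces the ClassNotFoundException for
>>>>>>>    // parquet.hadoop.ParquetInputSplit.
>>>>>>>    static InputSplit instantiateSplit(String splitClassName) throws Exception {
>>>>>>>       ClassLoader userCodeLoader = Thread.currentThread().getContextClassLoader();
>>>>>>>       Class<?> splitClass = Class.forName(splitClassName, true, userCodeLoader);
>>>>>>>       return (InputSplit) splitClass.newInstance();
>>>>>>>    }
>>>>>>> }
>>>>>>>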
>>>>>>> Can you send me the entire stack trace (where the serialization call
>>>>>>> comes from) to verify this?
>>>>>>>
>>>>>>> On Wed, May 20, 2015 at 12:03 PM, Flavio Pompermaier <
>>>>>>> pomperma...@okkam.it> wrote:
>>>>>>>
>>>>>>>> Now I'm able to run the job, but I get another exception.. this time
>>>>>>>> it seems that Flink cannot load the class for my Parquet input splits:
>>>>>>>>
>>>>>>>> Caused by: java.lang.ClassNotFoundException:
>>>>>>>> parquet.hadoop.ParquetInputSplit
>>>>>>>> at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>>>>>>>> at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>>>>>>>> at java.security.AccessController.doPrivileged(Native Method)
>>>>>>>> at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>>>>>>>> at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
>>>>>>>> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
>>>>>>>> at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
>>>>>>>> at java.lang.Class.forName0(Native Method)
>>>>>>>> at java.lang.Class.forName(Class.java:190)
>>>>>>>> at
>>>>>>>> org.apache.flink.api.java.hadoop.mapreduce.wrapper.HadoopInputSplit.readObject(HadoopInputSplit.java:104)
>>>>>>>>
>>>>>>>> I checked the jar and that class is present in my "fat" jar.
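>>>>>>>> For the record, I checked with something like the following (the jar
>>>>>>>> name is just the <finalName> from my shade config):
>>>>>>>>
>>>>>>>> jar tf target/XXXX.jar | grep ParquetInputSplit
>>>>>>>>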
>>>>>>>> What should I do now?
>>>>>>>>
>>>>>>>>
>>>>>>>> On Wed, May 20, 2015 at 10:57 AM, Flavio Pompermaier <
>>>>>>>> pomperma...@okkam.it> wrote:
>>>>>>>>
>>>>>>>>> Yes, it could be that the jar classes and those on the cluster have
>>>>>>>>> not been aligned for some days.. Now I'll recompile both sides, and
>>>>>>>>> if I still have the error I will change line 42 as you suggested.
>>>>>>>>> Thanks Max
>>>>>>>>>
>>>>>>>>> On Wed, May 20, 2015 at 10:53 AM, Maximilian Michels <
>>>>>>>>> m...@apache.org> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Flavio,
>>>>>>>>>>
>>>>>>>>>> It would be helpful if we knew which class could not be found. In
>>>>>>>>>> the ClosureCleaner, can you change line 42 to include the class name
>>>>>>>>>> in the error message? Like in this example:
>>>>>>>>>>
>>>>>>>>>> private static ClassReader getClassReader(Class<?> cls) {
>>>>>>>>>>    String className = cls.getName().replaceFirst("^.*\\.", "") + ".class";
>>>>>>>>>>    try {
>>>>>>>>>>       return new ClassReader(cls.getResourceAsStream(className));
>>>>>>>>>>    } catch (IOException e) {
>>>>>>>>>>       throw new RuntimeException("Could not create ClassReader for class " + cls.getName() + ":" + e);
>>>>>>>>>>    }
>>>>>>>>>> }
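>>>>>>>>>>
>>>>>>>>>> (As far as I know, the ASM ClassReader throws exactly that "Class
>>>>>>>>>> not found" IOException when the input stream is null, i.e. when the
>>>>>>>>>> .class resource is not visible on the classpath, so the class name
>>>>>>>>>> should tell us what is missing.)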
>>>>>>>>>>
>>>>>>>>>> Could it be that you're running an old job on the latest snapshot
>>>>>>>>>> version? This could cause class-related problems...
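>>>>>>>>>>
>>>>>>>>>> If so, a full rebuild that forces Maven to refresh the snapshot
>>>>>>>>>> dependencies should rule that out, e.g.:
>>>>>>>>>>
>>>>>>>>>> mvn clean package -U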
>>>>>>>>>>
>>>>>>>>>> Cheers,
>>>>>>>>>> Max
>>>>>>>>>>
>>>>>>>>>> On Wed, May 20, 2015 at 9:41 AM, Flavio Pompermaier <
>>>>>>>>>> pomperma...@okkam.it> wrote:
>>>>>>>>>>
>>>>>>>>>>> Any insight about this..?
>>>>>>>>>>>
>>>>>>>>>>> On Tue, May 19, 2015 at 12:49 PM, Flavio Pompermaier <
>>>>>>>>>>> pomperma...@okkam.it> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi to all,
>>>>>>>>>>>>
>>>>>>>>>>>> I tried to run my job on a brand new Flink cluster
>>>>>>>>>>>> (0.9-SNAPSHOT) from the web client UI using the shading strategy 
>>>>>>>>>>>> of the
>>>>>>>>>>>> quickstart example but I get this exception:
>>>>>>>>>>>>
>>>>>>>>>>>> Caused by: java.lang.RuntimeException: Could not create
>>>>>>>>>>>> ClassReader: java.io.IOException: Class not found
>>>>>>>>>>>> at
>>>>>>>>>>>> org.apache.flink.api.java.ClosureCleaner.getClassReader(ClosureCleaner.java:42)
>>>>>>>>>>>> at
>>>>>>>>>>>> org.apache.flink.api.java.ClosureCleaner.cleanThis0(ClosureCleaner.java:67)
>>>>>>>>>>>> at
>>>>>>>>>>>> org.apache.flink.api.java.ClosureCleaner.clean(ClosureCleaner.java:54)
>>>>>>>>>>>>
>>>>>>>>>>>> It seems that it cannot find some Kryo class.. how do I fix this?
>>>>>>>>>>>> This is the shade plugin section of my pom.xml:
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> <plugin>
>>>>>>>>>>>> <groupId>org.apache.maven.plugins</groupId>
>>>>>>>>>>>> <artifactId>maven-shade-plugin</artifactId>
>>>>>>>>>>>> <version>1.4</version>
>>>>>>>>>>>> <executions>
>>>>>>>>>>>> <execution>
>>>>>>>>>>>> <phase>package</phase>
>>>>>>>>>>>> <goals>
>>>>>>>>>>>> <goal>shade</goal>
>>>>>>>>>>>> </goals>
>>>>>>>>>>>> <configuration>
>>>>>>>>>>>> <artifactSet>
>>>>>>>>>>>> <excludes>
>>>>>>>>>>>> <!-- This list contains all dependencies of flink-dist.
>>>>>>>>>>>> Everything else will be packaged into the fat-jar -->
>>>>>>>>>>>> <exclude>org.apache.flink:flink-shaded-*</exclude>
>>>>>>>>>>>> <exclude>org.apache.flink:flink-core</exclude>
>>>>>>>>>>>> <exclude>org.apache.flink:flink-java</exclude>
>>>>>>>>>>>> <exclude>org.apache.flink:flink-scala</exclude>
>>>>>>>>>>>> <exclude>org.apache.flink:flink-runtime</exclude>
>>>>>>>>>>>> <exclude>org.apache.flink:flink-optimizer</exclude>
>>>>>>>>>>>> <exclude>org.apache.flink:flink-clients</exclude>
>>>>>>>>>>>> <exclude>org.apache.flink:flink-spargel</exclude>
>>>>>>>>>>>> <exclude>org.apache.flink:flink-avro</exclude>
>>>>>>>>>>>> <exclude>org.apache.flink:flink-java-examples</exclude>
>>>>>>>>>>>> <exclude>org.apache.flink:flink-scala-examples</exclude>
>>>>>>>>>>>> <exclude>org.apache.flink:flink-streaming-examples</exclude>
>>>>>>>>>>>> <exclude>org.apache.flink:flink-streaming-core</exclude>
>>>>>>>>>>>>
>>>>>>>>>>>> <!-- Also exclude very big transitive dependencies of Flink.
>>>>>>>>>>>> WARNING: You have to remove these excludes if your code relies on
>>>>>>>>>>>> other versions of these dependencies. -->
>>>>>>>>>>>> <exclude>org.scala-lang:scala-library</exclude>
>>>>>>>>>>>> <exclude>org.scala-lang:scala-compiler</exclude>
>>>>>>>>>>>> <exclude>org.scala-lang:scala-reflect</exclude>
>>>>>>>>>>>> <exclude>com.amazonaws:aws-java-sdk</exclude>
>>>>>>>>>>>> <exclude>com.typesafe.akka:akka-actor_*</exclude>
>>>>>>>>>>>> <exclude>com.typesafe.akka:akka-remote_*</exclude>
>>>>>>>>>>>> <exclude>com.typesafe.akka:akka-slf4j_*</exclude>
>>>>>>>>>>>> <exclude>io.netty:netty-all</exclude>
>>>>>>>>>>>> <exclude>io.netty:netty</exclude>
>>>>>>>>>>>> <exclude>org.eclipse.jetty:jetty-server</exclude>
>>>>>>>>>>>> <exclude>org.eclipse.jetty:jetty-continuation</exclude>
>>>>>>>>>>>> <exclude>org.eclipse.jetty:jetty-http</exclude>
>>>>>>>>>>>> <exclude>org.eclipse.jetty:jetty-io</exclude>
>>>>>>>>>>>> <exclude>org.eclipse.jetty:jetty-util</exclude>
>>>>>>>>>>>> <exclude>org.eclipse.jetty:jetty-security</exclude>
>>>>>>>>>>>> <exclude>org.eclipse.jetty:jetty-servlet</exclude>
>>>>>>>>>>>> <exclude>commons-fileupload:commons-fileupload</exclude>
>>>>>>>>>>>> <exclude>org.apache.avro:avro</exclude>
>>>>>>>>>>>> <exclude>commons-collections:commons-collections</exclude>
>>>>>>>>>>>> <exclude>org.codehaus.jackson:jackson-core-asl</exclude>
>>>>>>>>>>>> <exclude>org.codehaus.jackson:jackson-mapper-asl</exclude>
>>>>>>>>>>>> <exclude>com.thoughtworks.paranamer:paranamer</exclude>
>>>>>>>>>>>> <exclude>org.xerial.snappy:snappy-java</exclude>
>>>>>>>>>>>> <exclude>org.apache.commons:commons-compress</exclude>
>>>>>>>>>>>> <exclude>org.tukaani:xz</exclude>
>>>>>>>>>>>> <exclude>com.esotericsoftware.kryo:kryo</exclude>
>>>>>>>>>>>> <exclude>com.esotericsoftware.minlog:minlog</exclude>
>>>>>>>>>>>> <exclude>org.objenesis:objenesis</exclude>
>>>>>>>>>>>> <exclude>com.twitter:chill_*</exclude>
>>>>>>>>>>>> <exclude>com.twitter:chill-java</exclude>
>>>>>>>>>>>> <exclude>com.twitter:chill-avro_*</exclude>
>>>>>>>>>>>> <exclude>com.twitter:chill-bijection_*</exclude>
>>>>>>>>>>>> <exclude>com.twitter:bijection-core_*</exclude>
>>>>>>>>>>>> <exclude>com.twitter:bijection-avro_*</exclude>
>>>>>>>>>>>> <exclude>com.twitter:chill-protobuf</exclude>
>>>>>>>>>>>> <exclude>com.google.protobuf:protobuf-java</exclude>
>>>>>>>>>>>> <exclude>com.twitter:chill-thrift</exclude>
>>>>>>>>>>>> <exclude>org.apache.thrift:libthrift</exclude>
>>>>>>>>>>>> <exclude>commons-lang:commons-lang</exclude>
>>>>>>>>>>>> <exclude>junit:junit</exclude>
>>>>>>>>>>>> <exclude>de.javakaffee:kryo-serializers</exclude>
>>>>>>>>>>>> <exclude>joda-time:joda-time</exclude>
>>>>>>>>>>>> <exclude>org.apache.commons:commons-lang3</exclude>
>>>>>>>>>>>> <exclude>org.slf4j:slf4j-api</exclude>
>>>>>>>>>>>> <exclude>org.slf4j:slf4j-log4j12</exclude>
>>>>>>>>>>>> <exclude>log4j:log4j</exclude>
>>>>>>>>>>>> <exclude>org.apache.commons:commons-math</exclude>
>>>>>>>>>>>> <exclude>org.apache.sling:org.apache.sling.commons.json</exclude>
>>>>>>>>>>>> <exclude>commons-logging:commons-logging</exclude>
>>>>>>>>>>>> <exclude>org.apache.httpcomponents:httpclient</exclude>
>>>>>>>>>>>> <exclude>org.apache.httpcomponents:httpcore</exclude>
>>>>>>>>>>>> <exclude>commons-codec:commons-codec</exclude>
>>>>>>>>>>>> <exclude>com.fasterxml.jackson.core:jackson-core</exclude>
>>>>>>>>>>>> <exclude>com.fasterxml.jackson.core:jackson-databind</exclude>
>>>>>>>>>>>> <exclude>com.fasterxml.jackson.core:jackson-annotations</exclude>
>>>>>>>>>>>> <exclude>org.codehaus.jettison:jettison</exclude>
>>>>>>>>>>>> <exclude>stax:stax-api</exclude>
>>>>>>>>>>>> <exclude>com.typesafe:config</exclude>
>>>>>>>>>>>> <exclude>org.uncommons.maths:uncommons-maths</exclude>
>>>>>>>>>>>> <exclude>com.github.scopt:scopt_*</exclude>
>>>>>>>>>>>> <exclude>org.mortbay.jetty:servlet-api</exclude>
>>>>>>>>>>>> <exclude>commons-io:commons-io</exclude>
>>>>>>>>>>>> <exclude>commons-cli:commons-cli</exclude>
>>>>>>>>>>>> </excludes>
>>>>>>>>>>>> </artifactSet>
>>>>>>>>>>>> <filters>
>>>>>>>>>>>> <filter>
>>>>>>>>>>>> <artifact>org.apache.flink:*</artifact>
>>>>>>>>>>>> <excludes>
>>>>>>>>>>>> <exclude>org/apache/flink/shaded/**</exclude>
>>>>>>>>>>>> <exclude>web-docs/**</exclude>
>>>>>>>>>>>> </excludes>
>>>>>>>>>>>> </filter>
>>>>>>>>>>>> </filters>
>>>>>>>>>>>> <createDependencyReducedPom>false</createDependencyReducedPom>
>>>>>>>>>>>> <finalName>XXXX</finalName>
>>>>>>>>>>>> <transformers>
>>>>>>>>>>>> <!-- add Main-Class to manifest file -->
>>>>>>>>>>>> <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
>>>>>>>>>>>> <manifestEntries>
>>>>>>>>>>>> <Main-Class>XXX</Main-Class>
>>>>>>>>>>>> </manifestEntries>
>>>>>>>>>>>> </transformer>
>>>>>>>>>>>> </transformers>
>>>>>>>>>>>> </configuration>
>>>>>>>>>>>> </execution>
>>>>>>>>>>>> </executions>
>>>>>>>>>>>> </plugin>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>
>>>
>>
>
