[jira] [Commented] (FLINK-4150) Problem with Blobstore in Yarn HA setting on recovery after cluster shutdown

ASF GitHub Bot (JIRA) Wed, 20 Jul 2016 04:44:07 -0700

    [ 
https://issues.apache.org/jira/browse/FLINK-4150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15385713#comment-15385713
 ]


ASF GitHub Bot commented on FLINK-4150:
---------------------------------------

Github user tillrohrmann commented on a diff in the pull request:

    https://github.com/apache/flink/pull/2256#discussion_r71509039
  
    --- Diff: 
flink-runtime/src/main/java/org/apache/flink/runtime/execution/librarycache/BlobLibraryCacheManager.java
 ---
    @@ -77,7 +77,7 @@ public BlobLibraryCacheManager(BlobService blobService, 
long cleanupInterval) {
     
                // Initializing the clean up task
                this.cleanupTimer = new Timer(true);
    -           this.cleanupTimer.schedule(this, cleanupInterval);
    +           this.cleanupTimer.schedule(this, cleanupInterval, 
cleanupInterval);
    --- End diff --
    
    Good catch :+1:


> Problem with Blobstore in Yarn HA setting on recovery after cluster shutdown
> ----------------------------------------------------------------------------
>
>                 Key: FLINK-4150
>                 URL: https://issues.apache.org/jira/browse/FLINK-4150
>             Project: Flink
>          Issue Type: Bug
>          Components: Job-Submission
>            Reporter: Stefan Richter
>            Assignee: Ufuk Celebi
>            Priority: Blocker
>             Fix For: 1.1.0
>
>
> Submitting a job in Yarn with HA can lead to the following exception:
> {code}
> org.apache.flink.streaming.runtime.tasks.StreamTaskException: Cannot load 
> user class: org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer09
> ClassLoader info: URL ClassLoader:
>     file: 
> '/tmp/blobStore-ccec0f4a-3e07-455f-945b-4fcd08f5bac1/cache/blob_7fafffe9595cd06aff213b81b5da7b1682e1d6b0'
>  (invalid JAR: zip file is empty)
> Class not resolvable through given classloader.
>       at 
> org.apache.flink.streaming.api.graph.StreamConfig.getStreamOperator(StreamConfig.java:207)
>       at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:222)
>       at org.apache.flink.runtime.taskmanager.Task.run(Task.java:588)
>       at java.lang.Thread.run(Thread.java:745)
> {code}
> Some job information, including the Blob ids, are stored in Zookeeper. The 
> actual Blobs are stored in a dedicated BlobStore, if the recovery mode is set 
> to Zookeeper. This BlobStore is typically located in a FS like HDFS. When the 
> cluster is shut down, the path for the BlobStore is deleted. When the cluster 
> is then restarted, recovering jobs cannot restore because it's Blob ids 
> stored in Zookeeper now point to deleted files.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (FLINK-4150) Problem with Blobstore in Yarn HA setting on recovery after cluster shutdown

Reply via email to