Hi.

We sometimes see a job fail with a blob store exception like the one below.
Does anyone have an idea why we get them, and how to avoid them?
In this case the job ran without any problems for a week before we
got the error. Only this job is affected right now; all the others are
running as expected, but next time it could be one of the other jobs
that gets the exception.

We are running Flink 1.4.2 on an AWS EMR cluster, but we have seen the
same problem on 1.3.2 too.

java.io.IOException: Failed to fetch BLOB ff5d324719fb4caf3a0dba3fbcfa795e/p-812d84ea013302dbd24da1d32e732cc01582dabc-3198b6f63d293d2756f4cf5b8eebe7a2 from ip-10-1-1-192.eu-west-1.compute.internal/10.1.1.192:46781 and store it under /tmp/blobStore-3e90d7b0-2f40-4e28-b2b0-01d9ba96ac55/incoming/temp-00000173
        at org.apache.flink.runtime.blob.BlobClient.downloadFromBlobServer(BlobClient.java:191)
        at org.apache.flink.runtime.blob.AbstractBlobCache.getFileInternal(AbstractBlobCache.java:177)
        at org.apache.flink.runtime.blob.PermanentBlobCache.getFile(PermanentBlobCache.java:205)
        at org.apache.flink.runtime.execution.librarycache.BlobLibraryCacheManager.registerTask(BlobLibraryCacheManager.java:119)
        at org.apache.flink.runtime.taskmanager.Task.createUserCodeClassloader(Task.java:878)
        at org.apache.flink.runtime.taskmanager.Task.run(Task.java:589)
        at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: GET operation failed: Server side error: /tmp/blobStore-a83b8ca6-c01a-496a-8997-31687f37b95d/incoming/temp-00049050
        at org.apache.flink.runtime.blob.BlobClient.getInternal(BlobClient.java:253)
        at org.apache.flink.runtime.blob.BlobClient.downloadFromBlobServer(BlobClient.java:166)
        ... 6 more
Caused by: java.io.IOException: Server side error: /tmp/blobStore-a83b8ca6-c01a-496a-8997-31687f37b95d/incoming/temp-00049050
        at org.apache.flink.runtime.blob.BlobClient.receiveAndCheckGetResponse(BlobClient.java:306)
        at org.apache.flink.runtime.blob.BlobClient.getInternal(BlobClient.java:247)
        ... 7 more
Caused by: java.nio.file.NoSuchFileException: /tmp/blobStore-a83b8ca6-c01a-496a-8997-31687f37b95d/incoming/temp-00049050
        at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)
        at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
        at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
        at sun.nio.fs.UnixCopyFile.move(UnixCopyFile.java:409)
        at sun.nio.fs.UnixFileSystemProvider.move(UnixFileSystemProvider.java:262)
        at java.nio.file.Files.move(Files.java:1395)
        at org.apache.flink.runtime.blob.BlobUtils.moveTempFileToStore(BlobUtils.java:452)
        at org.apache.flink.runtime.blob.BlobServer.getFileInternal(BlobServer.java:521)
        at org.apache.flink.runtime.blob.BlobServerConnection.get(BlobServerConnection.java:231)
        at org.apache.flink.runtime.blob.BlobServerConnection.run(BlobServerConnection.java:117)
