Hi Manjusha,

If you are running on, for example, one of Amazon's Linux AMIs on EMR, you may have fallen into a trap that Lasse described in his Flink Forward talk [1]: these images include a default cron job that deletes files in /tmp that have not been accessed recently. By default, the BLOB server stores its files under /tmp (this is controlled by blob.storage.directory), and on the JobManager these files are only accessed during deployments, so they are picked up by this cleanup.

The solution is to set blob.storage.directory to a location outside of /tmp.
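As a minimal sketch of that change, assuming the setting goes into flink-conf.yaml on the JobManager and TaskManagers: the path below is purely illustrative (it does not come from this thread), and any directory that exists, is writable by the Flink user, and is not covered by the /tmp cleanup job should work.

```yaml
# flink-conf.yaml
# Move the BLOB storage out of /tmp so the periodic cleanup job does not delete it.
# NOTE: /var/lib/flink/blob-storage is a hypothetical example path, not a Flink default.
blob.storage.directory: /var/lib/flink/blob-storage
```

The processes would need a restart for the new directory to take effect.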

Nico

[1] https://data-artisans.com/flink-forward-berlin/resources/our-successful-journey-with-flink

On 23/10/2018 10:27, Manjusha Vuyyuru wrote:
> Hello,
>
> Checkpointing to hdfs.
> state.backend.fs.checkpointdir: hdfs://flink-hdfs:9000/flink-checkpoints
> state.checkpoints.num-retained: 2
>
> Thanks,
> Manjusha
>
> On Tue, Oct 23, 2018 at 1:05 PM Dawid Wysakowicz <dwysakow...@apache.org> wrote:
>
> Hi Manjusha,
>
> I am not sure what is wrong, but Nico or Till (cc'ed) might be able
> to help you.
>
> Best,
>
> Dawid
>
> On 23/10/2018 06:58, Manjusha Vuyyuru wrote:
>> Hello All,
>>
>> I have a job which fails lets say after every 14 days with IO
>> Exception, failed to fetch blob.
>> I submitted the job using command line using java jar. Below is the
>> exception I'm getting:
>>
>> java.io.IOException: Failed to fetch BLOB d23d168655dd51efe4764f9b22b85a18/p-446f7e0137fd66af062de7a56c55528171d380db-baf0b6bce698d586a3b0d30c6e487d16 from flink-job-mamager/10.20.1.85:38147 and store it under /tmp/blobStore-e3e34fec-22d9-4b3c-b542-0c1e5cdcf896/incoming/temp-00000022
>>     at org.apache.flink.runtime.blob.BlobClient.downloadFromBlobServer(BlobClient.java:191)
>>     at org.apache.flink.runtime.blob.AbstractBlobCache.getFileInternal(AbstractBlobCache.java:177)
>>     at org.apache.flink.runtime.blob.PermanentBlobCache.getFile(PermanentBlobCache.java:205)
>>     at org.apache.flink.runtime.execution.librarycache.BlobLibraryCacheManager.registerTask(BlobLibraryCacheManager.java:119)
>>     at org.apache.flink.runtime.taskmanager.Task.createUserCodeClassloader(Task.java:878)
>>     at org.apache.flink.runtime.taskmanager.Task.run(Task.java:589)
>>     at java.lang.Thread.run(Thread.java:748)
>> Caused by: java.io.IOException: GET operation failed: Server side error: /tmp/blobStore-5535a94c-5bdd-41f3-878d-8320e53ba7c5/incoming/temp-00182356
>>     at org.apache.flink.runtime.blob.BlobClient.getInternal(BlobClient.java:253)
>>     at org.apache.flink.runtime.blob.BlobClient.downloadFromBlobServer(BlobClient.java:166)
>>     ... 6 more
>> Caused by: java.io.IOException: Server side error: /tmp/blobStore-5535a94c-5bdd-41f3-878d-8320e53ba7c5/incoming/temp-00182356
>>     at org.apache.flink.runtime.blob.BlobClient.receiveAndCheckGetResponse(BlobClient.java:306)
>>     at org.apache.flink.runtime.blob.BlobClient.getInternal(BlobClient.java:247)
>>     ... 7 more
>> Caused by: java.nio.file.NoSuchFileException: /tmp/blobStore-5535a94c-5bdd-41f3-878d-8320e53ba7c5/incoming/temp-00182356
>>     at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)
>>     at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
>>     at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
>>     at sun.nio.fs.UnixCopyFile.move(UnixCopyFile.java:409)
>>     at sun.nio.fs.UnixFileSystemProvider.move(UnixFileSystemProvider.java:262)
>>     at java.nio.file.Files.move(Files.java:1395)
>>     at org.apache.flink.runtime.blob.BlobUtils.moveTempFileToStore(BlobUtils.java:452)
>>     at org.apache.flink.runtime.blob.BlobServer.getFileInternal(BlobServer.java:521)
>>     at org.apache.flink.runtime.blob.BlobServerConnection.get(BlobServerConnection.java:231)
>>     at org.apache.flink.runtime.blob.BlobServerConnection.run(BlobServerConnection.java:117)
>>
>> All the configurations of blob are default, i didn't change anything.
>> Can someone help me to fix this issue.
>>
>> Thanks,
>> Manjusha

--
Nico Kruber | Software Engineer
data Artisans