Hi all,
I have a deduplication job running on Flink 1.7 that keeps some state and
uses the FsStateBackend. My TM heap size is 16 GB. I see the error below
while trying to checkpoint a state of about 2 GB. There is enough space
available in S3; I tried uploading larger files and they were all
successful. There is also enough disk space on the local file system, and
the disk utility tool does not show anything suspicious. Whenever I try to
start my job from the last successful checkpoint, it runs into the same
error. Can someone tell me what the cause of this issue is? Many thanks.


Note: this error goes away when I delete the contents of io.tmp.dirs and
restart the job from the last checkpoint, but the disk utility tool does not
show much usage before the deletion, so I am not able to figure out what the
problem is.
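For context, this is roughly how the relevant configuration looks; the paths and bucket layout below are illustrative placeholders, not my exact setup:

```yaml
# flink-conf.yaml (illustrative values)

# Directories for the TaskManager's local temporary files. My working
# assumption is that the S3 output stream stages checkpoint data here
# before uploading, which would explain why clearing these directories
# makes the error go away. Defaults to the system tmp dir.
io.tmp.dirs: /mnt/data/flink-tmp

# The FsStateBackend keeps working state on the TM heap and writes
# checkpoints to the configured file system (S3 via the Presto plugin).
state.backend: filesystem
state.checkpoints.dir: s3p://featuretoolkit.checkpoints/dev_dedup
```

Pointing io.tmp.dirs at a larger volume is the only workaround I have found so far, but I would like to understand what is actually filling it up.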

2020-08-19 21:12:01,909 WARN org.apache.flink.runtime.state.filesystem.FsCheckpointStreamFactory - Could not close the state stream for s3p://featuretoolkit.checkpoints/dev_dedup/9b64aafadcd6d367cfedef84706abcba/chk-189/f8e668dd-8019-4830-ab12-d48940ff5353.
java.io.IOException: No space left on device
	at java.io.FileOutputStream.writeBytes(Native Method)
	at java.io.FileOutputStream.write(FileOutputStream.java:326)
	at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
	at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
	at java.io.FilterOutputStream.flush(FilterOutputStream.java:140)
	at java.io.FilterOutputStream.close(FilterOutputStream.java:158)
	at org.apache.flink.fs.s3presto.shaded.com.facebook.presto.hive.PrestoS3FileSystem$PrestoS3OutputStream.close(PrestoS3FileSystem.java:986)
	at org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:72)
	at org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:101)
	at org.apache.flink.fs.s3.common.hadoop.HadoopDataOutputStream.close(HadoopDataOutputStream.java:52)
	at org.apache.flink.core.fs.ClosingFSDataOutputStream.close(ClosingFSDataOutputStream.java:64)
	at org.apache.flink.runtime.state.filesystem.FsCheckpointStreamFactory$FsCheckpointStateOutputStream.close(FsCheckpointStreamFactory.java:269)
	at org.apache.flink.runtime.state.CheckpointStreamWithResultProvider.close(CheckpointStreamWithResultProvider.java:58)
	at org.apache.flink.util.IOUtils.closeQuietly(IOUtils.java:263)
	at org.apache.flink.util.IOUtils.closeAllQuietly(IOUtils.java:250)
	at org.apache.flink.util.AbstractCloseableRegistry.close(AbstractCloseableRegistry.java:122)
	at org.apache.flink.runtime.state.AsyncSnapshotCallable.closeSnapshotIO(AsyncSnapshotCallable.java:185)
	at org.apache.flink.runtime.state.AsyncSnapshotCallable.call(AsyncSnapshotCallable.java:84)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at org.apache.flink.util.FutureUtil.runIfNotDoneAndGet(FutureUtil.java:50)
	at org.apache.flink.streaming.api.operators.OperatorSnapshotFinalizer.<init>(OperatorSnapshotFinalizer.java:47)
	at org.apache.flink.streaming.runtime.tasks.StreamTask$AsyncCheckpointRunnable.run(StreamTask.java:853)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
	Suppressed: java.io.IOException: No space left on device
		at java.io.FileOutputStream.writeBytes(Native Method)
		at java.io.FileOutputStream.write(FileOutputStream.java:326)
		at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
		at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
		at java.io.FilterOutputStream.close(FilterOutputStream.java:158)
		at java.io.FilterOutputStream.close(FilterOutputStream.java:159)
		... 21 more


Thanks,
Vishwas
