Hi,

I'm having trouble running the Flink TaskManager in a Docker container (OpenShift). The container's internal storage is filling up, probably because the deleted jar files in the blob store are still in use, so their disk space is not freed.

We are using Apache Beam to start an Apache Flink process, so the jars are sent to Apache Flink every time we fire a batch.

I enabled debug logging, but I can't find anything showing these deletes. Does anyone have an idea why the resources are not freed? I checked the blob store, and the files are indeed the jars.

208875129    0 lr-x------   1 1000150000 root           64 Apr 18 12:58 /proc/1/fd/142 -> /var/tmp/flink/blobStore-580cc38d-44e4-45a1-8922-e21c00d73dec/job_90964be94a2f4471844a00284e44fb32/blob_p-5202910b36af8c12548df97a7e4a057b77786217-ffa3f85003b1f124cd1cccdb0f72a8e0\ (deleted)
208875130    0 lr-x------   1 1000150000 root           64 Apr 18 12:58 /proc/1/fd/143 -> /var/tmp/flink/blobStore-580cc38d-44e4-45a1-8922-e21c00d73dec/job_b7c00268b488411a8f6e1af984bcdcc2/blob_p-5202910b36af8c12548df97a7e4a057b77786217-8bab07adb34d4ce8de20846ec72059ce\ (deleted)
208875131    0 lr-x------   1 1000150000 root           64 Apr 18 12:58 /proc/1/fd/144 -> /var/tmp/flink/blobStore-580cc38d-44e4-45a1-8922-e21c00d73dec/job_46183ac02f1dcd3543f8e481f59948b5/blob_p-5202910b36af8c12548df97a7e4a057b77786217-ac6bc86d8932e7d631416d9bafab4ab4\ (deleted)
208875132    0 lr-x------   1 1000150000 root           64 Apr 18 12:58 /proc/1/fd/145 -> /var/tmp/flink/blobStore-580cc38d-44e4-45a1-8922-e21c00d73dec/job_717bf3f4b3f80700c1cc44d6076c2aca/blob_p-5202910b36af8c12548df97a7e4a057b77786217-780dd2383dee11a2361ac20a5da7bbb8\ (deleted)
208875133    0 lr-x------   1 1000150000 root           64 Apr 18 12:58 /proc/1/fd/146 -> /var/tmp/flink/blobStore-580cc38d-44e4-45a1-8922-e21c00d73dec/job_22e67caac65c9c4e537caa3b072b8cc3/blob_p-5202910b36af8c12548df97a7e4a057b77786217-e0b523663672c641b368e5d1440b0b70\ (deleted)
208875134    0 lr-x------   1 1000150000 root           64 Apr 18 12:58 /proc/1/fd/147 -> /var/tmp/flink/blobStore-580cc38d-44e4-45a1-8922-e21c00d73dec/job_3afe5b02ccb95b3494a1acd8677c66f0/blob_p-5202910b36af8c12548df97a7e4a057b77786217-9a8cd48c09a4b518adf0309a0255b339\ (deleted)
208875135    0 lr-x------   1 1000150000 root           64 Apr 18 12:58 /proc/1/fd/148 -> /var/tmp/flink/blobStore-580cc38d-44e4-45a1-8922-e21c00d73dec/job_cb024c561531905e81c9768ec62a2fe0/blob_p-5202910b36af8c12548df97a7e4a057b77786217-0addc83aaf9a2f781528ad035fd79cc8\ (deleted)
208875136    0 lr-x------   1 1000150000 root           64 Apr 18 12:58 /proc/1/fd/149 -> /var/tmp/flink/blobStore-580cc38d-44e4-45a1-8922-e21c00d73dec/job_d3dc0b0608d71ffa77575771f088e80e/blob_p-5202910b36af8c12548df97a7e4a057b77786217-c9015b012ec4b249f32872471a31a500\ (deleted)
208875137    0 lr-x------   1 1000150000 root           64 Apr 18 12:58 /proc/1/fd/150 -> /var/tmp/flink/blobStore-580cc38d-44e4-45a1-8922-e21c00d73dec/job_1b4cdb127bb2c345e1b099e3e446bf58/blob_p-5202910b36af8c12548df97a7e4a057b77786217-ac4457b393b7ff0565c47c1e38786005\ (deleted)
208875138    0 lr-x------   1 1000150000 root           64 Apr 18 12:58 /proc/1/fd/151 -> /var/tmp/flink/blobStore-580cc38d-44e4-45a1-8922-e21c00d73dec/job_8c23503c614a88e8c8f7a54a31e41886/blob_p-5202910b36af8c12548df97a7e4a057b77786217-d096b3ef150bf7e8e98224e0b8f17292\ (deleted)
208875139    0 lr-x------   1 1000150000 root           64 Apr 18 12:58 /proc/1/fd/152 -> /var/tmp/flink/blobStore-580cc38d-44e4-45a1-8922-e21c00d73dec/job_e7c8132da483bd14e5abfe9390adeeb1/blob_p-5202910b36af8c12548df97a7e4a057b77786217-f370d8dcad0cb36581f9a5f1568e1487\ (deleted)
208875140    0 lr-x------   1 1000150000 root           64 Apr 18 12:58 /proc/1/fd/153 -> /var/tmp/flink/blobStore-580cc38d-44e4-45a1-8922-e21c00d73dec/job_cbee9f15b0c6adba0f5ddb67b587b607/blob_p-5202910b36af8c12548df97a7e4a057b77786217-9ae77c3419d77adab8f44258ca4290c5\ (deleted)
208875141    0 lr-x------   1 1000150000 root           64 Apr 18 12:58 /proc/1/fd/154 -> /var/tmp/flink/blobStore-580cc38d-44e4-45a1-8922-e21c00d73dec/job_29c5a145ae231be4c0d53717625c3938/blob_p-5202910b36af8c12548df97a7e4a057b77786217-76bb4d83f962a887d41effb2646bd63d\ (deleted)
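
The same kind of check can probably be done from inside the JVM. Below is a minimal sketch, assuming a Linux container where /proc is mounted. The class name DeletedFdScan, the method deletedOpenFiles, and the temp file prefix are made up for illustration and are not part of Flink. It reads the symlinks under /proc/self/fd and reports every target that ends in " (deleted)", which is how the kernel marks an unlinked file that is still held open.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Flink.
public class DeletedFdScan {

    // Returns "fd -> target" for every descriptor of this process whose
    // target was unlinked while the descriptor is still open.
    public static List<String> deletedOpenFiles() throws IOException {
        List<String> result = new ArrayList<>();
        Path fdDir = Paths.get("/proc/self/fd");
        if (!Files.isDirectory(fdDir)) {
            return result; // not Linux, or /proc is not mounted
        }
        try (DirectoryStream<Path> fds = Files.newDirectoryStream(fdDir)) {
            for (Path fd : fds) {
                try {
                    String target = Files.readSymbolicLink(fd).toString();
                    if (target.endsWith(" (deleted)")) {
                        result.add(fd + " -> " + target);
                    }
                } catch (IOException e) {
                    // The descriptor was closed between listing and readlink; skip it.
                }
            }
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        // Reproduce the situation: delete a file while a stream on it is open.
        Path tmp = Files.createTempFile("blob_sketch", ".jar");
        try (InputStream in = Files.newInputStream(tmp)) {
            Files.delete(tmp);
            deletedOpenFiles().forEach(System.out::println);
        }
    }
}
```

On Linux the delete itself succeeds in this situation; the disk space is only released once the last open descriptor on the file is closed.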



There are several places in the code where the boolean returned by the file delete is not checked, so we have no clue whether the file was deleted successfully. Maybe it can be changed to something like java.nio.file.Files.delete, which throws an IOException when something goes wrong. This is not a solution, but it would make it more transparent when things go wrong.
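
To illustrate the difference, here is a minimal sketch. The class BlobDeleteSketch and its two methods are hypothetical names, not the actual Flink code. It contrasts java.io.File.delete, which only returns a boolean, with java.nio.file.Files.delete, which throws an exception that names the file and the reason.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;

// Hypothetical illustration, not the actual Flink blob store code.
public class BlobDeleteSketch {

    // Legacy style: the result is easy to ignore, and even when it is
    // checked, false does not say why the delete failed.
    public static boolean deleteLegacy(File file) {
        return file.delete();
    }

    // NIO style: a failed delete surfaces as an IOException subclass
    // (for example NoSuchFileException) instead of a silent false.
    public static void deleteStrict(Path path) throws IOException {
        Files.delete(path);
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("blob_sketch", ".jar");
        deleteStrict(tmp); // first delete succeeds

        // Second delete: the legacy call just reports false ...
        System.out.println("legacy delete returned " + deleteLegacy(tmp.toFile()));

        // ... while the NIO call explains what went wrong.
        try {
            deleteStrict(tmp);
        } catch (NoSuchFileException e) {
            System.out.println("NoSuchFileException for " + e.getFile());
        }
    }
}
```

Note that on Linux a delete of a file that is still open normally succeeds, so this change alone would likely not flag the case above; it would only surface deletes that actually fail.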

Thanks,
Jeroen
