Hi, I have a rather simple Flink job with a KinesisConsumer as the source and an HBase table as the sink, which I write to using writeUsingOutputFormat. I'm running it on a local machine with a single TaskManager (2 slots, 2G). The KinesisConsumer works fine, and the connection to the HBase table is opened without problems (i.e. the open method of the class implementing OutputFormat actually gets called).
I'm running the job at a parallelism of 2, while the sink has a parallelism of 1. Still, looking at the log, I see that after opening the connection the job gets stuck at lines like this one:

INFO org.apache.flink.runtime.blob.BlobCache - Downloading 8638bdf78b0e540786de6c291f710a8db447a2b4 from localhost/127.0.0.1:43268

They follow one another, with gaps of one to several minutes in between, like this:

2017-08-30 14:17:21,318 INFO org.apache.flink.runtime.blob.BlobCache - Created BLOB cache storage directory /tmp/blobStore-8a2a96af-b836-4c95-b79a-a4b80929126f
2017-08-30 14:17:21,321 DEBUG org.apache.flink.runtime.blob.BlobClient - PUT content addressable BLOB stream to /127.0.0.1:59937
2017-08-30 14:17:21,323 DEBUG org.apache.flink.runtime.blob.BlobServerConnection - Received PUT request for content addressable BLOB
2017-08-30 14:17:21,324 INFO org.apache.flink.runtime.blob.BlobCache - Downloading 3ff486dff4c4eaafdab42b30a877326e62bfca82 from localhost/127.0.0.1:43268
2017-08-30 14:17:21,324 DEBUG org.apache.flink.runtime.blob.BlobClient - GET content addressable BLOB 3ff486dff4c4eaafdab42b30a877326e62bfca82 from /127.0.0.1:59938
2017-08-30 14:18:13,708 DEBUG org.apache.flink.runtime.blob.BlobClient - PUT content addressable BLOB stream to /127.0.0.1:59976
2017-08-30 14:18:13,708 DEBUG org.apache.flink.runtime.blob.BlobServerConnection - Received PUT request for content addressable BLOB
2017-08-30 14:18:13,710 INFO org.apache.flink.runtime.blob.BlobCache - Downloading 2f5283326aab77faa047b705cd1d6470035b3b7d from localhost/127.0.0.1:43268
2017-08-30 14:18:13,710 DEBUG org.apache.flink.runtime.blob.BlobClient - GET content addressable BLOB 2f5283326aab77faa047b705cd1d6470035b3b7d from /127.0.0.1:59978
2017-08-30 14:19:29,811 DEBUG org.apache.flink.runtime.blob.BlobClient - PUT content addressable BLOB stream to /127.0.0.1:60022
2017-08-30 14:19:29,812 DEBUG org.apache.flink.runtime.blob.BlobServerConnection - Received PUT request for content addressable BLOB
2017-08-30 14:19:29,814 INFO org.apache.flink.runtime.blob.BlobCache - Downloading f91fd7ecec6f90809f52ee189cb48aa1e30b04f6 from localhost/127.0.0.1:43268
2017-08-30 14:19:29,814 DEBUG org.apache.flink.runtime.blob.BlobClient - GET content addressable BLOB f91fd7ecec6f90809f52ee189cb48aa1e30b04f6 from /127.0.0.1:60024
2017-08-30 14:21:42,856 DEBUG org.apache.flink.runtime.blob.BlobClient - PUT content addressable BLOB stream to /127.0.0.1:60110
2017-08-30 14:21:42,856 DEBUG org.apache.flink.runtime.blob.BlobServerConnection - Received PUT request for content addressable BLOB
2017-08-30 14:21:42,858 INFO org.apache.flink.runtime.blob.BlobCache - Downloading 8638bdf78b0e540786de6c291f710a8db447a2b4 from localhost/127.0.0.1:43268
2017-08-30 14:21:42,859 DEBUG org.apache.flink.runtime.blob.BlobClient - GET content addressable BLOB 8638bdf78b0e540786de6c291f710a8db447a2b4 from /127.0.0.1:60112
2017-08-30 14:26:11,242 DEBUG org.apache.flink.runtime.blob.BlobClient - PUT content addressable BLOB stream to /127.0.0.1:60295
2017-08-30 14:26:11,243 DEBUG org.apache.flink.runtime.blob.BlobServerConnection - Received PUT request for content addressable BLOB
2017-08-30 14:26:11,247 INFO org.apache.flink.runtime.blob.BlobCache - Downloading 6d30c88539d511bb9acc13b53bb2a128614f5621 from localhost/127.0.0.1:43268
2017-08-30 14:26:11,247 DEBUG org.apache.flink.runtime.blob.BlobClient - GET content addressable BLOB 6d30c88539d511bb9acc13b53bb2a128614f5621 from /127.0.0.1:60297
2017-08-30 14:29:20,942 DEBUG org.apache.flink.runtime.blob.BlobClient - PUT content addressable BLOB stream to /127.0.0.1:60410

My questions are: what is the JobManager doing here? Why is it taking ages to do this? How can I speed up this behaviour?

Thank you very much for your attention,
Federico D'Ambrosio
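
P.S. To put numbers on "taking ages", the time between consecutive BlobCache "Downloading" lines can be measured with a short script. What follows is only a minimal Python sketch for reading the log, not part of the Flink job: the names DOWNLOAD_RE, download_gaps and SAMPLE are made up for illustration, and the sample lines are copied from the log above.

```python
import re
from datetime import datetime

# Matches timestamped BlobCache download lines in the log format shown above.
# Lines without a leading timestamp, and DEBUG lines, are ignored.
DOWNLOAD_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+INFO\s+"
    r"org\.apache\.flink\.runtime\.blob\.BlobCache\s+-\s+Downloading\s+([0-9a-f]+)"
)


def download_gaps(lines):
    """Return (blob_key, seconds_since_previous_download) for every
    download line after the first one."""
    gaps = []
    previous = None
    for line in lines:
        match = DOWNLOAD_RE.match(line.strip())
        if match is None:
            continue
        stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S,%f")
        if previous is not None:
            seconds = round((stamp - previous).total_seconds(), 3)
            gaps.append((match.group(2), seconds))
        previous = stamp
    return gaps


# Sample lines copied from the log in this message (hypothetical variable name).
SAMPLE = [
    "2017-08-30 14:17:21,324 INFO org.apache.flink.runtime.blob.BlobCache - Downloading 3ff486dff4c4eaafdab42b30a877326e62bfca82 from localhost/127.0.0.1:43268",
    "2017-08-30 14:18:13,708 DEBUG org.apache.flink.runtime.blob.BlobClient - PUT content addressable BLOB stream to /127.0.0.1:59976",
    "2017-08-30 14:18:13,710 INFO org.apache.flink.runtime.blob.BlobCache - Downloading 2f5283326aab77faa047b705cd1d6470035b3b7d from localhost/127.0.0.1:43268",
    "2017-08-30 14:19:29,814 INFO org.apache.flink.runtime.blob.BlobCache - Downloading f91fd7ecec6f90809f52ee189cb48aa1e30b04f6 from localhost/127.0.0.1:43268",
]

if __name__ == "__main__":
    for key, seconds in download_gaps(SAMPLE):
        print(f"{key}: {seconds} s after the previous download")
```

Passing the lines of the full TaskManager log file instead of SAMPLE gives the gap before each download, which should make it easier to see whether the delays keep growing.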