The job uses a RollingFileSink to push data to HDFS. To reproduce, run an HA
standalone cluster on k8s, then:

* get the job running
* kill the pod.

The k8s deployment relaunches the pod, but the job then fails with:

java.io.IOException: Missing data in tmp file:
hdfs://nn-crunchy:8020/tmp/kafka-to-hdfs/ls_kraken_events/dt=2019-02-14/evt=ad_fill/.part-2-16.inprogress.449e8668-e886-4f89-b5f6-45ac68e25987


Unknown method truncate called on
org.apache.hadoop.hdfs.protocol.ClientProtocol protocol.


The file does exist. We run Hadoop 2.6, which does not have truncate.
The previous version would see that "truncate" was not supported, drop a
length file for the .inprogress file, and rename it to a valid part file.
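For reference, the kind of "is truncate supported" check described above can be sketched with plain reflection. This is only a minimal illustration, not the code Flink actually uses: the class name TruncateCheck and the helper hasPublicMethod are hypothetical, and it relies only on the fact that org.apache.hadoop.fs.FileSystem gained a truncate method in Hadoop 2.7.

```java
import java.lang.reflect.Method;

// Hypothetical helper, not part of Flink or Hadoop.
final class TruncateCheck {

    // Returns true if the named class can be loaded and exposes a public
    // method with the given name (any signature). Returns false if the
    // class is missing from the classpath or cannot be linked.
    static boolean hasPublicMethod(String className, String methodName) {
        try {
            // initialize=false: we only inspect the class, we do not run its static init.
            Class<?> cls = Class.forName(className, false, TruncateCheck.class.getClassLoader());
            for (Method m : cls.getMethods()) {
                if (m.getName().equals(methodName)) {
                    return true;
                }
            }
            return false;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumption: the Hadoop client jar, if any, is on the classpath.
        boolean clientHasTruncate =
                hasPublicMethod("org.apache.hadoop.fs.FileSystem", "truncate");
        System.out.println("client library exposes truncate: " + clientHasTruncate);
    }
}
```

Note that a check like this only inspects the client library on the classpath. If the client jar is 2.7+ but the NameNode is 2.6, the method exists locally and the call is still rejected on the server side, which would be consistent with the "Unknown method truncate called on ... ClientProtocol" message above.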



Is this a known issue?


Regards.
