That log does not appear. It looks like we have a chicken-and-egg issue.

2019-02-15 16:49:15,045 DEBUG org.apache.hadoop.hdfs.DFSClient
                - Connecting to datanode

2019-02-15 16:49:15,045 DEBUG
org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient  -
SASL client skipping handshake in unsecured configuration for

addr = /, datanodeId = DatanodeInfoWithStorage[,DS-c57a7667-f697-4f03-9fb1-532c5b82a9e8,DISK]

2019-02-15 16:49:15,072 DEBUG
org.apache.flink.runtime.fs.hdfs.HadoopFsFactory              -
Instantiating for file system scheme hdfs Hadoop File System

2019-02-15 16:49:15,072 DEBUG org.apache.hadoop.hdfs.BlockReaderLocal
                - dfs.client.use.legacy.blockreader.local = false

2019-02-15 16:49:15,072 DEBUG org.apache.hadoop.hdfs.BlockReaderLocal
                - = false

2019-02-15 16:49:15,072 DEBUG org.apache.hadoop.hdfs.BlockReaderLocal
                - = false

2019-02-15 16:49:15,072 DEBUG org.apache.hadoop.hdfs.BlockReaderLocal
                - dfs.domain.socket.path =

2019-02-15 16:49:15,076 DEBUG
                - multipleLinearRandomRetry = null

2019-02-15 16:49:15,076 DEBUG org.apache.hadoop.ipc.Client
                - getting client out of cache:

2019-02-15 16:49:15,076 DEBUG
org.apache.hadoop.hdfs.protocol.datatransfer.sasl.DataTransferSaslUtil  -
DataTransferProtocol not using SaslPropertiesResolver, no QOP found in
configuration for

2019-02-15 16:49:15,080 INFO
org.apache.flink.streaming.api.functions.sink.filesystem.Buckets  - Subtask
3 initializing its state (max part counter=58).

2019-02-15 16:49:15,081 DEBUG
org.apache.flink.streaming.api.functions.sink.filesystem.Buckets  - Subtask
3 restoring: BucketState for
bucketId=ls_kraken_events/dt=2019-02-14/evt=ad_fill and
has open part file created @ 1550247946437

2019-02-15 16:49:15,085 DEBUG org.apache.hadoop.ipc.Client
                - IPC Client (1270836494) connection to from root sending #56

2019-02-15 16:49:15,188 DEBUG org.apache.hadoop.ipc.Client
                - IPC Client (1270836494) connection to from root got value #56

2019-02-15 16:49:15,196 INFO  org.apache.flink.runtime.taskmanager.Task
                - Source: Custom Source -> (Sink: Unnamed, Process ->
Timestamps/Watermarks) (4/4) (f73403ac4763c99e6a244cba3797f7e9) switched
from RUNNING to FAILED. Missing data in tmp file:


I do see

2019-02-15 16:47:33,582 INFO
      -  Current Hadoop/Kerberos user: root

2019-02-15 16:47:33,582 INFO
      -  JVM: OpenJDK 64-Bit Server VM - Oracle Corporation - 1.8/25.181-b13

2019-02-15 16:47:33,582 INFO
      -  Maximum heap size: 1204 MiBytes

2019-02-15 16:47:33,582 INFO
      -  JAVA_HOME: /docker-java-home

2019-02-15 16:47:33,585 INFO
      -  Hadoop version: 2.7.5

which is to be expected, given that we are running the Hadoop 2.7 build of Flink 1.7.1.

Does it make sense to go with a Hadoop-free version and inject the required
jar files? Has anyone done that?
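
To illustrate what I suspect is happening: the reply quoted below says the
sink looks up 'truncate' by reflection. The sketch below is not Flink's actual
code, only my assumption of what such a probe looks like. TruncateProbe and its
method names are made up for illustration. The point is that the probe only
inspects the client jars (2.7.5 here, where FileSystem.truncate(Path, long)
exists), so it can succeed even though the 2.6 NameNode then rejects the RPC
with "Unknown method truncate".

```java
import java.lang.reflect.Method;

// Hypothetical helper, not part of Flink or Hadoop.
final class TruncateProbe {

    /** Returns true if the class exposes a public method with this name and signature. */
    public static boolean hasPublicMethod(Class<?> type, String name, Class<?>... parameterTypes) {
        try {
            Method m = type.getMethod(name, parameterTypes);
            return m != null;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    /**
     * Checks only what the client classpath offers. It says nothing about
     * whether the remote NameNode implements the truncate RPC.
     */
    public static boolean clientHasTruncate() {
        try {
            Class<?> fs = Class.forName("org.apache.hadoop.fs.FileSystem");
            Class<?> path = Class.forName("org.apache.hadoop.fs.Path");
            return hasPublicMethod(fs, "truncate", path, long.class);
        } catch (ClassNotFoundException | LinkageError e) {
            // Hadoop is not on the classpath (or cannot be loaded).
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("client-side truncate present: " + clientHasTruncate());
    }
}
```

If that is right, a client built against 2.7 will always take the truncate
path, and the '.valid-length' fallback is never reached against a 2.6 cluster.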

On Fri, Feb 15, 2019 at 2:33 AM Yun Tang <> wrote:

> Hi
> When 'RollingSink' tries to initialize its state, it first checks whether the
> current file system supports the truncate method. If the file system does not
> support it, the sink falls back to a workaround, which means you should not
> hit this problem. Otherwise, 'RollingSink' found the 'truncate' method via
> reflection although the file system does not actually support it. You could
> enable DEBUG-level logging to see whether the message below is printed:
> Truncate not found. Will write a file with suffix '.valid-length' and
> prefix '_' to specify how many bytes in a bucket are valid.
> However, judging from your second email, the more serious problem is using
> 'Buckets' with Hadoop 2.6. As far as I know, the `RecoverableWriter` within
> 'Buckets' only supports Hadoop 2.7+; I'm not sure whether a workaround
> exists.
> Best
> Yun Tang
> ------------------------------
> *From:* Vishal Santoshi <>
> *Sent:* Friday, February 15, 2019 3:43
> *To:* user
> *Subject:* Re: StandAlone job on k8s fails with "Unknown method truncate"
> on restore
> And yes, RollingFileSink cannot work with the Hadoop 2.6 release of Flink
> 1.7.1 because of this:
> java.lang.UnsupportedOperationException: Recoverable writers on Hadoop are 
> only supported for HDFS and for Hadoop version 2.7 or newer
>       at 
> org.apache.flink.runtime.fs.hdfs.HadoopRecoverableWriter.<init>(
>       at 
> org.apache.flink.runtime.fs.hdfs.HadoopFileSystem.createRecoverableWriter(
>       at 
> org.apache.flink.core.fs.SafetyNetWrapperFileSystem.createRecoverableWriter(
>       at 
> org.apache.flink.streaming.api.functions.sink.filesystem.Buckets.<init>(
> Any workaround?
> On Thu, Feb 14, 2019 at 1:42 PM Vishal Santoshi <>
> wrote:
> The job uses a RollingFileSink to push data to HDFS. We run an HA standalone
> cluster on k8s:
> * get the job running
> * kill the pod.
> The k8s deployment relaunches the pod but fails with
> Missing data in tmp file:
> hdfs://nn-crunchy:8020/tmp/kafka-to-hdfs/ls_kraken_events/dt=2019-02-14/evt=ad_fill/.part-2-16.inprogress.449e8668-e886-4f89-b5f6-45ac68e25987
> Unknown method truncate called on
> org.apache.hadoop.hdfs.protocol.ClientProtocol protocol.
> The file does exist. We work with Hadoop 2.6, which does not have
> truncate. The previous version would see that "truncate" was not supported,
> drop a length file for the .inprogress file, and rename it to a valid
> part file.
> Is this a known issue?
> Regards.
