Hi,

I'm trying to copy data from Kafka to HDFS. The data in HDFS is then used by other people for further computations in map/reduce jobs. When some tasks fail and the job is restored, a ".valid-length" file is created on older Hadoop versions that do not support truncate. The problem is that everyone who consumes the data must know how to handle the ".valid-length" file; otherwise the data may not be exactly-once.

So why not rewrite the file to its valid length when restoring, instead of writing a ".valid-length" file? That way, people who use the data in HDFS would not need to know how to deal with the ".valid-length" file at all.

Thanks!
Zhang Xinyu
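P.S. For illustration, below is a minimal sketch (my own code, not from Flink) of what a downstream reader currently has to do. It assumes the companion file sits next to the data file with a ".valid-length" suffix and stores the valid byte count as a plain decimal string; the actual prefix/suffix depend on the sink configuration, so treat the naming here as an assumption.

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

/**
 * Hypothetical sketch: read only the valid prefix of a bucket file,
 * assuming the companion ".valid-length" file (if present) contains
 * the valid byte count as a decimal string.
 */
public class ValidLengthReader {

    public static long readValidLength(FileSystem fs, Path dataFile) throws Exception {
        // Assumed naming: "<file>.valid-length" next to the data file;
        // adjust to your sink's valid-length prefix/suffix settings.
        Path lengthFile = new Path(dataFile.getParent(), dataFile.getName() + ".valid-length");
        if (!fs.exists(lengthFile)) {
            // No truncation marker: the whole file is valid.
            return fs.getFileStatus(dataFile).getLen();
        }
        try (FSDataInputStream in = fs.open(lengthFile);
             BufferedReader reader = new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
            return Long.parseLong(reader.readLine().trim());
        }
    }

    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        Path dataFile = new Path(args[0]);
        long validLength = readValidLength(fs, dataFile);

        // Read at most validLength bytes; everything beyond that point
        // may be partial output from a failed task and must be ignored.
        try (FSDataInputStream in = fs.open(dataFile)) {
            byte[] buffer = new byte[8192];
            long remaining = validLength;
            while (remaining > 0) {
                int toRead = (int) Math.min(buffer.length, remaining);
                int read = in.read(buffer, 0, toRead);
                if (read < 0) {
                    break;
                }
                remaining -= read;
                // process buffer[0..read) ...
            }
        }
    }
}

Every consumer of the data has to carry this kind of logic (or an equivalent input format) just to stay exactly-once, which is the burden I'd like to avoid.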