The file header says the key class is NullWritable: SEQ^F!org.apache.hadoop.io.NullWritable^Yorg.apache.hadoop.io.Text^A^A)org.apache.hadoop.io.compress.SnappyCodec
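You can confirm what's actually on disk by reading the file back with SequenceFile.Reader instead of hadoop fs -text. A minimal sketch (the part-file path and the bare local Configuration are assumptions for illustration):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.SequenceFile;
    import org.apache.hadoop.io.Text;

    public class InspectSeqFile {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // Placeholder path: point this at one of the bucketed part files.
            Path path = new Path("/data/cjv/2019-07-27/11/part-0-0");
            try (SequenceFile.Reader reader = new SequenceFile.Reader(
                    conf, SequenceFile.Reader.file(path))) {
                // Should print NullWritable / Text, matching the header above.
                System.out.println("key class:   " + reader.getKeyClassName());
                System.out.println("value class: " + reader.getValueClassName());

                NullWritable key = NullWritable.get();
                Text value = new Text();
                while (reader.next(key, value)) {
                    // Only the value carries data; NullWritable serializes to zero bytes.
                    System.out.println(value);
                }
            }
        }
    }

If the key class prints as NullWritable, the file itself contains no key data to strip.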
Might this be a hadoop fs -text display problem rather than bad data? (See the P.S. below.)

On Sat, 27 Jul 2019 at 11:07, Liu Bo <diabl...@gmail.com> wrote:

> Dear Flink users,
>
> We're trying to switch from StringWriter to SequenceFileWriter to turn on
> compression. StringWriter writes the value only, and we want to keep it
> that way.
>
> AFAIK, you can use NullWritable as the key in Hadoop writers so that only
> the values are written.
>
> So I tried NullWritable, as in the following code:
>
> BucketingSink<Tuple2<NullWritable, Text>> hdfsSink =
>         new BucketingSink<>("/data/cjv");
>
> hdfsSink.setBucketer(new DateTimeBucketer<>("yyyy-MM-dd/HH", ZoneOffset.UTC));
> hdfsSink.setWriter(new SequenceFileWriter<NullWritable, Text>(
>         "org.apache.hadoop.io.compress.SnappyCodec",
>         SequenceFile.CompressionType.BLOCK));
> hdfsSink.setBatchSize(1024 * 1024 * 250);
> hdfsSink.setBatchRolloverInterval(20 * 60 * 1000);
>
> joinedResults.map(new MapFunction<Tuple2<String, String>, Tuple2<NullWritable, Text>>() {
>
>     @Override
>     public Tuple2<NullWritable, Text> map(Tuple2<String, String> value)
>             throws Exception {
>         return Tuple2.of(NullWritable.get(), new Text(value.f1));
>     }
> }).addSink(hdfsSink).name("hdfs_sink").uid("hdfs_sink");
>
> But the output file shows the key as the string "(null)", e.g.:
>
> (null) {"ts":1564168038,"os":"android",...}
>
> So my question is: how do I suppress the key completely and write only the
> value with SequenceFileWriter?
>
> Your help will be much appreciated.
>
> --
> All the best
>
> Liu Bo

--
All the best

Liu Bo
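P.S. The "(null)" prefix is almost certainly display output rather than data in the file: NullWritable serializes to zero bytes, but its toString() returns the literal string "(null)", and hadoop fs -text prints key.toString(), a tab, then value.toString() for each SequenceFile record. A tiny sketch of that behaviour:

    import org.apache.hadoop.io.NullWritable;

    public class NullWritableToString {
        public static void main(String[] args) {
            // Prints "(null)" -- the same text hadoop fs -text shows in the
            // key column for a sequence file written with NullWritable keys.
            System.out.println(NullWritable.get());
        }
    }

So the values should be stored without any key bytes; only the dump tool adds the "(null)" column.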