The file header says the key class is NullWritable:

SEQ^F!org.apache.hadoop.io.NullWritable^Yorg.apache.hadoop.io.Text^A^A)org.apache.hadoop.io.compress.SnappyCodec

Might this be a hadoop fs -text rendering problem rather than a problem with the file itself?
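
IIRC hadoop fs -text prints each SequenceFile record as key<TAB>value, and NullWritable.toString() returns the literal string "(null)", so the "(null)" prefix in your output would just be how the dump renders the key, not a string stored in the file.

One way to confirm is to read a finished part file back with the plain SequenceFile.Reader API. A rough, untested sketch (the path argument is whatever part file the sink produced under /data/cjv):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;

public class CheckSeqFile {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Path path = new Path(args[0]); // e.g. a finished part file from the sink
        try (SequenceFile.Reader reader =
                 new SequenceFile.Reader(conf, SequenceFile.Reader.file(path))) {
            // Header metadata, i.e. the same fields shown in the SEQ header above
            System.out.println("key class:   " + reader.getKeyClassName());
            System.out.println("value class: " + reader.getValueClassName());
            System.out.println("codec:       " + reader.getCompressionCodec());

            // NullWritable is a singleton whose deserialization is a no-op,
            // so only the Text values carry any data
            NullWritable key = NullWritable.get();
            Text value = new Text();
            while (reader.next(key, value)) {
                System.out.println(value); // values only, no "(null)" prefix
            }
        }
    }
}

If that prints the bare JSON values, the keys really are absent from the file and only the text dump is adding the "(null)".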

On Sat, 27 Jul 2019 at 11:07, Liu Bo <diabl...@gmail.com> wrote:

> Dear flink users,
>
> We're trying to switch from StringWriter to SequenceFileWriter to turn on
> compression. StringWriter writes the value only, and we want to keep it that way.
>
> AFAIK, you can use NullWritable as the key in Hadoop writers to suppress the
> key so that only the values are written.
>
> So I tried NullWritable, as in the following code:
>
>    BucketingSink<Tuple2<NullWritable, Text>> hdfsSink =
>        new BucketingSink<>("/data/cjv");
>
>    hdfsSink.setBucketer(new DateTimeBucketer<>("yyyy-MM-dd/HH", ZoneOffset.UTC));
>    hdfsSink.setWriter(new SequenceFileWriter<NullWritable, Text>(
>        "org.apache.hadoop.io.compress.SnappyCodec",
>        SequenceFile.CompressionType.BLOCK));
>    hdfsSink.setBatchSize(1024 * 1024 * 250);
>    hdfsSink.setBatchRolloverInterval(20 * 60 * 1000);
>
>
>    joinedResults.map(new MapFunction<Tuple2<String, String>,
>            Tuple2<NullWritable, Text>>() {
>        @Override
>        public Tuple2<NullWritable, Text> map(Tuple2<String, String> value)
>                throws Exception {
>            return Tuple2.of(NullWritable.get(), new Text(value.f1));
>        }
>    }).addSink(hdfsSink).name("hdfs_sink").uid("hdfs_sink");
>
>
> But the output file shows the key as the literal string (null), e.g.:
>
>     (null)    {"ts":1564168038,"os":"android",...}
>
>
> So my question is: how do I suppress the key completely and write only the
> value with SequenceFileWriter?
>
> Your help will be much appreciated.
>
>
> --
> All the best
>
> Liu Bo
>


-- 
All the best

Liu Bo
