Hi,

I'd like to sink my data into HDFS using SequenceFileAsBinaryOutputFormat with compression, and I found an approach at https://ci.apache.org/projects/flink/flink-docs-stable/dev/batch/hadoop_compatibility.html. The code works, but I'm curious: since it creates a MapReduce Job instance here, does this Flink application create and run a MapReduce job underneath? If so, will it kill performance? I tried to figure it out by looking into the logs, but couldn't get a clue; I hope someone can shed some light here. Thank you.

    // Wrap Hadoop's SequenceFileAsBinaryOutputFormat in Flink's Hadoop compatibility layer.
    Job job = Job.getInstance();
    HadoopOutputFormat<BytesWritable, BytesWritable> hadoopOF =
        new HadoopOutputFormat<BytesWritable, BytesWritable>(
            new SequenceFileAsBinaryOutputFormat(), job);
    // Enable block-level compression for the sequence file output.
    hadoopOF.getConfiguration().set("mapreduce.output.fileoutputformat.compress", "true");
    hadoopOF.getConfiguration().set("mapreduce.output.fileoutputformat.compress.type",
        CompressionType.BLOCK.toString());
    // setOutputPath is a static method inherited from FileOutputFormat; calling it on
    // FileOutputFormat directly is clearer than going through TextOutputFormat.
    FileOutputFormat.setOutputPath(job, new Path("hdfs://..."));
    dataset.output(hadoopOF);
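In case it helps, here's a minimal, self-contained sketch of how I'm wiring this into a batch job. The input data, the output path, and the class name SequenceFileSinkJob are just placeholders for illustration, not my real program:

    import org.apache.flink.api.common.typeinfo.TypeHint;
    import org.apache.flink.api.common.typeinfo.TypeInformation;
    import org.apache.flink.api.java.DataSet;
    import org.apache.flink.api.java.ExecutionEnvironment;
    import org.apache.flink.api.java.hadoop.mapreduce.HadoopOutputFormat;
    import org.apache.flink.api.java.tuple.Tuple2;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.BytesWritable;
    import org.apache.hadoop.io.SequenceFile.CompressionType;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
    import org.apache.hadoop.mapreduce.lib.output.SequenceFileAsBinaryOutputFormat;

    public class SequenceFileSinkJob {
        public static void main(String[] args) throws Exception {
            ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

            // Placeholder input; the real job reads from an actual source.
            DataSet<Tuple2<BytesWritable, BytesWritable>> dataset = env
                .fromElements("a", "b", "c")
                .map(s -> new Tuple2<>(
                    new BytesWritable(s.getBytes()),
                    new BytesWritable(s.getBytes())))
                // Tuple generics are erased at runtime, so give Flink the type explicitly.
                .returns(TypeInformation.of(
                    new TypeHint<Tuple2<BytesWritable, BytesWritable>>() {}));

            // Same wiring as the snippet above.
            Job job = Job.getInstance();
            HadoopOutputFormat<BytesWritable, BytesWritable> hadoopOF =
                new HadoopOutputFormat<>(new SequenceFileAsBinaryOutputFormat(), job);
            hadoopOF.getConfiguration().set(
                "mapreduce.output.fileoutputformat.compress", "true");
            hadoopOF.getConfiguration().set(
                "mapreduce.output.fileoutputformat.compress.type",
                CompressionType.BLOCK.toString());
            // Placeholder output path.
            FileOutputFormat.setOutputPath(job, new Path("hdfs://namenode:8020/tmp/seq-out"));

            dataset.output(hadoopOF);
            env.execute("sequence-file-sink");
        }
    }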
I’d like to sink my data into hdfs using SequenceFileAsBinaryOutputFormat with compression, and I find a way from the link https://ci.apache.org/projects/flink/flink-docs-stable/dev/batch/hadoop_compatibility.html, the code works, but I’m curious to know, since it creates a mapreduce Job instance here, would this Flink application creates and run a mapreduce underneath? If so, will it kill performance? I tried to figure out by looking into log, but couldn’t get a clue, hope people could shed some light here. Thank you. Job job = Job.getInstance(); HadoopOutputFormat<BytesWritable, BytesWritable> hadoopOF = new HadoopOutputFormat<BytesWritable, BytesWritable>( new SequenceFileAsBinaryOutputFormat(), job); hadoopOF.getConfiguration().set("mapreduce.output.fileoutputformat.compress", "true"); hadoopOF.getConfiguration().set("mapreduce.output.fileoutputformat.compress.type", CompressionType.BLOCK.toString()); TextOutputFormat.setOutputPath(job, new Path("hdfs://...")); dataset.output(hadoopOF);