skyskyhu opened a new pull request, #6798:
URL: https://github.com/apache/hadoop/pull/6798
HDFS-17510 Change of Codec configuration does not work
### Description of PR
In one of my projects, I need to dynamically adjust the compression level for
different files.
However, I found that in most cases the new compression level does not take
effect: the old compression level continues to be used.
Here is the relevant code snippet:
ZStandardCodec zStandardCodec = new ZStandardCodec();
zStandardCodec.setConf(conf);
conf.set("io.compression.codec.zstd.level", "5"); // level may change dynamically
conf.set("io.compression.codec.zstd", zStandardCodec.getClass().getName());
writer = SequenceFile.createWriter(conf,
SequenceFile.Writer.file(sequenceFilePath),
SequenceFile.Writer.keyClass(LongWritable.class),
SequenceFile.Writer.valueClass(BytesWritable.class),
SequenceFile.Writer.compression(CompressionType.BLOCK));
The reason is that the SequenceFile.Writer.init() method calls
CodecPool.getCompressor(codec, null) to get a compressor.
If the compressor is a reused instance from the pool, the new configuration
is not applied, because conf is passed as null:
public static Compressor getCompressor(CompressionCodec codec, Configuration conf) {
  Compressor compressor = borrow(compressorPool, codec.getCompressorType());
  if (compressor == null) {
    compressor = codec.createCompressor();
    LOG.info("Got brand-new compressor [" + codec.getDefaultExtension() + "]");
  } else {
    compressor.reinit(conf); // conf is null here
    ......
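To make the failure mode easier to see without a Hadoop cluster, here is a minimal, self-contained model of the pooling behaviour described above. It is only a sketch: `ToyCompressor`, `ToyCodec`, `ToyPool` and the `"level"` key are hypothetical stand-ins, not Hadoop classes, and the toy `reinit` is assumed to leave the old settings untouched when it receives a null configuration.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a compressor that keeps its level until reinit() sees a config. */
class ToyCompressor {
    private int level;

    ToyCompressor(int level) {
        this.level = level;
    }

    /** Assumption for this sketch: a null config leaves the previous level in place. */
    void reinit(Map<String, String> conf) {
        if (conf == null) {
            return;
        }
        level = Integer.parseInt(conf.getOrDefault("level", "3"));
    }

    int level() {
        return level;
    }
}

/** Hypothetical codec that reads the level from its own config when creating a compressor. */
class ToyCodec {
    private final Map<String, String> conf;

    ToyCodec(Map<String, String> conf) {
        this.conf = conf;
    }

    ToyCompressor createCompressor() {
        return new ToyCompressor(Integer.parseInt(conf.getOrDefault("level", "3")));
    }
}

/** Hypothetical pool: brand-new instances come from the codec, reused ones are only reinit()-ed. */
class ToyPool {
    private final Deque<ToyCompressor> idle = new ArrayDeque<>();

    ToyCompressor get(ToyCodec codec, Map<String, String> conf) {
        ToyCompressor compressor = idle.poll();
        if (compressor == null) {
            return codec.createCompressor();
        }
        compressor.reinit(conf); // conf is null for the caller modelled below
        return compressor;
    }

    void giveBack(ToyCompressor compressor) {
        idle.push(compressor);
    }
}

class StaleLevelDemo {
    /** Returns the levels seen by two consecutive "writers" that both pass a null conf. */
    static int[] run() {
        Map<String, String> conf = new HashMap<>();
        ToyPool pool = new ToyPool();

        conf.put("level", "3");
        ToyCompressor first = pool.get(new ToyCodec(conf), null);
        int firstLevel = first.level();
        pool.giveBack(first);

        conf.put("level", "5"); // the level changes between files
        ToyCompressor second = pool.get(new ToyCodec(conf), null);
        return new int[] {firstLevel, second.level()};
    }

    public static void main(String[] args) {
        int[] levels = run();
        System.out.println(levels[0] + " then " + levels[1]); // prints "3 then 3"
    }
}
```

The second writer asks for level 5 but receives the pooled instance still set to level 3, which matches the symptom described in this PR.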
Please also refer to my unit test to reproduce the bug.
To address this bug, I modified the code to ensure that the configuration is
read back from the codec when a compressor is reused.
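The idea behind the change can be sketched in the same simplified style. This is not the patch itself: `SketchCompressor`, `SketchCodec`, `SketchPool` and the `"level"` key are hypothetical names, and the sketch simply assumes that when the caller passes a null configuration, the pool falls back to the configuration held by the codec before reinitializing a reused compressor.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical compressor whose level is refreshed by reinit() when a config is supplied. */
class SketchCompressor {
    private int level;

    SketchCompressor(int level) {
        this.level = level;
    }

    void reinit(Map<String, String> conf) {
        if (conf == null) {
            return; // nothing to apply
        }
        level = Integer.parseInt(conf.getOrDefault("level", "3"));
    }

    int level() {
        return level;
    }
}

/** Hypothetical codec that exposes the configuration it was set up with. */
class SketchCodec {
    private final Map<String, String> conf;

    SketchCodec(Map<String, String> conf) {
        this.conf = conf;
    }

    Map<String, String> getConf() {
        return conf;
    }

    SketchCompressor createCompressor() {
        return new SketchCompressor(Integer.parseInt(conf.getOrDefault("level", "3")));
    }
}

/** Hypothetical pool that reads the config back from the codec when the caller passes null. */
class SketchPool {
    private final Deque<SketchCompressor> idle = new ArrayDeque<>();

    SketchCompressor get(SketchCodec codec, Map<String, String> conf) {
        SketchCompressor compressor = idle.poll();
        if (compressor == null) {
            return codec.createCompressor();
        }
        // Fallback assumed for this sketch: prefer the caller's conf, else the codec's own.
        Map<String, String> effective = (conf != null) ? conf : codec.getConf();
        compressor.reinit(effective);
        return compressor;
    }

    void giveBack(SketchCompressor compressor) {
        idle.push(compressor);
    }
}

class ReusedLevelDemo {
    /** Returns the levels seen by two consecutive "writers" that both pass a null conf. */
    static int[] run() {
        Map<String, String> conf = new HashMap<>();
        SketchPool pool = new SketchPool();

        conf.put("level", "3");
        SketchCompressor first = pool.get(new SketchCodec(conf), null);
        int firstLevel = first.level();
        pool.giveBack(first);

        conf.put("level", "5"); // the level changes between files
        SketchCompressor second = pool.get(new SketchCodec(conf), null);
        return new int[] {firstLevel, second.level()};
    }

    public static void main(String[] args) {
        int[] levels = run();
        System.out.println(levels[0] + " then " + levels[1]); // prints "3 then 5"
    }
}
```

With the fallback in place, the reused instance picks up the new level even though the caller still passes null.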
### How was this patch tested?
unit test
### For code changes:
- [Y] Does the title of this PR start with the corresponding JIRA issue id
(e.g. 'HADOOP-17799. Your PR title ...')?
- [ ] Object storage: have the integration tests been executed and the
endpoint declared according to the connector-specific documentation?
- [ ] If adding new dependencies to the code, are these dependencies
licensed in a way that is compatible for inclusion under [ASF
2.0](http://www.apache.org/legal/resolved.html#category-a)?
- [ ] If applicable, have you updated the `LICENSE`, `LICENSE-binary`,
`NOTICE-binary` files?