RCFile issues
-------------

        Key: HIVE-2065
        URL: https://issues.apache.org/jira/browse/HIVE-2065
    Project: Hive
 Issue Type: Bug
   Reporter: Krishna Kumar
   Priority: Minor

Some potential issues with RCFile:

1. Remove the unneeded synchronized modifiers on the methods of RCFile. According to Yongqiang He, the class is not meant to be thread-safe (and it is not), so we might as well get rid of the confusing, performance-impacting lock acquisitions.

2. The record length is overstated for compressed files. IIUC, the key compression happens after we have already written the record length, so the total is computed from the uncompressed key length:

{code}
int keyLength = key.getSize();
if (keyLength < 0) {
  throw new IOException("negative length keys not allowed: " + key);
}

// Both lengths below are written before the key is compressed,
// so they are based on the uncompressed key size.
out.writeInt(keyLength + valueLength); // total record length
out.writeInt(keyLength); // key portion length
if (!isCompressed()) {
  out.writeInt(keyLength);
  key.write(out); // key
} else {
  keyCompressionBuffer.reset();
  keyDeflateFilter.resetState();
  key.write(keyDeflateOut);
  keyDeflateOut.flush();
  keyDeflateFilter.finish();
  // The compressed size only becomes known here.
  int compressedKeyLen = keyCompressionBuffer.getLength();
  out.writeInt(compressedKeyLen);
  out.write(keyCompressionBuffer.getData(), 0, compressedKeyLen);
}
{code}

3. For sequence file compatibility, the field immediately after the record length should be the compressed key length, not the uncompressed key length.
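
To make items 2 and 3 concrete, here is a minimal, standalone sketch of one possible ordering: compress the key first, then write the lengths from the on-disk size. It is not the RCFile code and not a proposed patch. The class and method names (RecordHeaderSketch, compress, writeRecordHeader) are hypothetical, java.util.zip stands in for the Hadoop codec, and placing the uncompressed key length as the third field is an assumption made only for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.zip.DeflaterOutputStream;

// Hypothetical illustration only; none of these names exist in Hive.
final class RecordHeaderSketch {

  /** Compresses the key up front so its stored size is known before any length is written. */
  static byte[] compress(byte[] rawKey) throws IOException {
    ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    try (DeflaterOutputStream deflateOut = new DeflaterOutputStream(buffer)) {
      deflateOut.write(rawKey);
    } // closing the stream finishes the deflate stream
    return buffer.toByteArray();
  }

  /** Writes the record header and key, using the stored (possibly compressed) key size. */
  static void writeRecordHeader(DataOutputStream out, byte[] rawKey,
                                int valueLength, boolean compressed) throws IOException {
    if (valueLength < 0) {
      throw new IOException("negative value length not allowed: " + valueLength);
    }
    byte[] storedKey = compressed ? compress(rawKey) : rawKey;

    out.writeInt(storedKey.length + valueLength); // total record length, as stored
    out.writeInt(storedKey.length);               // key portion length, as stored
    out.writeInt(rawKey.length);                  // uncompressed key length (assumed position)
    out.write(storedKey);                         // key bytes
  }

  public static void main(String[] args) throws IOException {
    byte[] rawKey = new byte[1000]; // all zeros, so it compresses well
    ByteArrayOutputStream sink = new ByteArrayOutputStream();
    writeRecordHeader(new DataOutputStream(sink), rawKey, 50, true);
    System.out.println("bytes written for header and key: " + sink.size());
  }
}
```

In the uncompressed case this sketch writes the same key length twice, which matches the existing uncompressed path quoted above. Only the compressed path differs, since the first two fields then reflect what is actually stored.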