Gopal V created HIVE-4423:
-----------------------------

             Summary: Improve RCFile::sync(long) 10x
                 Key: HIVE-4423
                 URL: https://issues.apache.org/jira/browse/HIVE-4423
             Project: Hive
          Issue Type: Improvement
         Environment: Ubuntu LXC (1 SSD, 1 disk, 32 gigs of RAM)
            Reporter: Gopal V
            Assignee: Gopal V
            Priority: Minor
             Fix For: 0.11.0

RCFile::sync(long) takes approximately 1 second every time it is called because of the inner loops in the function.

From what was observed with HDFS-4710, single-byte reads are an order of magnitude slower than larger 512-byte buffer reads. Even when disk I/O is buffered to this size, there is overhead due to the synchronized read() methods in the BlockReaderLocal and RemoteBlockReader classes.

Replacing the readByte() calls in RCFile.sync(long) with a readFully() call on a 512-byte buffer will speed this function up by more than 10x.
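
To illustrate the idea, here is a minimal sketch of a buffered marker scan. It is not the actual RCFile code or the patch for this issue. The class name SyncScanSketch, the method findMarker, and the 16-byte marker used in main are hypothetical, chosen only for the example. The sketch uses plain java.io streams and InputStream.read(byte[], int, int) rather than readFully(), so that a short final chunk does not raise EOFException. The point is that each 512-byte chunk costs one call into the stream, where a readByte() loop would make 512 calls (and take the lock 512 times if read() is synchronized).

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

final class SyncScanSketch {
    // Chunk size taken from the issue description (512-byte reads).
    static final int BUFFER_SIZE = 512;

    /**
     * Returns the stream offset of the first occurrence of marker,
     * or -1 if the stream ends before the marker is found.
     */
    static long findMarker(InputStream in, byte[] marker) throws IOException {
        if (marker.length == 0) {
            throw new IllegalArgumentException("marker must not be empty");
        }
        // Keep the last (marker.length - 1) bytes of each chunk so that a
        // marker straddling a chunk boundary is still detected.
        int overlap = marker.length - 1;
        byte[] buf = new byte[overlap + BUFFER_SIZE];
        int carried = 0; // bytes kept from the previous chunk, at buf[0..carried)
        long base = 0;   // stream offset that corresponds to buf[0]

        while (true) {
            // One call fetches up to BUFFER_SIZE bytes.
            int n = in.read(buf, carried, BUFFER_SIZE);
            if (n < 0) {
                return -1; // end of stream, marker not found
            }
            int valid = carried + n;
            for (int i = 0; i + marker.length <= valid; i++) {
                if (matchesAt(buf, i, marker)) {
                    return base + i;
                }
            }
            // Slide the tail of the buffer to the front for the next round.
            int keep = Math.min(overlap, valid);
            System.arraycopy(buf, valid - keep, buf, 0, keep);
            base += valid - keep;
            carried = keep;
        }
    }

    private static boolean matchesAt(byte[] buf, int start, byte[] marker) {
        for (int j = 0; j < marker.length; j++) {
            if (buf[start + j] != marker[j]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical 16-byte marker placed across the first 512-byte boundary.
        byte[] marker = new byte[16];
        for (int i = 0; i < marker.length; i++) {
            marker[i] = (byte) (0xA0 + i);
        }
        byte[] data = new byte[2000];
        System.arraycopy(marker, 0, data, 505, marker.length);

        long offset = findMarker(new ByteArrayInputStream(data), marker);
        System.out.println("marker found at offset " + offset);
    }
}
```

The overlap region is the main design choice here: without it, a marker that begins near the end of one chunk and finishes in the next would be missed. An in-memory ByteArrayInputStream will not show the speedup described above; the gain is expected only where each read() call carries fixed overhead, as with the HDFS block readers mentioned in this issue.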