[ https://issues.apache.org/jira/browse/HIVE-4423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ashutosh Chauhan updated HIVE-4423:
-----------------------------------
       Resolution: Fixed
    Fix Version/s: 0.12.0 (was: 0.11.0)
           Status: Resolved (was: Patch Available)

Committed to trunk. Thanks, Gopal!

> Improve RCFile::sync(long) 10x
> ------------------------------
>
>                 Key: HIVE-4423
>                 URL: https://issues.apache.org/jira/browse/HIVE-4423
>             Project: Hive
>          Issue Type: Improvement
>         Environment: Ubuntu LXC (1 SSD, 1 disk, 32 GB of RAM)
>            Reporter: Gopal V
>            Assignee: Gopal V
>            Priority: Minor
>              Labels: optimization
>             Fix For: 0.12.0
>
>         Attachments: HIVE-4423.patch
>
>
> RCFile::sync(long) takes approximately 1 second every time it is called
> because of the inner loops in the function.
> From what was observed with HDFS-4710, single-byte reads are an order of
> magnitude slower than larger 512-byte buffer reads.
> Even when disk I/O is buffered to this size, there is overhead due to the
> synchronized read() methods in the BlockReaderLocal & RemoteBlockReader
> classes.
> Replacing the readByte() calls in RCFile.sync(long) with a readFully() call
> on a 512-byte buffer will speed this function up by more than 10x.
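
The change described above can be illustrated with a minimal sketch. This is not the code from HIVE-4423.patch, which is not reproduced in this message; the class name SyncScanSketch, the method findMarker, and the marker layout are hypothetical and chosen only for illustration. The sketch scans a stream for a byte marker by calling readFully() on 512-byte chunks instead of calling readByte() once per byte, and it carries the last few bytes of each chunk forward so that a marker straddling a chunk boundary is still found.

```java
import java.io.DataInputStream;
import java.io.IOException;

/** Hypothetical illustration only; not the actual RCFile implementation. */
final class SyncScanSketch {
    /** Chunk size taken from the 512-byte figure in the issue description. */
    static final int BUFFER_SIZE = 512;

    /**
     * Returns the stream offset of the first occurrence of marker within the
     * next length bytes of in, or -1 if the marker does not occur.
     * Assumes marker.length >= 1 and that at least length bytes are available.
     */
    static long findMarker(DataInputStream in, long length, byte[] marker)
            throws IOException {
        // Room for one chunk plus the tail carried over from the previous chunk.
        byte[] buf = new byte[marker.length - 1 + BUFFER_SIZE];
        int carried = 0;    // bytes kept from the previous chunk
        long consumed = 0;  // bytes read from the stream so far

        while (consumed < length) {
            int n = (int) Math.min(BUFFER_SIZE, length - consumed);
            // One bulk read per chunk instead of one readByte() call per byte.
            in.readFully(buf, carried, n);
            int valid = carried + n;
            long base = consumed - carried; // stream offset of buf[0]

            for (int i = 0; i + marker.length <= valid; i++) {
                if (matchesAt(buf, i, marker)) {
                    return base + i;
                }
            }

            consumed += n;
            // Keep the last marker.length - 1 bytes: a marker may start there
            // and finish in the next chunk.
            carried = Math.min(marker.length - 1, valid);
            System.arraycopy(buf, valid - carried, buf, 0, carried);
        }
        return -1;
    }

    /** True if marker occurs in buf starting at index pos. */
    private static boolean matchesAt(byte[] buf, int pos, byte[] marker) {
        for (int j = 0; j < marker.length; j++) {
            if (buf[pos + j] != marker[j]) {
                return false;
            }
        }
        return true;
    }
}
```

The in-memory scan over each chunk does the same comparisons a byte-at-a-time loop would, but the number of calls into the underlying stream drops by roughly the chunk size, which is where the issue description locates the overhead (the synchronized read() methods).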