[ https://issues.apache.org/jira/browse/HIVE-4423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13643778#comment-13643778 ]
Hudson commented on HIVE-4423:
------------------------------

Integrated in Hive-trunk-hadoop2 #179 (See [https://builds.apache.org/job/Hive-trunk-hadoop2/179/])
    HIVE-4423 : Improve RCFile::sync(long) 10x (Gopal V via Ashutosh Chauhan) (Revision 1476648)

    Result = FAILURE
hashutosh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1476648
Files :
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/io/RCFile.java

> Improve RCFile::sync(long) 10x
> ------------------------------
>
>                 Key: HIVE-4423
>                 URL: https://issues.apache.org/jira/browse/HIVE-4423
>             Project: Hive
>          Issue Type: Improvement
>         Environment: Ubuntu LXC (1 SSD, 1 disk, 32 GB of RAM)
>            Reporter: Gopal V
>            Assignee: Gopal V
>            Priority: Minor
>              Labels: optimization
>             Fix For: 0.12.0
>
>         Attachments: HIVE-4423.patch
>
>
> RCFile::sync(long) takes approximately 1 second every time it is called
> because of the inner loops in the function.
> From what was observed with HDFS-4710, single-byte reads are an order of
> magnitude slower than reads into a larger 512-byte buffer.
> Even when disk I/O is buffered to this size, there is overhead due to the
> synchronized read() methods in the BlockReaderLocal and RemoteBlockReader
> classes.
> Replacing the readByte() calls in RCFile.sync(long) with a readFully() call
> on a 512-byte buffer will speed up this function by more than 10x.
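
The idea in the description, scanning for the sync marker in 512-byte readFully() chunks instead of issuing one readByte() call per byte, can be sketched roughly as follows. This is a minimal illustration over a plain java.io.DataInputStream and is not the code in HIVE-4423.patch. The class name SyncScanSketch, the methods findMarker and indexOf, and the CHUNK constant are hypothetical names chosen for the example, and it assumes the caller knows how many bytes remain in the stream.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

final class SyncScanSketch {
    // Assumed chunk size, mirroring the 512-byte buffer mentioned in the issue.
    static final int CHUNK = 512;

    /**
     * Returns the offset (relative to where the stream started) of the first
     * occurrence of marker within the next length bytes, or -1 if it is absent.
     * The stream must hold at least length bytes, otherwise readFully throws
     * EOFException.
     */
    static long findMarker(DataInputStream in, byte[] marker, long length)
            throws IOException {
        if (marker.length == 0) {
            throw new IllegalArgumentException("marker must not be empty");
        }
        int overlap = marker.length - 1;   // tail kept so matches may straddle chunks
        byte[] buf = new byte[overlap + CHUNK];
        int carried = 0;                   // bytes carried over at buf[0..carried)
        long consumed = 0;                 // bytes read from the stream so far

        while (consumed < length) {
            int n = (int) Math.min(CHUNK, length - consumed);
            in.readFully(buf, carried, n); // one bulk read instead of n readByte() calls
            int valid = carried + n;
            long base = consumed - carried; // stream offset that buf[0] corresponds to

            for (int i = 0; i + marker.length <= valid; i++) {
                int j = 0;
                while (j < marker.length && buf[i + j] == marker[j]) {
                    j++;
                }
                if (j == marker.length) {
                    return base + i;
                }
            }

            consumed += n;
            // Move the last (marker.length - 1) bytes to the front for the next pass.
            int keep = Math.min(overlap, valid);
            System.arraycopy(buf, valid - keep, buf, 0, keep);
            carried = keep;
        }
        return -1;
    }

    /** Convenience wrapper for in-memory data. */
    static long indexOf(byte[] data, byte[] marker) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            return findMarker(in, marker, data.length);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] marker = {1, 2, 3, 4};
        byte[] data = new byte[2000];
        // Place the marker across the first 512-byte chunk boundary.
        System.arraycopy(marker, 0, data, 510, marker.length);
        System.out.println(indexOf(data, marker));
    }
}
```

Keeping the last marker.length - 1 bytes of each chunk is what lets the scan find a marker that spans two reads. Because that tail is one byte shorter than the marker, it can never produce a duplicate match on its own.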