[ https://issues.apache.org/jira/browse/HIVE-2266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13095747#comment-13095747 ]
Vaibhav Aggarwal commented on HIVE-2266:
----------------------------------------

This patch attempts to fix a bug in the existing functionality in two ways:

1. In HiveFileFormatUtils.java, the wrong JobConf is being passed, which is clear from the context.
2. In other cases the compression parameters are not being set at all.

The only difference this patch produces from the current behavior is smaller file sizes on the file system. I am not sure how to write a Hive query that can verify the difference in file sizes. Do you have any ideas that could help me add some quick tests for this? The current test exercises the code path, checking only that it does not result in any Exception or Error; it does not compare file sizes.

> Really? Which platforms are you talking about? Can you tell me how to
> reproduce this interesting behavior?

Hadoop loads native compression libraries. I believe they are platform dependent, so I do not assume that they always produce the same compression ratio. Please correct me if I am wrong here. In any case, I think this is broken existing functionality in Hive which we should fix.

> Fix compression parameters
> --------------------------
>
>                 Key: HIVE-2266
>                 URL: https://issues.apache.org/jira/browse/HIVE-2266
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Vaibhav Aggarwal
>            Assignee: Vaibhav Aggarwal
>         Attachments: HIVE-2266-2.patch, HIVE-2266.patch
>
>
> There are a number of places where compression values are not set correctly
> in FileSinkOperator. This results in uncompressed files.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
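To illustrate the bug pattern the comment describes (compression parameters applied to one configuration object while the output writer reads another, so output silently stays uncompressed), here is a minimal self-contained Java sketch. It uses java.util.Properties as a stand-in for Hadoop's JobConf; the class and method names here are hypothetical illustrations, not code from the actual HIVE-2266 patch, though the property keys mirror the classic mapred.output.compress settings.

```java
import java.util.Properties;

public class CompressionConfDemo {

    // Stand-in for setting compression parameters on a JobConf.
    // The keys mirror the old Hadoop mapred.output.compress settings.
    static void configureCompression(Properties conf) {
        conf.setProperty("mapred.output.compress", "true");
        conf.setProperty("mapred.output.compression.codec",
                "org.apache.hadoop.io.compress.GzipCodec");
    }

    public static void main(String[] args) {
        Properties jobConf = new Properties();   // conf the writer actually reads
        Properties otherConf = new Properties(); // the "wrong" conf

        // Buggy pattern: compression is configured on otherConf,
        // but the record writer is created from jobConf, so the
        // settings are lost and output comes out uncompressed.
        configureCompression(otherConf);
        System.out.println("buggy compress="
                + jobConf.getProperty("mapred.output.compress", "false"));

        // Fixed pattern: set the parameters on the conf that is
        // actually passed to the writer.
        configureCompression(jobConf);
        System.out.println("fixed compress="
                + jobConf.getProperty("mapred.output.compress", "false"));
    }
}
```

The point of the sketch is only that both fixes in the patch reduce to the same invariant: the compression flags must end up on the configuration object the FileSinkOperator's writer actually consults.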