[ https://issues.apache.org/jira/browse/HIVE-5504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13897099#comment-13897099 ]
Sushanth Sowmyan commented on HIVE-5504:
----------------------------------------

Testing-related note:

This bug is interesting in that Hive as well as Pig are able to read the data irrespective of which compression format was actually used. That is, the bug is that when we write to a compressed ORC table while specifying a compression of, say, SNAPPY, Pig using HCat will write out the table using the default ORC compression, which is ZLIB, irrespective of what the metadata indicates. This, however, is not a problem for Hive, in that the end data is still readable via Hive and HCatalog/Pig, so we don't get a read error. The read error occurs when external tools that expect the file to be Snappy-compressed find that it is actually zlib-compressed. It can also be a performance/size issue if Snappy is desired over zlib but we still retain zlib.

Thus, testing by virtue of readability/non-readability, or by checking for errors, is not possible here. Instead, end-to-end tests are the way to go, and I've done the following:

a) Create a table using hive -e, specifying orc.compress=SNAPPY.
b) Use pig -useHCatalog and write to the aforesaid table.
c) Run hive --service orcfiledump on the file inside the table; it shows which compression format it sees.

Without this patch, it indicates ZLIB, and with it, it indicates SNAPPY.

In addition, no previous tests fail (there are no regressions).

> OrcOutputFormat honors compression properties only from within hive
> ---------------------------------------------------------------------
>
>                 Key: HIVE-5504
>                 URL: https://issues.apache.org/jira/browse/HIVE-5504
>             Project: Hive
>          Issue Type: Bug
>          Components: HCatalog
>    Affects Versions: 0.11.0, 0.12.0
>            Reporter: Venkat Ranganathan
>         Attachments: HIVE-5504.patch
>
>
> When we import data into a HCatalog table created with the following storage
> description
> .. stored as orc tblproperties ("orc.compress"="SNAPPY")
> the resultant orc file still uses the default zlib compression
> It looks like HCatOutputFormat is ignoring the tblproperties specified.
> show tblproperties shows that the table indeed has the properties properly
> saved.
> An insert/select into the table has the resulting orc file honor the tbl
> property.

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
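Step (c) of the end-to-end test in the comment above could be scripted roughly as follows. This is only a sketch, not part of the attached patch. It assumes that the orcfiledump output contains a line of the form "Compression: <CODEC>", that hive is on the PATH, and that the caller supplies the path of a file inside the table. The helper names parse_compression, dump_orc_file and check_compression are hypothetical.

```python
import re
import subprocess

# Assumption: the dump reports the codec on a line such as "Compression: SNAPPY".
# A line like "Compression size: ..." is deliberately not matched.
_COMPRESSION_RE = re.compile(r"^\s*Compression:\s*(\w+)\s*$", re.MULTILINE)


def parse_compression(dump_text):
    """Return the codec name found in orcfiledump output, or None if absent."""
    match = _COMPRESSION_RE.search(dump_text)
    return match.group(1).upper() if match else None


def dump_orc_file(path):
    """Run `hive --service orcfiledump <path>` and return its standard output."""
    result = subprocess.run(
        ["hive", "--service", "orcfiledump", path],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout


def check_compression(path, expected="SNAPPY"):
    """Raise AssertionError if the file was not written with the expected codec."""
    actual = parse_compression(dump_orc_file(path))
    if actual != expected:
        raise AssertionError(
            "expected %s compression, orcfiledump reported %s" % (expected, actual)
        )
    return actual
```

Steps (a) and (b) still have to be run first with hive -e and pig -useHCatalog. After that, check_compression can be called on a file inside the table: without the patch it would be expected to fail with ZLIB reported, and with the patch it should pass.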