[ https://issues.apache.org/jira/browse/HIVE-5504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13790782#comment-13790782 ]
Sushanth Sowmyan commented on HIVE-5504:
----------------------------------------

Looking through the code, this is not directly an HCat issue so much as it is an Orc issue. OrcOutputFormat applies the compression (and other) overrides from the table properties only when instantiated via getHiveRecordWriter, not via getRecordWriter, so this support would need to be added to Orc to make it work from outside Hive.

{code}
  public RecordWriter<NullWritable, OrcSerdeRow>
      getRecordWriter(FileSystem fileSystem, JobConf conf, String name,
                      Progressable reporter) throws IOException {
    return new OrcRecordWriter(new Path(name), OrcFile.writerOptions(conf));
  }
{code}

versus:

{code}
  public FileSinkOperator.RecordWriter
      getHiveRecordWriter(JobConf conf,
                          Path path,
                          Class<? extends Writable> valueClass,
                          boolean isCompressed,
                          Properties tableProperties,
                          Progressable reporter) throws IOException {
    OrcFile.WriterOptions options = OrcFile.writerOptions(conf);
    if (tableProperties.containsKey(OrcFile.STRIPE_SIZE)) {
      options.stripeSize(Long.parseLong(
          tableProperties.getProperty(OrcFile.STRIPE_SIZE)));
    }

    if (tableProperties.containsKey(OrcFile.COMPRESSION)) {
      options.compress(CompressionKind.valueOf(
          tableProperties.getProperty(OrcFile.COMPRESSION)));
    }

    if (tableProperties.containsKey(OrcFile.COMPRESSION_BLOCK_SIZE)) {
      options.bufferSize(Integer.parseInt(
          tableProperties.getProperty(OrcFile.COMPRESSION_BLOCK_SIZE)));
    }

    if (tableProperties.containsKey(OrcFile.ROW_INDEX_STRIDE)) {
      options.rowIndexStride(Integer.parseInt(
          tableProperties.getProperty(OrcFile.ROW_INDEX_STRIDE)));
    }

    if (tableProperties.containsKey(OrcFile.ENABLE_INDEXES)) {
      if ("false".equals(tableProperties.getProperty(OrcFile.ENABLE_INDEXES))) {
        options.rowIndexStride(0);
      }
    }

    if (tableProperties.containsKey(OrcFile.BLOCK_PADDING)) {
      options.blockPadding(Boolean.parseBoolean(
          tableProperties.getProperty(OrcFile.BLOCK_PADDING)));
    }

    return new OrcRecordWriter(path, options);
  }
{code}

The fix could be simple (though not trivial): change OrcFile.writerOptions to also take in overrides from the conf. A sketch of that approach follows the quoted issue description below.

> HCatOutputFormat does not honor orc.compress tblproperty
> --------------------------------------------------------
>
>                 Key: HIVE-5504
>                 URL: https://issues.apache.org/jira/browse/HIVE-5504
>             Project: Hive
>          Issue Type: Bug
>          Components: HCatalog
>    Affects Versions: 0.11.0, 0.12.0
>            Reporter: Venkat Ranganathan
>
> When we import data into an HCatalog table created with the following storage
> description
> .. stored as orc tblproperties ("orc.compress"="SNAPPY")
> the resultant ORC file still uses the default zlib compression.
> It looks like HCatOutputFormat is ignoring the specified tblproperties.
> show tblproperties shows that the table does indeed have the properties saved.
> An insert/select into the table does produce an ORC file that honors the table
> property.
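For illustration, a minimal sketch of what that change could look like, assuming the caller (e.g. HCatOutputFormat) has already copied the orc.* table properties into the Configuration; the helper name applyConfOverrides is hypothetical, not existing API:

{code}
  // Hypothetical helper: apply the same orc.* overrides that
  // getHiveRecordWriter reads from tableProperties, but sourced from the
  // Configuration, so non-Hive callers of getRecordWriter pick them up too.
  // Assumes the orc.* keys were copied into conf by the caller.
  static OrcFile.WriterOptions applyConfOverrides(Configuration conf,
      OrcFile.WriterOptions options) {
    String stripeSize = conf.get(OrcFile.STRIPE_SIZE);
    if (stripeSize != null) {
      options.stripeSize(Long.parseLong(stripeSize));
    }

    String compression = conf.get(OrcFile.COMPRESSION);
    if (compression != null) {
      options.compress(CompressionKind.valueOf(compression));
    }

    String bufferSize = conf.get(OrcFile.COMPRESSION_BLOCK_SIZE);
    if (bufferSize != null) {
      options.bufferSize(Integer.parseInt(bufferSize));
    }

    String rowIndexStride = conf.get(OrcFile.ROW_INDEX_STRIDE);
    if (rowIndexStride != null) {
      options.rowIndexStride(Integer.parseInt(rowIndexStride));
    }

    // As in getHiveRecordWriter, disabling indexes means a stride of 0.
    if ("false".equals(conf.get(OrcFile.ENABLE_INDEXES))) {
      options.rowIndexStride(0);
    }

    String blockPadding = conf.get(OrcFile.BLOCK_PADDING);
    if (blockPadding != null) {
      options.blockPadding(Boolean.parseBoolean(blockPadding));
    }

    return options;
  }
{code}

With something like this in place, OrcFile.writerOptions(conf) could apply the overrides before returning, and both getRecordWriter and getHiveRecordWriter would honor the same settings.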