[ https://issues.apache.org/jira/browse/HIVE-4324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13661151#comment-13661151 ]
Owen O'Malley commented on HIVE-4324:
-------------------------------------

We should get this committed. Kevin, we can pull the incremental work out to a separate jira. One remaining concern is that we should provide a compatible OrcFile.createWriter method so that code doesn't break when users upgrade from 0.11 to 0.12.

> ORC Turn off dictionary encoding when number of distinct keys is greater than threshold
> ---------------------------------------------------------------------------------------
>
>                 Key: HIVE-4324
>                 URL: https://issues.apache.org/jira/browse/HIVE-4324
>             Project: Hive
>          Issue Type: Sub-task
>          Components: File Formats
>    Affects Versions: 0.11.0
>            Reporter: Kevin Wilfong
>            Assignee: Kevin Wilfong
>         Attachments: HIVE-4324.1.patch.txt
>
>
> Add a configurable threshold so that if the number of distinct values in a
> string column is greater than that fraction of non-null values, dictionary
> encoding is turned off.
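
For illustration only, the threshold rule in the issue description could be sketched roughly as follows. This is not the code from HIVE-4324.1.patch.txt and does not use any Hive or ORC API. The class name DictionaryThresholdSketch, the method useDictionary, and the choice to keep dictionary encoding for a column with no non-null values are all assumptions made for this example.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of the rule in the issue description; not Hive's implementation.
class DictionaryThresholdSketch {

    // Returns true if dictionary encoding should stay on for this string column.
    // "threshold" is the configurable fraction of non-null values (e.g. 0.8).
    static boolean useDictionary(List<String> values, double threshold) {
        Set<String> distinct = new HashSet<>();
        long nonNull = 0;
        for (String v : values) {
            if (v != null) {
                nonNull++;
                distinct.add(v);
            }
        }
        if (nonNull == 0) {
            // Assumption: with nothing to encode, keep the default (dictionary on).
            return true;
        }
        // Turn the dictionary off when distinct values exceed threshold * non-null values.
        return distinct.size() <= threshold * nonNull;
    }

    public static void main(String[] args) {
        List<String> repetitive = Arrays.asList("a", "a", "b", "a");
        List<String> mostlyUnique = Arrays.asList("a", "b", "c", "d", null);
        System.out.println(useDictionary(repetitive, 0.8));
        System.out.println(useDictionary(mostlyUnique, 0.8));
    }
}
```

In this sketch the first column has 2 distinct values out of 4 non-null values and keeps its dictionary, while the second has 4 distinct values out of 4 and would fall back to direct encoding. A real writer would presumably make this decision per stripe from running counts rather than by rescanning the values.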