[ https://issues.apache.org/jira/browse/HIVE-4324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14091873#comment-14091873 ]
Lefty Leverenz commented on HIVE-4324:
--------------------------------------

This added configuration parameter *hive.exec.orc.dictionary.key.size.threshold* to HiveConf.java in 0.12.0. It's documented in the wiki here:

* [Configuration Properties -- hive.exec.orc.dictionary.key.size.threshold | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.exec.orc.dictionary.key.size.threshold]

> ORC Turn off dictionary encoding when number of distinct keys is greater than threshold
> ---------------------------------------------------------------------------------------
>
>                 Key: HIVE-4324
>                 URL: https://issues.apache.org/jira/browse/HIVE-4324
>             Project: Hive
>          Issue Type: Sub-task
>          Components: File Formats
>    Affects Versions: 0.11.0
>            Reporter: Kevin Wilfong
>            Assignee: Kevin Wilfong
>             Fix For: 0.12.0
>
>         Attachments: HIVE-4324.1.patch.txt, HIVE-4324.D12045.1.patch,
> HIVE-4324.D12045.2.patch, HIVE-4324.D12045.2.patch, HIVE-4324.D12045.3.patch
>
>
> Add a configurable threshold so that if the number of distinct values in a
> string column is greater than that fraction of non-null values, dictionary
> encoding is turned off.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
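
For readers who want to see the rule from the issue description in concrete terms, here is a minimal Python sketch of the decision it describes. It is not the ORC writer code from the patch: the function name `use_dictionary_encoding`, the in-memory list of values, the handling of an all-null column, and the sample threshold of 0.8 are all assumptions made for illustration.

```python
def use_dictionary_encoding(values, threshold):
    """Decide whether a string column keeps dictionary encoding.

    Hypothetical helper: `values` is the column as a list (None marks a null),
    `threshold` is the fraction set by the configuration parameter.
    """
    non_null = [v for v in values if v is not None]
    if not non_null:
        # Assumption: with no non-null values there is nothing to compare,
        # so this sketch leaves dictionary encoding on.
        return True
    distinct = len(set(non_null))
    # Turn dictionary encoding off when the number of distinct values
    # is greater than threshold * (number of non-null values).
    return distinct <= threshold * len(non_null)


# Illustrative threshold only; check the wiki entry for the actual default.
threshold = 0.8
repetitive = ["a", "b", "a", "a", None]   # 2 distinct out of 4 non-null
unique = ["a", "b", "c", "d"]             # 4 distinct out of 4 non-null
keep_repetitive = use_dictionary_encoding(repetitive, threshold)
keep_unique = use_dictionary_encoding(unique, threshold)
```

In this sketch the repetitive column keeps dictionary encoding and the all-unique column does not, which matches the intent stated in the issue: a dictionary stops paying off once most values are distinct.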