The max cardinality of the default dictionary is 2 million. Why encode SALE_ORD_ID as a dictionary? If it is an int, you can use the integer dictionary.
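
To put the numbers in perspective, here is a rough back-of-the-envelope sketch in Python. It is not Kylin code. It takes the cardinality reported in the stack-trace message, the 2 million figure above, and assumes the "2GB" in the error means 2^31 bytes. The helper names are made up for illustration, and the estimate ignores both the prefix sharing and the node overhead of a real trie dictionary.

```python
# Naive size estimate only; names and the 2^31-byte reading of "2GB" are assumptions.
TWO_GB = 2**31                      # assumed byte limit behind "cannot be bigger than 2GB"
DEFAULT_MAX_CARDINALITY = 2_000_000  # default dictionary limit quoted in this reply
CARDINALITY = 157_644_463            # SALE_ORD_ID cardinality reported in the thread


def max_avg_bytes_per_value(cardinality: int, limit_bytes: int = TWO_GB) -> float:
    """Average bytes each distinct value may use before the total passes the limit."""
    return limit_bytes / cardinality


def naive_size_bytes(cardinality: int, avg_value_bytes: int) -> int:
    """Total size if every value were stored verbatim, with no prefix sharing."""
    return cardinality * avg_value_bytes


budget = max_avg_bytes_per_value(CARDINALITY)
ratio = CARDINALITY / DEFAULT_MAX_CARDINALITY

print(f"byte budget per distinct value: {budget:.1f}")
print(f"cardinality vs. default limit: {ratio:.1f}x")
```

Under these assumptions the column is far above the default limit and leaves only a small byte budget per value, which is why storing it as a compact integer rather than as a string looks attractive.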

Please check:
http://apache-kylin.74782.x6.nabble.com/create-dictionary-error-td7155.html
http://mail-archives.apache.org/mod_mbox/kylin-user/201702.mbox/%3CCAEcyM17BTkhVpFcZLP6%2Boawx%3D1eap%3DZS_ER1HJbhevJPBE71-g%40mail.gmail.com%3E

2017-02-14 10:14 GMT+01:00 仇同心 <[email protected]>:

> Hi all,
>
> The first step of the cube merge fails with this error:
>
> java.lang.RuntimeException: Too big dictionary, dictionary cannot be bigger than 2GB
>     at org.apache.kylin.dict.TrieDictionaryBuilder.buildTrieBytes(TrieDictionaryBuilder.java:421)
>     at org.apache.kylin.dict.TrieDictionaryBuilder.build(TrieDictionaryBuilder.java:408)
>     at org.apache.kylin.dict.DictionaryGenerator$StringDictBuilder.build(DictionaryGenerator.java:165)
>     at org.apache.kylin.dict.DictionaryGenerator.buildDictionary(DictionaryGenerator.java:81)
>     at org.apache.kylin.dict.DictionaryGenerator.buildDictionary(DictionaryGenerator.java:73)
>     at org.apache.kylin.dict.DictionaryGenerator.mergeDictionaries(DictionaryGenerator.java:102)
>     at org.apache.kylin.dict.DictionaryManager.mergeDictionary(DictionaryManager.java:268)
>     at org.apache.kylin.engine.mr.steps.MergeDictionaryStep.mergeDictionaries(MergeDictionaryStep.java:145)
>     at org.apache.kylin.engine.mr.steps.MergeDictionaryStep.makeDictForNewSegment(MergeDictionaryStep.java:135)
>     at org.apache.kylin.engine.mr.steps.MergeDictionaryStep.doWork(MergeDictionaryStep.java:67)
>     at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:113)
>     at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:57)
>     at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:113)
>     at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:136)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:745)
>
> “SALE_ORD_ID” Cardinality: 157644463
> SALE COUNT_DISTINCT Value: SALE_ORD_ID, Type: column bitmap
>
> Does this mean that high-cardinality fields cannot be used for accurate
> (precise) count_distinct measures?
