Right, an ultra-high-cardinality column is not suitable for dictionary encoding. Please consider other encodings.
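[Editor's note] To make the trade-off behind this advice concrete, here is a standalone sketch (illustrative only, not Kylin internals): dictionary encoding must hold a table of every distinct value in memory, so its footprint grows with cardinality (96,111,330 distinct values in this thread), while a fixed-length encoding pads or truncates each value to a constant width and needs no per-value table at all.

```python
def dict_encode(values):
    """Dictionary encoding: map each distinct value to a small integer id.
    The `table` must hold ALL distinct values, so memory grows with cardinality."""
    table = {}
    ids = []
    for v in values:
        if v not in table:
            table[v] = len(table)
        ids.append(table[v])
    return ids, table

def fixed_length_encode(value, width=8):
    """Fixed-length encoding: pad/truncate to exactly `width` bytes.
    No lookup table is kept, so memory use is independent of cardinality."""
    raw = value.encode("utf-8")[:width]
    return raw.ljust(width, b"\x00")

ids, table = dict_encode(["u1", "u2", "u1"])
print(ids)                        # [0, 1, 0]
print(len(table))                 # 2 distinct values held in memory
print(fixed_length_encode("u1"))  # b'u1\x00\x00\x00\x00\x00\x00'
```

With ~96 million distinct values, the dictionary's in-memory table is what exhausts the heap during "Build Dimension Dictionary"; a fixed-length (or integer) encoding sidesteps that table entirely, at the cost of a wider row key.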
On Thu, Sep 29, 2016 at 9:29 AM, Ashika Umanga Umagiliya <[email protected]> wrote:

I think I found some explanation here:

https://github.com/KylinOLAP/Kylin/issues/364

On Thu, Sep 29, 2016 at 9:55 AM, Ashika Umanga Umagiliya <[email protected]> wrote:

Finally, the 4th step failed without throwing an OOM exception. The logged error was:

-------

java.lang.RuntimeException: Failed to create dictionary on RAT_LOG_FILTERED.RAT_LOG_APRL_MAY_2015.EASY_ID
    at org.apache.kylin.dict.DictionaryManager.buildDictionary(DictionaryManager.java:325)
    at org.apache.kylin.cube.CubeManager.buildDictionary(CubeManager.java:185)
    at org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:50)
    at org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:41)
    at org.apache.kylin.engine.mr.steps.CreateDictionaryJob.run(CreateDictionaryJob.java:56)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
    at org.apache.kylin.engine.mr.common.HadoopShellExecutable.doWork(HadoopShellExecutable.java:63)
    at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:113)
    at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:57)
    at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:113)
    at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:136)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.IllegalArgumentException: Too high cardinality is not suitable for dictionary -- cardinality: 96111330
    at org.apache.kylin.dict.DictionaryGenerator.buildDictionary(DictionaryGenerator.java:96)
    at org.apache.kylin.dict.DictionaryGenerator.buildDictionary(DictionaryGenerator.java:73)
    at org.apache.kylin.dict.DictionaryManager.buildDictionary(DictionaryManager.java:321)
    ... 14 more

result code:2

On Thu, Sep 29, 2016 at 9:15 AM, Ashika Umanga Umagiliya <[email protected]> wrote:

Thanks for the tips.

I increased the memory up to 28 GB (of the 32 GB total on the Kylin node), but I could still see the Java process (it's the only Java process on the server) keep growing in memory until it finally crashed with an OutOfMemoryError.

This happens in the 4th step ("#4 Step Name: Build Dimension Dictionary", shown with "Duration: 0 Seconds"), which runs for about 25 minutes before the crash. Why does this step need so much memory on the Kylin side? Also, I couldn't find any logs to investigate the issue. Apart from the GC dump, where else can I find useful information?

On Wed, Sep 28, 2016 at 4:55 PM, Li Yang <[email protected]> wrote:

Increase memory in $KYLIN_HOME/bin/setenv.sh

# (if you're deploying Kylin on a powerful server and want to replace the default conservative settings)
# uncomment the following for it to take effect
export KYLIN_JVM_SETTINGS=...
# export KYLIN_JVM_SETTINGS=...

The commented line is a reference.

Cheers
Yang

On Wed, Sep 28, 2016 at 3:06 PM, Ashika Umanga Umagiliya <[email protected]> wrote:

It looks like Tomcat crashed after running out of memory. I saw this in "kylin.out":

#
# java.lang.OutOfMemoryError: Java heap space
# -XX:OnOutOfMemoryError="kill -9 %p"
# Executing /bin/sh -c "kill -9 12727"...

Before the crash, "kylin.log" shows the following lines. It seems Kylin keeps trying to reconnect to ZooKeeper. What is the reason for Kylin to communicate with ZK?

I also see the line "System free memory less than 100 MB."

---- kylin.log ----

2016-09-28 06:50:02,495 ERROR [Curator-Framework-0] curator.ConnectionState:200 : Connection timed out for connection string (hdp-jz5001.hadoop.local:2181,hdp-jz5002.hadoop.local:2181,hdp-jz5003.hadoop.local:2181) and timeout (15000) / elapsed (28428)
org.apache.curator.CuratorConnectionLossException: KeeperErrorCode = ConnectionLoss
    at org.apache.curator.ConnectionState.checkTimeouts(ConnectionState.java:197)
    at org.apache.curator.ConnectionState.getZooKeeper(ConnectionState.java:87)
    at org.apache.curator.CuratorZookeeperClient.getZooKeeper(CuratorZookeeperClient.java:115)
    at org.apache.curator.framework.imps.CuratorFrameworkImpl.performBackgroundOperation(CuratorFrameworkImpl.java:806)
    at org.apache.curator.framework.imps.CuratorFrameworkImpl.backgroundOperationsLoop(CuratorFrameworkImpl.java:792)
    at org.apache.curator.framework.imps.CuratorFrameworkImpl.access$300(CuratorFrameworkImpl.java:62)
    at org.apache.curator.framework.imps.CuratorFrameworkImpl$4.call(CuratorFrameworkImpl.java:257)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
2016-09-28 06:50:02,495 INFO [Thread-10-SendThread(hdp-jz5001.hadoop.local:2181)] zookeeper.ClientCnxn:1279 : Session establishment complete on server hdp-jz5001.hadoop.local/100.78.7.155:2181, sessionid = 0x156d401adb1701a, negotiated timeout = 40000
2016-09-28 06:50:02,495 INFO [localhost-startStop-1-SendThread(hdp-jz5003.hadoop.local:2181)] zookeeper.ClientCnxn:1019 : Opening socket connection to server hdp-jz5003.hadoop.local/100.78.8.153:2181. Will not attempt to authenticate using SASL (unknown error)
2016-09-28 06:50:02,495 ERROR [Curator-Framework-0] curator.ConnectionState:200 : Connection timed out for connection string (hdp-jz5001.hadoop.local:2181,hdp-jz5002.hadoop.local:2181,hdp-jz5003.hadoop.local:2181) and timeout (15000) / elapsed (28429)
org.apache.curator.CuratorConnectionLossException: KeeperErrorCode = ConnectionLoss
    at org.apache.curator.ConnectionState.checkTimeouts(ConnectionState.java:197)
    at org.apache.curator.ConnectionState.getZooKeeper(ConnectionState.java:87)
    at org.apache.curator.CuratorZookeeperClient.getZooKeeper(CuratorZookeeperClient.java:115)
    at org.apache.curator.framework.imps.CuratorFrameworkImpl.performBackgroundOperation(CuratorFrameworkImpl.java:806)
    at org.apache.curator.framework.imps.CuratorFrameworkImpl.doSyncForSuspendedConnection(CuratorFrameworkImpl.java:681)
    at org.apache.curator.framework.imps.CuratorFrameworkImpl.access$700(CuratorFrameworkImpl.java:62)
    at org.apache.curator.framework.imps.CuratorFrameworkImpl$7.retriesExhausted(CuratorFrameworkImpl.java:677)
    at org.apache.curator.framework.imps.CuratorFrameworkImpl.checkBackgroundRetry(CuratorFrameworkImpl.java:696)
    at org.apache.curator.framework.imps.CuratorFrameworkImpl.performBackgroundOperation(CuratorFrameworkImpl.java:826)
    at org.apache.curator.framework.imps.CuratorFrameworkImpl.backgroundOperationsLoop(CuratorFrameworkImpl.java:792)
    at org.apache.curator.framework.imps.CuratorFrameworkImpl.access$300(CuratorFrameworkImpl.java:62)
    at org.apache.curator.framework.imps.CuratorFrameworkImpl$4.call(CuratorFrameworkImpl.java:257)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
2016-09-28 06:50:02,495 INFO [localhost-startStop-1-SendThread(hdp-jz5003.hadoop.local:2181)] zookeeper.ClientCnxn:864 : Socket connection established to hdp-jz5003.hadoop.local/100.78.8.153:2181, initiating session
2016-09-28 06:50:15,060 INFO [localhost-startStop-1-SendThread(hdp-jz5003.hadoop.local:2181)] zookeeper.ClientCnxn:1140 : Client session timed out, have not heard from server in 12565ms for sessionid 0x356d401ac017143, closing socket connection and attempting reconnect
2016-09-28 06:50:02,495 INFO [Thread-10-EventThread] state.ConnectionStateManager:228 : State change: RECONNECTED
2016-09-28 06:50:31,040 INFO [Thread-10-SendThread(hdp-jz5001.hadoop.local:2181)] zookeeper.ClientCnxn:1140 : Client session timed out, have not heard from server in 28544ms for sessionid 0x156d401adb1701a, closing socket connection and attempting reconnect
2016-09-28 06:50:31,042 DEBUG [http-bio-7070-exec-7] service.AdminService:89 : Get Kylin Runtime Config
2016-09-28 06:50:31,043 DEBUG [http-bio-7070-exec-1] controller.UserController:64 : authentication.getPrincipal() is org.springframework.security.core.userdetails.User@3b40b2f: Username: ADMIN; Password: [PROTECTED]; Enabled: true; AccountNonExpired: true; credentialsNonExpired: true; AccountNonLocked: true; Granted Authorities: ROLE_ADMIN,ROLE_ANALYST,ROLE_MODELER
2016-09-28 06:50:43,799 INFO [localhost-startStop-1-SendThread(hdp-jz5002.hadoop.local:2181)] zookeeper.ClientCnxn:1019 : Opening socket connection to server hdp-jz5002.hadoop.local/100.78.8.20:2181. Will not attempt to authenticate using SASL (unknown error)
2016-09-28 06:50:43,799 INFO [Thread-10-EventThread] state.ConnectionStateManager:228 : State change: SUSPENDED
2016-09-28 06:50:59,925 INFO [BadQueryDetector] service.BadQueryDetector:151 : System free memory less than 100 MB. 0 queries running.
2016-09-28 06:50:59,926 INFO [localhost-startStop-1-SendThread(hdp-jz5002.hadoop.local:2181)] zookeeper.ClientCnxn:864 : Socket connection established to hdp-jz5002.hadoop.local/100.78.8.20:2181, initiating session
2016-09-28 06:51:28,723 INFO [localhost-startStop-1-SendThread(hdp-jz5002.hadoop.local:2181)] zookeeper.ClientCnxn:1140 : Client session timed out, have not heard from server in 28798ms for sessionid 0x356d401ac017143, closing socket connection and attempting reconnect
2016-09-28 06:51:41,129 INFO [pool-8-thread-10-SendThread(hdp-jz5001.hadoop.local:2181)] zookeeper.ClientCnxn:1142 : Unable to read additional data from server sessionid 0x356d401ac01714a, likely server has closed socket, closing socket connection and attempting reconnect
2016-09-28 06:51:53,474 INFO [Thread-10-SendThread(hdp-jz5003.hadoop.local:2181)] zookeeper.ClientCnxn:1019 : Opening socket connection to server hdp-jz5003.hadoop.local/100.78.8.153:2181. Will not attempt to authenticate using SASL (unknown error)
2016-09-28 06:51:12,316 INFO [pool-8-thread-10-SendThread(hdp-jz5003.hadoop.local:2181)] zookeeper.ClientCnxn:1140 : Client session timed out, have not heard from server in 28517ms for sessionid 0x256d401adbf6f77, closing socket connection and attempting reconnect
2016-09-28 06:54:29,304 INFO [localhost-startStop-1-SendThread(hdp-jz5001.hadoop.local:2181)] zookeeper.ClientCnxn:1019 : Opening socket connection to server hdp-jz5001.hadoop.local/100.78.7.155:2181. Will not attempt to authenticate using SASL (unknown error)
2016-09-28 06:52:05,570 INFO [BadQueryDetector] service.BadQueryDetector:151 : System free memory less than 100 MB. 0 queries running.
2016-09-28 06:56:29,665 ERROR [Curator-Framework-0] imps.CuratorFrameworkImpl:537 : Background operation retry gave up
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
    at org.apache.curator.framework.imps.CuratorFrameworkImpl.checkBackgroundRetry(CuratorFrameworkImpl.java:708)
    at org.apache.curator.framework.imps.CuratorFrameworkImpl.performBackgroundOperation(CuratorFrameworkImpl.java:826)
    at org.apache.curator.framework.imps.CuratorFrameworkImpl.backgroundOperationsLoop(CuratorFrameworkImpl.java:792)
    at org.apache.curator.framework.imps.CuratorFrameworkImpl.access$300(CuratorFrameworkImpl.java:62)
    at org.apache.curator.framework.imps.CuratorFrameworkImpl$4.call(CuratorFrameworkImpl.java:257)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
2016-09-28 06:57:31,275 INFO [BadQueryDetector] service.BadQueryDetector:151 : System free memory less than 100 MB. 0 queries running.
2016-09-28 06:56:29,665 INFO [pool-8-thread-10-SendThread(hdp-jz5001.hadoop.local:2181)] zookeeper.ClientCnxn:1019 : Opening socket connection to server hdp-jz5001.hadoop.local/100.78.7.155:2181. Will not attempt to authenticate using SASL (unknown error)

#
# java.lang.OutOfMemoryError: Java heap space
# -XX:OnOutOfMemoryError="kill -9 %p"
# Executing /bin/sh -c "kill -9 12727"...

--
Umanga
http://jp.linkedin.com/in/umanga
http://umanga.ifreepages.com
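[Editor's note] For anyone landing on this thread with the same "Too high cardinality" failure: the per-column encoding Li Yang refers to is chosen in the cube descriptor's rowkey section. A hypothetical fragment follows, using the column name from the thread; the `fixed_length:24` value is an assumption (size it for the longest expected value), and the `"encoding"` field assumes a Kylin version of the 1.5.x era or later whose rowkey columns accept it:

```json
{
  "rowkey": {
    "rowkey_columns": [
      { "column": "EASY_ID", "encoding": "fixed_length:24" }
    ]
  }
}
```

With `fixed_length` (or `integer` for numeric IDs), no dictionary is built for the column, so the "Build Dimension Dictionary" step no longer has to hold all ~96 million distinct values in memory.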
