Thanks. But this does not work on a streaming cube.

I read some of the code and found that in class *StreamingCubeBuilder* the dictionary map is built by *DictionaryGenerator.buildDictionary()* instead of *DictionaryManager.buildDictionary()*. Does this mean that streaming cubes do not support the global dictionary?

I added USERID to the dimensions, and then the cube was built successfully. But I think the result will be incorrect if I calculate count distinct across different segments. Is that right?

Tony

On Sat, Sep 24, 2016 at 10:29 PM, ShaoFeng Shi <[email protected]> wrote:

> Hi Tony,
>
> The error occurred when building a bitmap counter (for distinct
> count); from your cube descriptor, it seems no global dictionary
> is specified for the user id column. Please check this blog:
> https://kylin.apache.org/blog/2016/08/01/count-distinct-in-kylin/
>
> 2016-09-22 10:49 GMT+08:00 Tony Lee <[email protected]>:
>
>> Thanks, ShaoFeng Shi. That is the reason.
>>
>> But unfortunately, I have a new problem about count distinct (precisely).
>>
>> I added a streaming table on version 1.5.4 with my own json, which is
>> like this
>> {
>>   "logTimestamp":1474456891127,
>>   "datetime":"2016-09-21 19:21:31",
>>   "uploadTime":"20160921192023",
>>   "userId":"f2d28cbf9e21340a49e97063486db1f5",
>>   "accountId":"84108490",
>>   "otherfield":"...."
>> }
>>
>> *The error message while building the cube is*
>>
>> 2016-09-22 10:01:40,731 ERROR [main StreamingCLI:103]: error start streaming
>> java.lang.RuntimeException: error build cube from StreamingBatch
>>     at org.apache.kylin.engine.streaming.cube.StreamingCubeBuilder.build(StreamingCubeBuilder.java:105)
>>     at org.apache.kylin.engine.streaming.OneOffStreamingBuilder$1.run(OneOffStreamingBuilder.java:79)
>>     at org.apache.kylin.engine.streaming.cli.StreamingCLI.startOneOffCubeStreaming(StreamingCLI.java:123)
>>     at org.apache.kylin.engine.streaming.cli.StreamingCLI.main(StreamingCLI.java:97)
>> Caused by: java.lang.NullPointerException
>>     at org.apache.kylin.measure.bitmap.BitmapMeasureType$1.valueOf(BitmapMeasureType.java:100)
>>     at org.apache.kylin.measure.bitmap.BitmapMeasureType$1.valueOf(BitmapMeasureType.java:89)
>>     at org.apache.kylin.cube.inmemcubing.InMemCubeBuilderInputConverter.buildValueOf(InMemCubeBuilderInputConverter.java:122)
>>     at org.apache.kylin.cube.inmemcubing.InMemCubeBuilderInputConverter.buildValue(InMemCubeBuilderInputConverter.java:94)
>>     at org.apache.kylin.cube.inmemcubing.InMemCubeBuilderInputConverter.convert(InMemCubeBuilderInputConverter.java:70)
>>     at org.apache.kylin.cube.inmemcubing.InMemCubeBuilder$InputConverter$1.next(InMemCubeBuilder.java:542)
>>     at org.apache.kylin.cube.inmemcubing.InMemCubeBuilder$InputConverter$1.next(InMemCubeBuilder.java:523)
>>     at org.apache.kylin.gridtable.GTAggregateScanner.iterator(GTAggregateScanner.java:139)
>>     at org.apache.kylin.cube.inmemcubing.InMemCubeBuilder.createBaseCuboid(InMemCubeBuilder.java:339)
>>     at org.apache.kylin.cube.inmemcubing.InMemCubeBuilder.build(InMemCubeBuilder.java:166)
>>     at org.apache.kylin.cube.inmemcubing.InMemCubeBuilder.build(InMemCubeBuilder.java:135)
>>     at org.apache.kylin.cube.inmemcubing.InMemCubeBuilder.build(InMemCubeBuilder.java:122)
>>     at org.apache.kylin.cube.inmemcubing.AbstractInMemCubeBuilder$1.run(AbstractInMemCubeBuilder.java:80)
>>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>>     at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>     at java.lang.Thread.run(Thread.java:745)
>>
>> *and the cube json is*
>> {
>>   "uuid": "db91bcea-b33f-48af-a2f5-6014b14031f4",
>>   "last_modified": 1474511879506,
>>   "version": "1.5.4",
>>   "name": "hot_play_c",
>>   "model_name": "hot_play_cube",
>>   "description": "",
>>   "null_string": null,
>>   "dimensions": [
>>     {
>>       "name": "DEFAULT.HOT_PLAY.HOUR_START",
>>       "table": "DEFAULT.HOT_PLAY",
>>       "column": "HOUR_START",
>>       "derived": null
>>     },
>>     {
>>       "name": "DEFAULT.HOT_PLAY.MINUTE_START",
>>       "table": "DEFAULT.HOT_PLAY",
>>       "column": "MINUTE_START",
>>       "derived": null
>>     }
>>   ],
>>   "measures": [
>>     {
>>       "name": "_COUNT_",
>>       "function": {
>>         "expression": "COUNT",
>>         "parameter": {
>>           "type": "constant",
>>           "value": "1",
>>           "next_parameter": null
>>         },
>>         "returntype": "bigint"
>>       },
>>       "dependent_measure_ref": null
>>     },
>>     {
>>       "name": "COUNT_DISTINCT_USER",
>>       "function": {
>>         "expression": "COUNT_DISTINCT",
>>         "parameter": {
>>           "type": "column",
>>           "value": "USERID",
>>           "next_parameter": null
>>         },
>>         "returntype": "bitmap"
>>       },
>>       "dependent_measure_ref": null
>>     }
>>   ],
>>   "dictionaries": [],
>>   "rowkey": {
>>     "rowkey_columns": [
>>       {
>>         "column": "HOUR_START",
>>         "encoding": "time",
>>         "isShardBy": false
>>       },
>>       {
>>         "column": "MINUTE_START",
>>         "encoding": "time",
>>         "isShardBy": false
>>       }
>>     ]
>>   },
>>   "hbase_mapping": {
>>     "column_family": [
>>       {
>>         "name": "F1",
>>         "columns": [
>>           {
>>             "qualifier": "M",
>>             "measure_refs": [
>>               "_COUNT_"
>>             ]
>>           }
>>         ]
>>       },
>>       {
>>         "name": "F2",
>>         "columns": [
>>           {
>>             "qualifier": "M",
>>             "measure_refs": [
>>               "COUNT_DISTINCT_USER"
>>             ]
>>           }
>>         ]
>>       }
>>     ]
>>   },
>>   "aggregation_groups": [
>>     {
>>       "includes": [
>>         "HOUR_START",
>>         "MINUTE_START"
>>       ],
>>       "select_rule": {
>>         "hierarchy_dims": [],
>>         "mandatory_dims": [],
>>         "joint_dims": []
>>       }
>>     }
>>   ],
>>   "signature": "QXddyWCVVCYQcozxd4Zh2w==",
>>   "notify_list": [],
>>   "status_need_notify": [
>>     "ERROR",
>>     "DISCARDED",
>>     "SUCCEED"
>>   ],
>>   "partition_date_start": 0,
>>   "partition_date_end": 3153600000000,
>>   "auto_merge_time_ranges": [
>>     604800000,
>>     2419200000
>>   ],
>>   "retention_range": 0,
>>   "engine_type": 2,
>>   "storage_type": 2,
>>   "override_kylin_properties": {}
>> }
>>
>> *There is no error after I change the returntype to hllc(16).*
>>
>> *I have struggled for several days. Any hints about this?*
>>
>> On Wed, Sep 21, 2016 at 10:47 PM, ShaoFeng Shi <[email protected]> wrote:
>>
>>> Hi Tony,
>>>
>>> It seems your cube isn't partitioned (no partition date column
>>> specified); please check or provide the cube JSON.
>>>
>>> 2016-09-21 0:30 GMT+08:00 Alberto Ramón <[email protected]>:
>>>
>>>> I don't know, but can you check this change: KYLIN-1744
>>>> <https://issues.apache.org/jira/browse/KYLIN-1744> in V1.3?
>>>>
>>>> 2016-09-20 14:50 GMT+02:00 Tony Lee <[email protected]>:
>>>>
>>>>> Hi,
>>>>>
>>>>> I was building a cube from a stream as the document
>>>>> (http://kylin.apache.org/docs15/tutorial/cube_streaming.html) says.
>>>>>
>>>>> I was using 1.5.3 and I encountered this error. Same error on 1.5.4.
>>>>> Everything is fine on 1.5.2.1.
>>>>>
>>>>> Any idea how to solve this?
>>>>>
>>>>> 2016-09-20 20:31:51,520 INFO [main KafkaStreamingInput:129]: finish to get streaming batch, total message count:30
>>>>> 2016-09-20 20:31:51,532 DEBUG [main CubeManager:855]: Reloaded new cube: STREAMING_CUBE with reference beingCUBE[name=STREAMING_CUBE] having 1 segments:KYLIN_2822I1W3CX
>>>>> 2016-09-20 20:31:51,536 INFO [main CubeManager:314]: Updating cube instance 'STREAMING_CUBE'
>>>>> 2016-09-20 20:31:51,538 WARN [main StreamingCLI:127]: invalid args:streaming start STREAMING_CUBE 1474374540000_1474374600000 -start 1474374540000 -end 1474374600000 -cube STREAMING_CUBE
>>>>> 2016-09-20 20:31:51,539 ERROR [main StreamingCLI:103]: error start streaming
>>>>> java.lang.IllegalStateException: Segments overlap: STREAMING_CUBE[FULL_BUILD] and STREAMING_CUBE[FULL_BUILD]
>>>>>     at org.apache.kylin.cube.CubeValidator.validate(CubeValidator.java:85)
>>>>>     at org.apache.kylin.cube.CubeManager.updateCubeWithRetry(CubeManager.java:358)
>>>>>     at org.apache.kylin.cube.CubeManager.updateCube(CubeManager.java:301)
>>>>>     at org.apache.kylin.cube.CubeManager.appendSegment(CubeManager.java:441)
>>>>>     at org.apache.kylin.engine.streaming.cube.StreamingCubeBuilder.createBuildable(StreamingCubeBuilder.java:118)
>>>>>     at org.apache.kylin.engine.streaming.OneOffStreamingBuilder$1.run(OneOffStreamingBuilder.java:76)
>>>>>     at org.apache.kylin.engine.streaming.cli.StreamingCLI.startOneOffCubeStreaming(StreamingCLI.java:123)
>>>>>     at org.apache.kylin.engine.streaming.cli.StreamingCLI.main(StreamingCLI.java:97)
>>>>> 2016-09-20 20:31:51,543 INFO [Thread-0 ConnectionManager$HConnectionImplementation:1678]: Closing zookeeper sessionid=0x35708fbc2740013
>>>>> 2016-09-20 20:31:51,549 INFO [Thread-0 ZooKeeper:684]: Session: 0x35708fbc2740013 closed
>>>>> 2016-09-20 20:31:51,549 INFO [main-EventThread ClientCnxn:512]: EventThread shut down
>>>>>
>>>
>>> --
>>> Best regards,
>>>
>>> Shaofeng Shi 史少锋
>>>
>
> --
> Best regards,
>
> Shaofeng Shi 史少锋
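
P.S. To make the concern at the top of this message concrete, here is a toy sketch of why I suspect per-segment dictionaries would break a precise count distinct across segments. It is not Kylin code: the class and method names (SegmentDictionaryDemo, encode, buildBitmap, perSegmentCount, globalCount) are made up for illustration, and java.util.BitSet only stands in for the real bitmap counter. The assumption is that each segment assigns integer ids to user ids independently, so the same id can mean different users in different segments.

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class SegmentDictionaryDemo {

    // Hypothetical dictionary: gives each unseen value the next free integer id.
    static int encode(Map<String, Integer> dict, String value) {
        Integer id = dict.get(value);
        if (id == null) {
            id = dict.size();
            dict.put(value, id);
        }
        return id;
    }

    // Build a bitmap of the dictionary ids of all users in one segment.
    static BitSet buildBitmap(Map<String, Integer> dict, List<String> users) {
        BitSet bitmap = new BitSet();
        for (String user : users) {
            bitmap.set(encode(dict, user));
        }
        return bitmap;
    }

    // Merge two segment bitmaps with OR and count the set bits.
    static int mergedCount(BitSet a, BitSet b) {
        BitSet merged = (BitSet) a.clone();
        merged.or(b);
        return merged.cardinality();
    }

    // Each segment gets its own fresh dictionary, so ids collide across segments.
    static int perSegmentCount(List<String> seg1, List<String> seg2) {
        BitSet b1 = buildBitmap(new HashMap<String, Integer>(), seg1);
        BitSet b2 = buildBitmap(new HashMap<String, Integer>(), seg2);
        return mergedCount(b1, b2);
    }

    // Both segments share one dictionary, so ids are consistent across segments.
    static int globalCount(List<String> seg1, List<String> seg2) {
        Map<String, Integer> global = new HashMap<String, Integer>();
        BitSet b1 = buildBitmap(global, seg1);
        BitSet b2 = buildBitmap(global, seg2);
        return mergedCount(b1, b2);
    }

    public static void main(String[] args) {
        List<String> seg1 = Arrays.asList("alice", "bob");
        List<String> seg2 = Arrays.asList("carol", "dave");
        // Four distinct users in total.
        System.out.println("per-segment dictionaries: " + perSegmentCount(seg1, seg2)); // prints 2
        System.out.println("shared dictionary:        " + globalCount(seg1, seg2));     // prints 4
    }
}
```

With separate dictionaries, "alice" and "carol" both get id 0 and "bob" and "dave" both get id 1, so the merged bitmap undercounts. Whether the streaming build really behaves this way is exactly what I am asking above.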
