Hi zhixin,

As far as I remember, if you set a "shard by" column on the cube design page, Kylin uses that column as the "distribute by" condition instead of the first three fields of the rowkey.
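To make the rule concrete, here is a minimal sketch of how the redistribute statement could be chosen, based only on the behavior described in this thread. It is not Kylin's source code: the function name `redistribute_statement` and its parameters are hypothetical, and the statement text follows the one quoted in the original message.

```python
from typing import Optional, Sequence


def redistribute_statement(table: str,
                           rowkey_columns: Sequence[str],
                           shard_by: Optional[str] = None) -> str:
    """Build the Hive statement for the redistribute step (illustrative only).

    Hypothetical helper: mirrors the behavior described in this thread,
    not Kylin's actual implementation.
    """
    if shard_by:
        # A "shard by" column was set on the cube design page.
        distribute_columns = [shard_by]
    else:
        # Fallback reported in this thread: the first three rowkey fields.
        distribute_columns = list(rowkey_columns[:3])
    return (
        f"INSERT OVERWRITE TABLE {table} SELECT * FROM {table} "
        f"DISTRIBUTE BY {', '.join(distribute_columns)};"
    )


if __name__ == "__main__":
    rowkey = ["Field1", "Field2", "Field3", "Field4"]
    # No shard-by column: distribute by the first three rowkey fields.
    print(redistribute_statement("table_intermediate", rowkey))
    # Shard-by column set: distribute by that column only.
    print(redistribute_statement("table_intermediate", rowkey, shard_by="Field4"))
```

So if the default distribution is skewed, picking a high-cardinality dimension as the "shard by" column is probably the supported way to change it.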

------------------ Original Message ------------------
From: "liuzhixin"<liuz...@163.com>;
Sent: November 2, 2018 (Friday) 3:11
To: "dev"<dev@kylin.apache.org>;
Cc: "Chao Long"<wayn...@qq.com>;
Subject: Re: Redistribute intermediate table default not by rand()

Hi Chao Long,

Thank you for the answer.

#
Step1: Create Intermediate Flat Hive Table
Step2: Redistribute intermediate table
#

Perhaps Kylin could add a random column to the intermediate Hive table and use it for the subsequent sharding by default.
At the same time, Kylin should support a custom column for sharding (this is already provided).

Best Wishes.

> On November 2, 2018, at 1:38, Chao Long <wayn...@qq.com> wrote:
>
> Hi zhixin,
> Data may become incorrect if "distribute by rand()" is used.
> https://issues.apache.org/jira/browse/KYLIN-3388
>
> ------------------ Original Message ------------------
> From: "liuzhixin"<liuz...@163.com>;
> Sent: November 2, 2018 (Friday) 12:53
> To: "dev"<dev@kylin.apache.org>;
> Cc: "ShaoFeng Shi"<shaofeng...@apache.org>;
> Subject: Re: Redistribute intermediate table default not by rand()
>
> Hi kylin team:
>
> Step: Redistribute intermediate table
> #
> [...] DISTRIBUTE BY [...] DISTRIBUTE BY RAND() [...]
> #
>
> Best Regards,
>
>> On November 2, 2018, at 12:03, liuzhixin <liuz...@163.com> wrote:
>>
>> Hi kylin team:
>>
>> Version: Kylin2.5-hadoop3.1 for hdp3.0
>> #
>> Step: Redistribute intermediate table
>> #
>> The DISTRIBUTE BY statement is:
>> INSERT OVERWRITE TABLE table_intermediate SELECT * FROM table_intermediate
>> DISTRIBUTE BY Field1, Field2, Field3;
>> #
>> Not DISTRIBUTE BY RAND()
>> #
>> Is DISTRIBUTE BY Field1, Field2, Field3 the default? How can I use DISTRIBUTE BY
>> RAND()?
>>
>> Best wishes.