Hi zhixin,
   As I remember, if you set a "shard by" column on the cube design page, 
Kylin will use that column as the "distribute by" condition, rather than the 
first three fields of the rowkey.
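
To make the rule above concrete, here is a rough Python sketch of how the DISTRIBUTE BY columns could be chosen. This is not Kylin's actual code: the function name `build_redistribute_sql` and its parameters are hypothetical, and the statement shape is simply the one quoted later in this thread.

```python
from typing import Optional, Sequence


def build_redistribute_sql(table: str,
                           rowkey_columns: Sequence[str],
                           shard_by: Optional[str] = None) -> str:
    """Build a Hive statement that redistributes `table` onto itself.

    Hypothetical helper for illustration only, not a Kylin API.
    """
    if shard_by:
        # A configured "shard by" column takes precedence.
        distribute_columns = [shard_by]
    else:
        # Otherwise fall back to the first three rowkey columns.
        distribute_columns = list(rowkey_columns[:3])
    if not distribute_columns:
        raise ValueError("no column available for DISTRIBUTE BY")
    column_list = ", ".join(distribute_columns)
    return (f"INSERT OVERWRITE TABLE {table} "
            f"SELECT * FROM {table} "
            f"DISTRIBUTE BY {column_list}")


rowkey = ["Field1", "Field2", "Field3", "Field4"]
default_sql = build_redistribute_sql("table_intermediate", rowkey)
sharded_sql = build_redistribute_sql("table_intermediate", rowkey,
                                     shard_by="Field4")
```

With no shard column, `default_sql` distributes by Field1, Field2, Field3; with `shard_by="Field4"`, `sharded_sql` distributes by Field4 only.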




------------------ Original Message ------------------
From: "liuzhixin"<liuz...@163.com>;
Sent: Friday, November 2, 2018, 3:11
To: "dev"<dev@kylin.apache.org>;
Cc: "Chao Long"<wayn...@qq.com>; 
Subject: Re: Redistribute intermediate table default not by rand()



Hi Chao Long,

Thank you for the answer.
#
Step1: Create Intermediate Flat Hive Table
Step2: Redistribute intermediate table
#
Perhaps Kylin could add a random-valued column to the intermediate Hive table 
and, by default, shard on it in the next step.
At the same time, Kylin should support a custom shard column (which it 
already provides).
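
A toy Python simulation of the difference between the two approaches, under a loudly stated assumption: the failure discussed in this thread is modelled as one map task being retried, with reducer 0 reading the first attempt and reducer 1 reading the retry. All function names are hypothetical; this is a sketch of the idea, not of Hive or Kylin internals.

```python
import random
import zlib


def partition_by_rand(rows, num_reducers, rng):
    # DISTRIBUTE BY RAND(): a fresh random draw per row on every attempt.
    return [rng.randrange(num_reducers) for _ in rows]


def partition_by_column(values, num_reducers):
    # DISTRIBUTE BY a stored column: the target depends only on the
    # stored value, so every attempt produces the same assignment.
    return [zlib.crc32(str(v).encode()) % num_reducers for v in values]


def rows_received(rows, first_attempt, second_attempt):
    # Assumed failure model: reducer 0 reads the first map attempt,
    # reducer 1 reads the retried attempt.
    got = [r for r, p in zip(rows, first_attempt) if p == 0]
    got += [r for r, p in zip(rows, second_attempt) if p == 1]
    return sorted(got)


rows = list(range(1000))

# rand() is re-evaluated on the retry, so the two attempts disagree.
rand_result = rows_received(
    rows,
    partition_by_rand(rows, 2, random.Random(1)),
    partition_by_rand(rows, 2, random.Random(2)),
)

# A random value written into the table once is stable across attempts.
column_rng = random.Random(7)
rand_column = [column_rng.random() for _ in rows]
stored_result = rows_received(
    rows,
    partition_by_column(rand_column, 2),
    partition_by_column(rand_column, 2),
)
```

In this model `stored_result` contains every row exactly once, while `rand_result` drops some rows and duplicates others.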

Best Wishes.

> On November 2, 2018, at 1:38, Chao Long <wayn...@qq.com> wrote:
> 
> Hi zhixin,
> The data may become incorrect if "distribute by rand()" is used; see:
> https://issues.apache.org/jira/browse/KYLIN-3388
> 
> 
> 
> 
> ------------------ Original Message ------------------
> From: "liuzhixin"<liuz...@163.com>;
> Sent: Friday, November 2, 2018, 12:53
> To: "dev"<dev@kylin.apache.org>;
> Cc: "ShaoFeng Shi"<shaofeng...@apache.org>; 
> Subject: Re: Redistribute intermediate table default not by rand()
> 
> 
> 
> Hi kylin team:
> 
> Step: Redistribute intermediate table
> #
> [Chinese text lost to encoding; the surviving fragments mention 
> DISTRIBUTE BY and DISTRIBUTE BY RAND().]
> 
> Best Regards,
> 
>> On November 2, 2018, at 12:03, liuzhixin <liuz...@163.com> wrote:
>> 
>> Hi kylin team:
>> 
>> Version: Kylin2.5-hadoop3.1 for hdp3.0
>> #
>> Step: Redistribute intermediate table
>> #
>> The generated DISTRIBUTE BY statement is:
>> INSERT OVERWRITE TABLE table_intermediate SELECT * FROM table_intermediate 
>> DISTRIBUTE BY Field1, Field2, Field3;
>> #
>> It is not DISTRIBUTE BY RAND().
>> #
>> Is DISTRIBUTE BY Field1, Field2, Field3 the default? How can I make it 
>> DISTRIBUTE BY RAND()?
>> 
>> Best wishes.
