Re: why distribute by partition column while creating flat hive table?

赵天烁 Tue, 23 Aug 2016 08:03:07 -0700

OK, I'll give it a shot. On the other hand, is it possible to skip the step 
that creates the flat table if the source table is almost the same?




Sent from Meizu PRO6



-------- Original message --------
From: ShaoFeng Shi <[email protected]>
Date: Tue, Aug 23, 21:56
To: user <[email protected]>
Subject: Re: why distribute by partition column while creating flat hive table?

In 1.5.3, Kylin redistributes the source records by the "shard by" column (if 
the user selects such a column); "shard by" is defined on the cube's "Advanced 
setting" page. The "shard by" column should be a high-cardinality column. In 
your case, I guess you set the partition column's "shard by" to true by mistake; 
please set it to false and then resubmit the build request.
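
To illustrate why a low-cardinality column is a poor choice for "shard by", here is a minimal Python sketch (not Kylin or Hive code). It assumes rows are routed to reducers by hashing the DISTRIBUTE BY key modulo the number of reducers; zlib.crc32 stands in for Hive's real hash function, and the user_id column, row count, and reducer count are hypothetical values chosen for the example.

```python
from collections import Counter
import zlib


def reducer_loads(keys, num_reducers):
    """Count rows per reducer when each row goes to hash(key) % num_reducers."""
    loads = Counter()
    for key in keys:
        # crc32 is only a stand-in for Hive's hash function (assumption).
        bucket = zlib.crc32(str(key).encode("utf-8")) % num_reducers
        loads[bucket] += 1
    return loads


NUM_REDUCERS = 10   # hypothetical reducer count
NUM_ROWS = 100_000  # hypothetical number of rows in one daily build

# Low cardinality: every row in a daily build carries the same partition date,
# so every row hashes to the same reducer.
by_date = reducer_loads(["2016-08-23"] * NUM_ROWS, NUM_REDUCERS)

# High cardinality: a hypothetical user_id column with a distinct value per row.
by_user = reducer_loads(range(NUM_ROWS), NUM_REDUCERS)

print("by date:    reducers used =", len(by_date),
      "max rows on one reducer =", max(by_date.values()))
print("by user_id: reducers used =", len(by_user),
      "max rows on one reducer =", max(by_user.values()))
```

In this sketch, distributing by the date column sends all rows to a single reducer, while the high-cardinality key spreads them across the reducers. That is consistent with the build appearing to hang at the flat-table step when the partition date column is marked as "shard by".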

2016-08-23 18:34 GMT+08:00 赵天烁 <[email protected]>:
I have a table whose data grows hugely every day, at the billion level. When I 
build a cube on that table, it gets stuck at the "create flat Hive table" step... forever.
I checked the MR job and found that the SQL for this step ends with 
"DISTRIBUTE BY  ${partition date column}".
I tried running the same SQL manually with the "DISTRIBUTE BY" removed, and 
everything finished within 10 minutes.
As far as I know, creating a flat table is helpful when I have a star 
schema, but all I have is the fact table. So why bother creating a table 
with the same structure and the same data? The only difference is the 
table name...
So is it possible to just create a view with the intermediate table name 
that Kylin needs when I haven't defined any lookup table? That would eliminate 
this long-running task, which seems to achieve nothing.

________________________________
赵天烁
Kevin Zhao
[email protected]

珠海市魅族科技有限公司
MEIZU Technology Co., Ltd.
广东省珠海市科技创新海岸魅族科技楼
MEIZU Tech Bldg., Technology & Innovation Coast
Zhuhai, 519085, Guangdong, China
meizu.com<http://meizu.com>



--
Best regards,

Shaofeng Shi
