OK, I'll give it a shot. On the other hand, is it possible to skip the step that creates the flat table when the source table is almost the same?
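
To make the idea concrete, here is a rough sketch of the two statements I am comparing. The helper functions and the table and column names are made up for illustration; this is not the SQL Kylin actually generates, and I have not verified that Kylin would accept a view in place of the intermediate table.

```python
def flat_table_sql(intermediate, source, columns, distribute_by=None):
    """Build a HiveQL statement that copies the source rows into the
    intermediate (flat) table, optionally redistributing by one column.
    All names passed in are hypothetical placeholders."""
    cols = ", ".join(columns)
    sql = f"INSERT OVERWRITE TABLE {intermediate} SELECT {cols} FROM {source}"
    if distribute_by:
        # This clause forces a shuffle of every row by the given column.
        sql += f" DISTRIBUTE BY {distribute_by}"
    return sql


def view_sql(intermediate, source, columns):
    """Build a HiveQL statement that exposes the source under the
    intermediate name as a view, so no data is copied."""
    cols = ", ".join(columns)
    return f"CREATE VIEW {intermediate} AS SELECT {cols} FROM {source}"


if __name__ == "__main__":
    cols = ["dt", "user_id", "amount"]  # hypothetical fact table columns
    print(flat_table_sql("kylin_intermediate_demo", "fact_table", cols, "dt"))
    print(flat_table_sql("kylin_intermediate_demo", "fact_table", cols))
    print(view_sql("kylin_intermediate_demo", "fact_table", cols))
```

The first statement is the slow shape (with DISTRIBUTE BY), the second is what I ran by hand, and the third is the view I am asking about.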

Sent from Meizu PRO6

-------- Original message --------
From: ShaoFeng Shi <[email protected]>
Date: Tue, Aug 23, 21:56
To: user <[email protected]>
Subject: Re: why distribute by partition column while creating flat hive table?

In 1.5.3, Kylin redistributes the source records by the "shard by" column (if the user selects such a column); "shard by" is defined on the cube's "Advanced setting" page. The "shard by" column should be a high-cardinality column. In your case, I guess you set "shard by" = true on the partition column by mistake; please set it to false and then resubmit the build request.

2016-08-23 18:34 GMT+08:00 赵天烁 <[email protected]>:

I have a table with a huge data increase every day, at the billion level. When I build a cube on that table, it gets stuck at the step that creates the flat Hive table... forever. I checked the MR job and found that the SQL for this step ends with "DISTRIBUTE BY ${partition date column}". I tried running the same SQL manually with the "DISTRIBUTE BY" removed, and everything finished within 10 minutes.

As far as I know, creating a flat table is helpful when I have a star schema, but all I have is the fact table. So why bother creating a table with the same structure and even the same data? The only difference is the table name...

So I wonder: is it possible to just create a view with the intermediate table name that Kylin needs when I haven't defined any lookup table? That would eliminate this long-running task, which seems to achieve nothing.

________________________________
赵天烁 Kevin Zhao
[email protected]
珠海市魅族科技有限公司
MEIZU Technology Co., Ltd.
广东省珠海市科技创新海岸魅族科技楼
MEIZU Tech Bldg., Technology & Innovation Coast
Zhuhai, 519085, Guangdong, China
meizu.com <http://meizu.com>

--
Best regards,
Shaofeng Shi
