Tong, could you please provide some more details, such as the Kylin/Hadoop versions and the model/cube description? That would help us analyze the issue.

2017-01-03 19:59 GMT+08:00 Billy Liu <[email protected]>:

> The default region.cut is 5, and the default hfile.size.gb is 2. What are
> your settings?
>
> 2017-01-03 19:33 GMT+08:00 Billy Liu <[email protected]>:
>
>> Thanks, Da Tong, for the careful code check.
>> But actually, both BatchCubingJobBuilder and BatchCubingJobBuilder2 will
>> call HBaseMRSteps.createCreateHTableStep. The CreateHTableJob step will
>> calculate the regions from the split parameter.
>>
>> 2017-01-03 16:25 GMT+08:00 Da Tong <[email protected]>:
>>
>>> Hi,
>>>
>>> We found that on Hadoop using mapred2 with YARN, the number of HFiles
>>> created by Kylin is always 1. After some investigation, we suspect that in
>>> engine-mr, BatchCubingJobBuilder2 works differently from
>>> BatchCubingJobBuilder. BatchCubingJobBuilder invokes
>>> HBaseMRSteps.addSaveCuboidToHTableSteps, which includes calculating the
>>> region size. But BatchCubingJobBuilder2 invokes
>>> HBaseMRSteps.createConvertCuboidToHfileStep directly.
>>> I am not sure whether this difference is by design. But what we see
>>> is that we got a single 16 GB HFile in a single region even though we set
>>> kylin.hbase.region.cut and kylin.hbase.hfile.size.gb.
>>>
>>> --
>>> TONG, Da / 佟达
>>>
>>
>>
>

--
Best regards,

Shaofeng Shi 史少锋
