The cube is pretty optimal in terms of dimensions. > and 7 COUNT_DISTINCT expressions This is the heaviest workload I see. You may confirm the CPU usage of MR job, if the bottleneck is CPU, then room of improvement is limited by the hardware.
On Mon, Nov 28, 2016 at 7:57 PM, 仇同心 <[email protected]> wrote: > Thanks! > > > > *发件人:* Dong Li [mailto:[email protected]] > *发送时间:* 2016年11月28日 18:01 > *收件人:* user > *主题:* Re: Cube optimization for help > > > > Hi Tongxin, > > > > Are these dimensions’ cardinality very very small? > > "PWDTYPE”, > > “IS_NEW_CARD”, > > "REQ_ORDER_YEWU_TYPE”, > > "ORDER_YEWU_TYPE”, > > > > If yes, try to put them in one joint aggr group. This may help to reduce > the expansion rate. > > > > Thanks, > > Dong Li > > > > Original Message > > *Sender:* 仇同心<[email protected]> > > *Recipient:* [email protected]<[email protected]>; > [email protected]<[email protected]> > > *Date:* Monday, Nov 28, 2016 12:52 > > *Subject:* Cube optimization for help > > > > Hi,all > > There is a cube optimization for help. > > Cuhe has 15 dimensions, including 14 normal dimensions and 1 derived > dimension, and the cardinality of all dimensions is not high;And this cube > also has 10 measures, including 1 count expression,2 sum expressions and 7 > COUNT_DISTINCT expressions, and the cardinality of all measures’s column is > more than one hundred million, the Return Type of 7 COUNT_DISTINCT > expressions is hllc(14). > > > > I founded that the step of “Build N - Dimension Cuboid Data” cost most of > the time, I hope you can provide me some optimization plan! > > > > *The info about the segment: Start Time: 2016-11-10 00:00:00 End Time: > 2016-11-24 00:00:00* > > > > *Source Records: 201,027,127* > > *Cube Size:* *1.73 TB* > > *Expansion Rate:14025.91%* > > *The Duration of this job was 1,961.62 mins* > > *The detailed build steps of job :* > > 2016-11-25 11:15:24 GMT+8 > > #1 Step Name: Create Intermediate Flat Hive Table > > Duration: 1.57 mins > > > > 2016-11-25 11:16:58 GMT+8 > > #2 Step Name: Redistribute Flat Hive Table > > Duration: 3.32 mins > > > > 2016-11-25 11:20:17 GMT+8 > > #3 Step Name: Extract Fact Table Distinct Columns > > Data Size: 8.57 MB > > Duration: 19.14 mins > > > > 2016-11-25 11:39:26 GMT+8 > > #4 Step Name: Build Dimension Dictionary > > Duration: 0.03 mins > > 2016-11-25 11:39:27 GMT+8 > > #5 Step Name: Save Cuboid Statistics > > Duration: 0.01 mins > > > > 2016-11-25 11:39:28 GMT+8 > > #6 Step Name: Create HTable > > Duration: 0.07 mins > > 2016-11-25 11:39:32 GMT+8 > > #7 Step Name: Build Base Cuboid Data > > Data Size: 2.53 GB > > Duration: 14.59 mins > > > > 2016-11-25 11:54:08 GMT+8 > > #8 Step Name: Build N-Dimension Cuboid Data : 13-Dimension > > Data Size: 2.31 GB > > Duration: 31.27 mins > > > > 2016-11-25 12:25:24 GMT+8 > > #9 Step Name: Build N-Dimension Cuboid Data : 12-Dimension > > Data Size: 17.81 GB > > Duration: 430.54 mins > > > > 2016-11-25 19:35:56 GMT+8 > > #10 Step Name: Build N-Dimension Cuboid Data : 11-Dimension > > Data Size: 62.58 GB > > Duration: 273.30 mins > > > > 2016-11-26 00:09:15 GMT+8 > > #11 Step Name: Build N-Dimension Cuboid Data : 10-Dimension > > Data Size: 135.99 GB > > Duration: 138.05 mins > > > > 2016-11-26 02:27:18 GMT+8 > > #12 Step Name: Build N-Dimension Cuboid Data : 9-Dimension > > Data Size: 211.91 GB > > Duration: 313.53 mins > > > > 2016-11-26 07:40:49 GMT+8 > > #13 Step Name: Build N-Dimension Cuboid Data : 8-Dimension > > Data Size: 252.22 GB > > Duration: 278.18 mins > > > > 2016-11-26 12:19:01 GMT+8 > > #14 Step Name: Build N-Dimension Cuboid Data : 7-Dimension > > Data Size: 234.18 GB > > Duration: 168.91 mins > > > > 2016-11-26 15:07:55 GMT+8 > > #15 Step Name: Build N-Dimension Cuboid Data : 6-Dimension > > Data Size: 169.02 GB > > Duration: 96.18 mins > > > > 2016-11-26 16:44:06 GMT+8 > > #16 Step Name: Build N-Dimension Cuboid Data : 5-Dimension > > Data Size: 92.81 GB > > Duration: 50.24 mins > > > > 2016-11-26 17:34:20 GMT+8 > > #17 Step Name: Build N-Dimension Cuboid Data : 4-Dimension > > Data Size: 36.45 GB > > Duration: 16.21 mins > > > > 2016-11-26 17:50:33 GMT+8 > > #18 Step Name: Build N-Dimension Cuboid Data : 3-Dimension > > Data Size: 8.91 GB > > Duration: 7.75 mins > > > > 2016-11-26 17:58:18 GMT+8 > > #19 Step Name: Build N-Dimension Cuboid Data : 2-Dimension > > Data Size: 1.00 GB > > Duration: 3.08 mins > > > > 2016-11-26 18:01:23 GMT+8 > > #20 Step Name: Build N-Dimension Cuboid Data : 1-Dimension > > Data Size: 0.12 KB > > Duration: 0.57 mins > > > > 2016-11-26 18:01:57 GMT+8 > > #21 Step Name: Build Cube > > Duration: 0.00 mins > > 2016-11-26 18:01:57 GMT+8 > > #22 Step Name: Convert Cuboid Data to HFile > > Data Size: 1.73 TB > > Duration: 114.57 mins > > > > 2016-11-26 19:56:31 GMT+8 > > #23 Step Name: Load HFile to HBase Table > > Duration: 0.38 mins > > 2016-11-26 19:56:54 GMT+8 > > #24 Step Name: Update Cube Info > > Duration: 0.00 mins > > > > 2016-11-26 19:56:54 GMT+8 > > #25 Step Name: Garbage Collection > > Duration: 0.12 mins > > > > > > *The following is the json of the cube:* > > > > { > > "uuid": "ab54fdc8-253d-430e-a193-4c17bbfd0872", > > "last_modified": 1480039756561, > > "version": "1.6.0", > > "name": "cube_pay_syt_order_det_analysis_new", > > "model_name": "pay_syt_order_det_analysis", > > "description": "", > > "null_string": null, > > "dimensions": [ > > { > > "name": "DMT.DMT_KYLIN_PAY_SYT_ORDR_DET_I_D.ETL_DT", > > "table": "DMT.DMT_KYLIN_PAY_SYT_ORDR_DET_I_D", > > "column": "ETL_DT", > > "derived": null > > }, > > { > > "name": "DMT.DMT_KYLIN_PAY_SYT_ORDR_DET_I_D.PWDTYPE", > > "table": "DMT.DMT_KYLIN_PAY_SYT_ORDR_DET_I_D", > > "column": "PWDTYPE", > > "derived": null > > }, > > { > > "name": "DMT.DMT_KYLIN_PAY_SYT_ORDR_DET_I_D.REAL_NAME", > > "table": "DMT.DMT_KYLIN_PAY_SYT_ORDR_DET_I_D", > > "column": "REAL_NAME", > > "derived": null > > }, > > { > > "name": "DMT.DMT_KYLIN_PAY_SYT_ORDR_DET_I_D.IS_NEW_CARD", > > "table": "DMT.DMT_KYLIN_PAY_SYT_ORDR_DET_I_D", > > "column": "IS_NEW_CARD", > > "derived": null > > }, > > { > > "name": "DMT.DMT_KYLIN_PAY_SYT_ORDR_DET_I_D.REQ_ORDER_YEWU_TYPE", > > "table": "DMT.DMT_KYLIN_PAY_SYT_ORDR_DET_I_D", > > "column": "REQ_ORDER_YEWU_TYPE", > > "derived": null > > }, > > { > > "name": "DMT.DMT_KYLIN_PAY_SYT_ORDR_DET_I_D.ORDER_YEWU_TYPE", > > "table": "DMT.DMT_KYLIN_PAY_SYT_ORDR_DET_I_D", > > "column": "ORDER_YEWU_TYPE", > > "derived": null > > }, > > { > > "name": "DMT.DMT_KYLIN_PAY_SYT_ORDR_DET_I_D.ITEM_FIRST_CATE_NAME", > > "table": "DMT.DMT_KYLIN_PAY_SYT_ORDR_DET_I_D", > > "column": "ITEM_FIRST_CATE_NAME", > > "derived": null > > }, > > { > > "name": "DMT.DMT_KYLIN_PAY_SYT_ORDR_DET_I_D.ITEM_SECOND_CATE_NAME", > > "table": "DMT.DMT_KYLIN_PAY_SYT_ORDR_DET_I_D", > > "column": "ITEM_SECOND_CATE_NAME", > > "derived": null > > }, > > { > > "name": "DMT.DMT_KYLIN_PAY_SYT_ORDR_DET_I_D.ITEM_THIRD_CATE_NAME", > > "table": "DMT.DMT_KYLIN_PAY_SYT_ORDR_DET_I_D", > > "column": "ITEM_THIRD_CATE_NAME", > > "derived": null > > }, > > { > > "name": "DMT.DMT_KYLIN_PAY_SYT_ORDR_DET_I_D.ITEM_BRAND_NAME", > > "table": "DMT.DMT_KYLIN_PAY_SYT_ORDR_DET_I_D", > > "column": "ITEM_BRAND_NAME", > > "derived": null > > }, > > { > > "name": "DMT.DMT_KYLIN_PAY_SYT_ORDR_DET_I_D.COUNTY_NAME", > > "table": "DMT.DMT_KYLIN_PAY_SYT_ORDR_DET_I_D", > > "column": "COUNTY_NAME", > > "derived": null > > }, > > { > > "name": "DMT.DMT_KYLIN_PAY_SYT_ORDR_DET_I_D.CITY_NAME", > > "table": "DMT.DMT_KYLIN_PAY_SYT_ORDR_DET_I_D", > > "column": "CITY_NAME", > > "derived": null > > }, > > { > > "name": "DMT.DMT_KYLIN_PAY_SYT_ORDR_DET_I_D.PROVINCE_NAME", > > "table": "DMT.DMT_KYLIN_PAY_SYT_ORDR_DET_I_D", > > "column": "PROVINCE_NAME", > > "derived": null > > }, > > { > > "name": "DIM.DIM_DAY_DERIVED", > > "table": "DIM.DIM_DAY", > > "column": null, > > "derived": [ > > "DIM_DAY_NAME", > > "DIM_DAY_TXDATE" > > ] > > }, > > { > > "name": "DMT.DMT_KYLIN_PAY_SYT_ORDR_DET_I_D.PAY_TYPE", > > "table": "DMT.DMT_KYLIN_PAY_SYT_ORDR_DET_I_D", > > "column": "PAY_TYPE", > > "derived": null > > } > > ], > > "measures": [ > > { > > "name": "_COUNT_", > > "function": { > > "expression": "COUNT", > > "parameter": { > > "type": "constant", > > "value": "1", > > "next_parameter": null > > }, > > "returntype": "bigint" > > }, > > "dependent_measure_ref": null > > }, > > { > > "name": "应付金额", > > "function": { > > "expression": "SUM", > > "parameter": { > > "type": "column", > > "value": "AMOUNT", > > "next_parameter": null > > }, > > "returntype": "decimal(25,4)" > > }, > > "dependent_measure_ref": null > > }, > > { > > "name": "商品成功支付金额", > > "function": { > > "expression": "SUM", > > "parameter": { > > "type": "column", > > "value": "USER_ACTUAL_PAY_AMOUNT", > > "next_parameter": null > > }, > > "returntype": "decimal(25,4)" > > }, > > "dependent_measure_ref": null > > }, > > { > > "name": "商城APP请求单", > > "function": { > > "expression": "COUNT_DISTINCT", > > "parameter": { > > "type": "column", > > "value": "SERIAL_NUM", > > "next_parameter": null > > }, > > "returntype": "hllc(14)" > > }, > > "dependent_measure_ref": null > > }, > > { > > "name": "请求订单数", > > "function": { > > "expression": "COUNT_DISTINCT", > > "parameter": { > > "type": "column", > > "value": "REQ_OUTBIZNO", > > "next_parameter": null > > }, > > "returntype": "hllc(14)" > > }, > > "dependent_measure_ref": null > > }, > > { > > "name": "支付订单数", > > "function": { > > "expression": "COUNT_DISTINCT", > > "parameter": { > > "type": "column", > > "value": "OUTBIZNO", > > "next_parameter": null > > }, > > "returntype": "hllc(14)" > > }, > > "dependent_measure_ref": null > > }, > > { > > "name": "成功支付订单数", > > "function": { > > "expression": "COUNT_DISTINCT", > > "parameter": { > > "type": "column", > > "value": "SUC_OUTBIZNO", > > "next_parameter": null > > }, > > "returntype": "hllc(14)" > > }, > > "dependent_measure_ref": null > > }, > > { > > "name": "支付单数", > > "function": { > > "expression": "COUNT_DISTINCT", > > "parameter": { > > "type": "column", > > "value": "PAYID", > > "next_parameter": null > > }, > > "returntype": "hllc(15)" > > }, > > "dependent_measure_ref": null > > }, > > { > > "name": "成功支付单数", > > "function": { > > "expression": "COUNT_DISTINCT", > > "parameter": { > > "type": "column", > > "value": "SUC_PAYID", > > "next_parameter": null > > }, > > "returntype": "hllc(14)" > > }, > > "dependent_measure_ref": null > > }, > > { > > "name": "用户数", > > "function": { > > "expression": "COUNT_DISTINCT", > > "parameter": { > > "type": "column", > > "value": "PIN", > > "next_parameter": null > > }, > > "returntype": "hllc(14)" > > }, > > "dependent_measure_ref": null > > } > > ], > > "dictionaries": [], > > "rowkey": { > > "rowkey_columns": [ > > { > > "column": "ETL_DT", > > "encoding": "dict", > > "isShardBy": false > > }, > > { > > "column": "COUNTY_NAME", > > "encoding": "dict", > > "isShardBy": false > > }, > > { > > "column": "ITEM_BRAND_NAME", > > "encoding": "dict", > > "isShardBy": false > > }, > > { > > "column": "ITEM_THIRD_CATE_NAME", > > "encoding": "dict", > > "isShardBy": false > > }, > > { > > "column": "ITEM_SECOND_CATE_NAME", > > "encoding": "dict", > > "isShardBy": false > > }, > > { > > "column": "ITEM_FIRST_CATE_NAME", > > "encoding": "dict", > > "isShardBy": false > > }, > > { > > "column": "CITY_NAME", > > "encoding": "dict", > > "isShardBy": false > > }, > > { > > "column": "PROVINCE_NAME", > > "encoding": "dict", > > "isShardBy": false > > }, > > { > > "column": "REQ_ORDER_YEWU_TYPE", > > "encoding": "dict", > > "isShardBy": false > > }, > > { > > "column": "ORDER_YEWU_TYPE", > > "encoding": "dict", > > "isShardBy": false > > }, > > { > > "column": "PWDTYPE", > > "encoding": "dict", > > "isShardBy": false > > }, > > { > > "column": "IS_NEW_CARD", > > "encoding": "dict", > > "isShardBy": false > > }, > > { > > "column": "REAL_NAME", > > "encoding": "dict", > > "isShardBy": false > > }, > > { > > "column": "PAY_TYPE", > > "encoding": "dict", > > "isShardBy": false > > } > > ] > > }, > > "hbase_mapping": { > > "column_family": [ > > { > > "name": "F1", > > "columns": [ > > { > > "qualifier": "M", > > "measure_refs": [ > > "_COUNT_", > > "应付金额", > > "商品成功支付金额" > > ] > > } > > ] > > }, > > { > > "name": "F2", > > "columns": [ > > { > > "qualifier": "M", > > "measure_refs": [ > > "商城APP请求单", > > "请求订单数", > > "支付订单数", > > "成功支付订单数", > > "支付单数", > > "成功支付单数", > > "用户数" > > ] > > } > > ] > > } > > ] > > }, > > "aggregation_groups": [ > > { > > "includes": [ > > "ETL_DT", > > "PWDTYPE", > > "REAL_NAME", > > "IS_NEW_CARD", > > "REQ_ORDER_YEWU_TYPE", > > "ORDER_YEWU_TYPE", > > "ITEM_FIRST_CATE_NAME", > > "ITEM_SECOND_CATE_NAME", > > "ITEM_THIRD_CATE_NAME", > > "ITEM_BRAND_NAME", > > "COUNTY_NAME", > > "CITY_NAME", > > "PROVINCE_NAME" > > ], > > "select_rule": { > > "hierarchy_dims": [ > > [ > > "COUNTY_NAME", > > "CITY_NAME", > > "PROVINCE_NAME" > > ], > > [ > > "ITEM_FIRST_CATE_NAME", > > "ITEM_SECOND_CATE_NAME", > > "ITEM_THIRD_CATE_NAME" > > ] > > ], > > "mandatory_dims": [ > > "ETL_DT" > > ], > > "joint_dims": [] > > } > > } > > ], > > "signature": "Ukf/u2JnSO44TPQWDJFhyQ==", > > "notify_list": [], > > "status_need_notify": [ > > "ERROR", > > "DISCARDED", > > "SUCCEED" > > ], > > "partition_date_start": 1478736000000, > > "partition_date_end": 3153600000000, > > "auto_merge_time_ranges": [ > > 604800000, > > 2419200000 > > ], > > "retention_range": 0, > > "engine_type": 2, > > "storage_type": 2, > > "override_kylin_properties": {} > > } > > >
