Hi Ketan,

Kylin estimates the HBase table size before creating the table and cuts regions based on that estimate. The estimate can be inaccurate when the cube has advanced measures such as TopN or count distinct. The accuracy was improved in v2.5.0 by KYLIN-3453. For earlier versions, you may need to manually set smaller values for these parameters:

kylin.cube.size-estimate-ratio=0.25
kylin.cube.size-estimate-memhungry-ratio=0.05

ketan dikshit <[email protected]> wrote on Mon, Nov 5, 2018 at 10:13 PM:

> Hi Team,
> I would like to understand how the
> 'kylin.storage.hbase.region-cut-gb' property works.
> We are currently using Kylin 2.3.1 with the default property
> value, i.e. kylin.storage.hbase.region-cut-gb=5
>
> But we still see some segments not adhering to this property. Examples:
>
> Segment: 20180723000000_20180730000000
>
> Start Time: 2018-07-23 00:00:00
> End Time: 2018-07-30 00:00:00
> Source Count: 447860691
> HBase Table: KYLIN_ENX1MBQAMX
> Region Count: 500
> Size: 49.57422 GB
>
> Segment: 20181005000000_20181006000000
>
> Start Time: 2018-10-05 00:00:00
> End Time: 2018-10-06 00:00:00
> Source Count: 52522716
> HBase Table: KYLIN_PG5PQBJ910
> Region Count: 47
> Size: 6.16309 GB
>
> Segment: 20181010000000_20181011000000
>
> Start Time: 2018-10-10 00:00:00
> End Time: 2018-10-11 00:00:00
> Source Count: 62012099
> HBase Table: KYLIN_I4QS9A4AHL
> Region Count: 52
> Size: 6.98145 GB
>
> We are also using compression:
> 'kylin.storage.hbase.compression-codec=lz4'
> The number of regions needs to be kept under control for our HBase cluster
> to be performant.
>
> Please share how this property works and the possible reasons
> why it is not working as intended.
>
> Thanks,
> Ketan@Exponential
>

--
Best regards,

Shaofeng Shi 史少锋
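
To illustrate why an overestimated size inflates the region count, here is a rough sketch in Python. It assumes the region count is roughly the estimated size divided by kylin.storage.hbase.region-cut-gb, clamped between a lower and an upper bound. The function name, the bounds of 1 and 500, and the sample estimates are illustrative assumptions, not Kylin's actual code.

```python
import math


def estimate_region_count(estimated_size_gb, region_cut_gb=5.0,
                          min_regions=1, max_regions=500):
    """Hypothetical model: one region per `region_cut_gb` of ESTIMATED size.

    The bounds (1 and 500) are assumed here for illustration only.
    """
    if region_cut_gb <= 0:
        raise ValueError("region_cut_gb must be positive")
    raw = math.ceil(estimated_size_gb / region_cut_gb)
    # Clamp the raw count into the assumed [min_regions, max_regions] range.
    return max(min_regions, min(max_regions, raw))


# If the estimate matched the actual size of the first segment (about 49.57 GB):
print(estimate_region_count(49.57))   # 10

# A hypothetical estimate of 235 GB for a segment that is really about 6 GB:
print(estimate_region_count(235.0))   # 47

# A hypothetical, heavily inflated estimate runs into the assumed upper bound:
print(estimate_region_count(3000.0))  # 500
```

Under this model, lowering the two ratio parameters shrinks the estimate, which in turn lowers the number of regions created for the same region-cut-gb value.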
