I've found out that sharding is done manually by Kylin, so running split in the hbase shell breaks the data.
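For context, the split was nothing special, just the plain split command in the hbase shell; the table name below is only a placeholder, not the real segment table:

  split 'KYLIN_XXXXXXXXXX'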
So the main problem is that region-cut doesn't work on hbase backed by S3. I can see in the log that it computes the shards properly:

2017-08-05 20:54:48,709 INFO [Job 1175d3ed-504f-4eb0-a973-d57338fdff2c-892] steps.CreateHTableJob:192 : Total size 21334.075368547456M (estimated)
2017-08-05 20:54:48,709 INFO [Job 1175d3ed-504f-4eb0-a973-d57338fdff2c-892] steps.CreateHTableJob:193 : Expecting 4 regions.
2017-08-05 20:54:48,709 INFO [Job 1175d3ed-504f-4eb0-a973-d57338fdff2c-892] steps.CreateHTableJob:194 : Expecting 5333 MB per region.

But then I end up with a single 20GB region. Has anyone seen the same behaviour?

On Sun, Aug 6, 2017 at 8:15 PM, Alexander Sterligov <[email protected]> wrote:
> hi,
>
> I noticed a very large hbase region for one segment (more than 20GB, even
> though kylin.storage.hbase.region-cut-gb=5). I don't know why it is so
> large, but anyway it degraded performance a lot, so I decided to split it
> in hbase.
>
> When the split had just started, kylin began returning empty results for
> queries to this segment.
>
> Why could that happen?
>
> PS
> It seems to me that kylin.storage.hbase.region-cut-gb doesn't work when
> an external hbase cluster is used.
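If anyone wants to double-check on their side: one way to see how many regions the segment table actually got is to scan hbase:meta for its region info rows (table name below is again just a placeholder):

  scan 'hbase:meta', {FILTER => "PrefixFilter('KYLIN_XXXXXXXXXX')", COLUMNS => ['info:regioninfo']}

Given the log above I'd expect four rows from that scan, one per region, but in my case the table shows up as a single region.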
