There are clearly too many segments (24 * 3 = 72); try merging them, as
Billy suggested.
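
To make the merge step concrete, here is a rough sketch. It assumes the
Kylin REST API accepts a PUT on /kylin/api/cubes/{cubeName}/rebuild with
buildType set to MERGE; please verify that against the documentation for
your version. The cube name and the time range are placeholders, not values
from this thread, and the sketch only builds the request without sending it.

```python
import json
from datetime import datetime, timezone

CUBES = 3               # from the thread: three cubes
SEGMENTS_PER_CUBE = 24  # from the thread: 24 segments each


def total_segments(cubes, segments_per_cube):
    """Number of segments a query over all cubes may have to touch."""
    return cubes * segments_per_cube


def to_epoch_millis(year, month, day):
    """UTC midnight of the given date as epoch milliseconds."""
    moment = datetime(year, month, day, tzinfo=timezone.utc)
    return int(moment.timestamp() * 1000)


def merge_request(cube_name, start_ms, end_ms):
    """Build (method, path, body) for a merge job.

    Assumption: endpoint and field names follow the Kylin REST API
    (PUT .../rebuild with buildType MERGE); check your version's docs.
    """
    path = "/kylin/api/cubes/{}/rebuild".format(cube_name)
    body = {"startTime": start_ms, "endTime": end_ms, "buildType": "MERGE"}
    return "PUT", path, json.dumps(body)


print(total_segments(CUBES, SEGMENTS_PER_CUBE))

# "cube_1" and the one-year range are hypothetical placeholders.
method, path, body = merge_request(
    "cube_1", to_epoch_millis(2016, 1, 1), to_epoch_millis(2017, 1, 1)
)
print(method, path, body)
```

The time range has to line up with existing segment boundaries, so check
the segment list of each cube before choosing startTime and endTime.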

Secondly, if possible, review and optimize the cube design. Pay particular
attention to the rowkey sequence: put the high-cardinality filter column at
the beginning to minimize the scan range. See
http://www.slideshare.net/YangLi43/design-cube-in-apache-kylin
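
A toy illustration of why the rowkey order matters. This is not Kylin or
HBase code: it only models rowkeys as sorted tuples and counts how many rows
lie between the first and last match for a filter on one column. The column
names and cardinalities (user_id with 1000 values, region with 5) are made
up for the example.

```python
from itertools import product


def scan_range(rowkeys, position, value):
    """Rows between the first and last match in sorted order, inclusive.

    Models a range scan over lexicographically sorted rowkeys: everything
    between the first and last matching key has to be read.
    """
    ordered = sorted(rowkeys)
    hits = [i for i, key in enumerate(ordered) if key[position] == value]
    return hits[-1] - hits[0] + 1 if hits else 0


user_ids = range(1000)  # hypothetical high-cardinality filter column
regions = range(5)      # hypothetical low-cardinality column

# Rowkey layout A: (user_id, region), filter column first.
user_first = [(u, r) for u, r in product(user_ids, regions)]
# Rowkey layout B: (region, user_id), filter column last.
region_first = [(r, u) for u, r in product(user_ids, regions)]

# Filter: user_id == 42
print(scan_range(user_first, 0, 42))    # 5 rows, all contiguous
print(scan_range(region_first, 1, 42))  # 4001 rows out of 5000
```

In the real system the scan can skip within the range, so the gap is usually
smaller than in this model, but the direction of the effect is the same.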

Thirdly, try to give the cluster more power, e.g. use physical machines
instead of virtual ones, and use multiple Kylin query nodes to balance the
concurrent workload.
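
As a minimal sketch of spreading concurrent requests over several query
nodes: the node URLs are hypothetical, and run_query is a stand-in that only
returns the chosen node instead of calling Kylin. In practice a load
balancer in front of the query nodes would normally do this job; the code
just shows the round-robin idea with the 5 simultaneous requests from the
test case.

```python
import itertools
import threading
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical query node addresses; replace with your own.
QUERY_NODES = ["http://kylin-query-1:7070", "http://kylin-query-2:7070"]


class RoundRobin:
    """Hands out nodes in turn; the lock makes it safe to share across threads."""

    def __init__(self, nodes):
        self._cycle = itertools.cycle(nodes)
        self._lock = threading.Lock()

    def next_node(self):
        with self._lock:
            return next(self._cycle)


def run_query(node, sql):
    """Stand-in for a real query call; returns the node that would serve it."""
    return node


def dispatch(balancer, sql, requests):
    """Submit `requests` concurrent queries and count how many each node got."""
    with ThreadPoolExecutor(max_workers=requests) as pool:
        futures = [
            pool.submit(run_query, balancer.next_node(), sql)
            for _ in range(requests)
        ]
        return Counter(future.result() for future in futures)


counts = dispatch(RoundRobin(QUERY_NODES), "select 1", 5)
print(counts)
```

With two nodes and five requests, one node serves three queries and the
other two, instead of a single node serving all five.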

Just my two cents; I hope it helps.

2017-01-12 22:16 GMT+08:00 Billy Liu <[email protected]>:

> I have concerns about so many segments. Please try querying only one cube
> with one segment first.
>
> 2017-01-12 13:36 GMT+08:00 Phong Pham <[email protected]>:
>
>> Hi,
>> Thank you so much for your help. I really appreciate it. I'm really
>> impressed with your project and am trying to apply it to our product. Our
>> live product still runs on MySQL and MongoDB, but the data is growing fast.
>> That's why we need your product as a replacement database engine.
>> About our problem with many queries at the same time on Apache Kylin: I'm
>> trying to monitor some elements of our system and review the cubes. Are
>> there any tutorials about concurrency in Kylin or HBase?
>> Here are more details about our system:
>> Hardware:
>> 2 physical machines -> 7 virtual machines
>> Each virtual machine:
>> CPU: 8 cores
>> RAM: 24GB
>> We set up the Hadoop environment with Hortonworks 2.5 and set up HBase
>> with 5 RegionServers and 2 HBase masters.
>> Apache Kylin is set up on 2 machines:
>> + Node 1: used for building cubes
>> + Node 2: used only for queries (this node also hosts a RegionServer)
>> Cube and Queries:
>> + Size of Cubes:
>>   - Cube 1: 20GB/14M rows - 24 segments (maybe we need to merge them into
>> 2-3 segments)
>>   - Cube 2: 460MB/3M rows - 24 segments
>>   - Cube 3: 1.3GB/1.4M rows - 24 segments
>> + We use one query to read data from the 3 cubes and union all into 1 result
>> Test case:
>> + Single request: 3s
>> + 5 requests at the same time (multiple requests submitted from the client):
>> 20s/request
>> That is not acceptable when we go live.
>> So we hope you can review our structure and give us some best practices
>> for Kylin and HBase.
>> Thanks
>>
>> 2017-01-12 8:24 GMT+07:00 ShaoFeng Shi <[email protected]>:
>>
>>> In this case you need to do some profiling to see where the bottleneck is:
>>> Kylin, HBase, or other factors like CPU, memory, or network. It may also
>>> be related to the cube design; optimizing the cube design against the
>>> executed queries is another option. It is hard to give you a good answer
>>> in a couple of words.
>>>
>>> 2017-01-11 19:50 GMT+08:00 Phong Pham <[email protected]>:
>>>
>>>> Here are the details of our system:
>>>>
>>>> Hbase: 5 nodes
>>>> Data size: 24M rows
>>>>
>>>> Query result:
>>>> *Success: true*
>>>> *Duration: 20s*
>>>> *Project: metrixa_global_database*
>>>> *Realization Names: [xxx, xxx, xxx]*
>>>> *Cuboid Ids: [45971, 24]*
>>>>
>>>>
>>>> 2017-01-11 18:34 GMT+07:00 Phong Pham <[email protected]>:
>>>>
>>>>> Hi all,
>>>>>     I have a problem with concurrency on Apache Kylin. When I execute a
>>>>> single query, it takes about 3s. However, when I run multiple queries at
>>>>> the same time, each query takes about 13-15s. How can I solve this problem?
>>>>> My Kylin Version is 1.6.1
>>>>> Thanks
>>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>> Best regards,
>>>
>>> Shaofeng Shi 史少锋
>>>
>>>
>>
>


-- 
Best regards,

Shaofeng Shi 史少锋
