Hi JiaTao:
    Maybe an optional auto-complete mechanism across the different measure
views is necessary, isn't it?


yuzhang 


On 4/20/2019 11:38, JiaTao Tao <[email protected]> wrote:
Hi

The idea of letting Kylin add measures dynamically is impressive.

But in my opinion, once you add a measure, the existing segments should
also calculate the new measure (just add a new measure column). Users can
have many cubes, and a cube can have many segments; if the set of measures
differs from segment to segment, it will increase the burden on the user.

--


Regards!

Aron Tao

On Sat, Apr 20, 2019 at 1:43 AM, yuzhang <[email protected]> wrote:

Hi dear Kylin users and development team:
Here are some things I want to discuss with the community.
As a representative MOLAP engine, Kylin uses pre-aggregation to provide
high-concurrency analysis with second-level response times, but it also
loses some flexibility.
The limitation that existing segments must be purged before an additional
measure can be added causes a lot of duplicate computation and unnecessary
disk I/O. Such waste should be avoided, especially in a MOLAP engine.
For example, there is a cubeA with one measure m1 and segments over time
range 1 (tr1). Now the user adds a measure m2 but doesn't want to clear the
segments over tr1. The values of m2 will exist in tr2, the segments built
subsequently. Of course, tr1 doesn't contain values of m2, which will be
understood by users who know a little about MOLAP. Querying over tr1 and
tr2 is valid for both m1 and m2, but the result of m2 over tr1 will be
null. It would be better to remind the user that the measure is missing.
Moreover, refreshing will supply m2 to the segments over tr1.
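
The cubeA example can be sketched roughly as follows. This is only an illustration of the proposed query behaviour, not Kylin code: `Segment` and `query_measure` are hypothetical names, and each segment is reduced to a single pre-aggregated value per measure.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Segment:
    """Hypothetical model of a built segment: a name plus its stored measures."""
    name: str
    measures: Dict[str, float]  # measure name -> pre-aggregated value


def query_measure(
    segments: List[Segment], measure: str
) -> Tuple[Optional[float], List[str]]:
    """Sum `measure` over the segments that store it.

    Returns the total (None if no segment stores the measure) and the names
    of the segments where the measure is missing, so the caller can warn.
    """
    total: Optional[float] = None
    missing: List[str] = []
    for seg in segments:
        if measure in seg.measures:
            total = (total or 0.0) + seg.measures[measure]
        else:
            missing.append(seg.name)
    return total, missing


tr1 = Segment("tr1", {"m1": 10.0})            # built before m2 was added
tr2 = Segment("tr2", {"m1": 5.0, "m2": 7.0})  # built after m2 was added

print(query_measure([tr1, tr2], "m1"))  # (15.0, [])
print(query_measure([tr1, tr2], "m2"))  # (7.0, ['tr1'])
print(query_measure([tr1], "m2"))       # (None, ['tr1'])
```

The second element of the result is what a reminder to the user could be built from: it lists the segments that would need a refresh before m2 is complete.
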
Currently, Kylin's storage engine uses HBase. The measures are values
aggregated over combinations of dimension members and stored in a column
of a column family in HBase. For the same cube, adding a new measure will
add a column to the HBase table (mapping) and will take effect in the next
build. For the existing HTables (segments), the new column is allowed to be
missing. Refreshing an old segment will add a new column to its HTable to
store the new measure. The value of the new measure is aggregated according
to the combination of dimension members in the rowkey, without
recalculating the existing measures.
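
A minimal sketch of that refresh step, under strong simplifying assumptions: an HTable is modelled as a plain dict from rowkey to measure columns, only one cuboid is considered, and the HBase client API is not used at all. `HTable` and `refresh_with_new_measure` are hypothetical names chosen for illustration.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Tuple

# Hypothetical stand-in for an HTable:
# rowkey (combination of dimension members) -> {measure column: value}
HTable = Dict[Tuple[str, ...], Dict[str, float]]


def refresh_with_new_measure(
    htable: HTable,
    source_rows: Iterable[dict],
    dims: List[str],
    measure: str,
    column: str,
    agg: Callable[[List[float]], float],
) -> None:
    """Add `measure` as a new column, aggregated per rowkey.

    Existing measure columns are left untouched; only the new one is computed.
    """
    grouped = defaultdict(list)
    for row in source_rows:
        grouped[tuple(row[d] for d in dims)].append(row[column])
    for rowkey, values in grouped.items():
        htable.setdefault(rowkey, {})[measure] = agg(values)


# An old segment that only stores m1.
old_segment: HTable = {("cn", "web"): {"m1": 3.0}, ("us", "app"): {"m1": 4.0}}
rows = [
    {"country": "cn", "channel": "web", "price": 2.0},
    {"country": "cn", "channel": "web", "price": 5.0},
    {"country": "us", "channel": "app", "price": 1.0},
]

# Refresh: compute m2 = max(price) for each rowkey, keeping m1 as it was.
refresh_with_new_measure(old_segment, rows, ["country", "channel"], "m2", "price", max)
print(old_segment)
```

The point of the sketch is only that the new column can be filled in by rowkey while the m1 values are never read or rewritten.
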
For additional measures, and even additional dimensions, Kylin's current
solution is Hybrid, but we found the following shortcomings during use:
1. Management costs: Similar cubes, most of which share many dimensions
and measures, have to be maintained repeatedly. If you want to perform
optimizations such as pruning, you need to configure all of these cubes.
2. A large number of cubes: In the early stage of a business the analysis
is not stable, and analysts often need to add measures. Cubes keep being
added to the Hybrid group, which produces a lot of cubes.
3. Repeated calculation: If you want to drop the old cube in the Hybrid
group, you need to build the latest cube by recomputing the historical data
so that it covers the old cube.
All of this results in a lot of waste.
In addition, while applying Kylin I felt that the metadata about measures
was not ideal.
1. Measures are one of the most important concerns of analysts. If the
measures of the analysis system could be decoupled from the materialized
view (cube) and have their own management system, it might be more
flexible.
2. Once the dimensions have been chosen in cube design, the cube's cuboids
are determined regardless of the number of measures. It may be confusing to
maintain cubes with different measures but the same cuboids. Cubes with
different cuboids should be considered different cubes, which is the
definition of a cube, isn't it?
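
To illustrate the second point, here is a small sketch that enumerates cuboids as all subsets of the dimensions. It assumes no pruning (for example, it ignores aggregation groups), and the cube definitions and the `cuboids` helper are hypothetical.

```python
from itertools import combinations
from typing import FrozenSet, Iterable, Set


def cuboids(dimensions: Iterable[str]) -> Set[FrozenSet[str]]:
    """Return every subset of the dimensions; each subset is one cuboid."""
    dims = sorted(dimensions)
    return {
        frozenset(combo)
        for size in range(len(dims) + 1)
        for combo in combinations(dims, size)
    }


# Two hypothetical cube definitions: same dimensions, different measures.
cube_a = {"dimensions": ["country", "channel", "day"], "measures": ["m1"]}
cube_b = {"dimensions": ["country", "channel", "day"], "measures": ["m1", "m2"]}

print(len(cuboids(cube_a["dimensions"])))  # 8, i.e. 2 ** 3
print(cuboids(cube_a["dimensions"]) == cuboids(cube_b["dimensions"]))  # True
```

The measure lists never enter the computation, so under this simplified model the two definitions describe the same set of cuboids.
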
These are just some thoughts about MOLAP from my use of Kylin. What do you
think about this? I sincerely look forward to your reply.
There may be some mistakes or misunderstandings here; please feel free to
correct me or discuss further if you find any.
Best regards
yuzhang


