As Xiaoxiang mentioned, the donated code brings many improvements over the
current Kylin 4. Many of them have long been wanted, such as:

   - The flexible model makes it much smoother to add new dimensions in a
   production environment.
   - Computed columns bridge the gap of last-mile data transformation.
   - The new model metadata design is friendlier to dynamic indexing.
   - Support for more than 63 dimensions.
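
For context on the dimension cap, here is a minimal Java sketch of a
bitmask-style cuboid ID held in a 64-bit long, in the spirit of the Long-based
Cuboid ID described in the quoted message below. The class and method names
(CuboidIdSketch, toCuboidId, contains) are hypothetical and are not Kylin's
actual code, and the bit ordering is an assumption. It only illustrates that
one fixed bit per dimension in a signed long leaves room for 63 dimensions if
IDs are kept non-negative.

```java
import java.util.List;

/** Hypothetical sketch, not Kylin's real classes: a cuboid ID as a bitmask in a signed long. */
final class CuboidIdSketch {
    // A Java long has 64 bits; keeping IDs non-negative leaves 63 usable bits.
    static final int MAX_DIMENSIONS = Long.SIZE - 1;

    /** Sets one bit per selected dimension; the bit position is the dimension's fixed index. */
    static long toCuboidId(List<Integer> dimensionIndexes) {
        long id = 0L;
        for (int index : dimensionIndexes) {
            if (index < 0 || index >= MAX_DIMENSIONS) {
                throw new IllegalArgumentException(
                        "dimension index " + index + " does not fit in a 63-bit mask");
            }
            id |= 1L << index;
        }
        return id;
    }

    /** Returns true if the cuboid identified by cuboidId includes the dimension at index. */
    static boolean contains(long cuboidId, int index) {
        return (cuboidId & (1L << index)) != 0;
    }

    public static void main(String[] args) {
        long id = toCuboidId(List.of(0, 2, 5));      // bits 0, 2 and 5 set
        System.out.println(Long.toBinaryString(id)); // prints 100101
        System.out.println(contains(id, 2));         // prints true
        System.out.println(contains(id, 3));         // prints false
    }
}
```

Because the dimension-to-bit mapping is fixed in such a scheme, adding or
reordering dimensions changes the meaning of every stored ID, which is one way
to see why the limits are hard to lift without a redesign.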

Accepting this code base is a good thing for the whole Kylin community.

Cheers
Yang


On Tue, Feb 14, 2023 at 10:46 PM ShaoFeng Shi <shaofeng...@apache.org>
wrote:

> The current limitations are very difficult to solve in the usual ways. For
> example, the Cuboid ID is represented by a Long, which is 64 bits wide,
> and the position of each dimension is fixed. The Cuboid ID appears in every
> part of Kylin's source code, so this design cannot be refactored easily. I
> therefore agree that a whole new design is necessary; in the long term it
> will help a lot.
>
> Best regards,
>
> Shaofeng Shi 史少锋
> Apache Kylin PMC,
> Apache Incubator PMC,
> Email: shaofeng...@apache.org
>
> Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
> Join Kylin user mail group: user-subscr...@kylin.apache.org
> Join Kylin dev mail group: dev-subscr...@kylin.apache.org
>
>
>
>
> On Tue, Feb 14, 2023 at 2:22 PM Xiaoxiang Yu <x...@apache.org> wrote:
>
> > A formatted version of the discussion with the same content:
> >
> > ## Background ##
> >
> > As we discussed on the mailing list[2] last year, Kylin 4.0 has achieved
> > its goal of a new storage format (columnar files) and a new query engine
> > (Spark-based), and has gained some adoption in the community. However,
> > because of the old design inherited from early versions, Kylin 4.0 still
> > keeps some limitations of previous versions, such as the cap of 63
> > dimensions and a cube structure that cannot be modified once built. We
> > think the only way to remove those limitations is a complete redesign,
> > especially of the metadata.
> >
> > The good news is that Kyligence started doing this years ago, and its
> > commercial version has been verified by many customers in terms of
> > functionality, performance and stability. Last year, Kyligence open-sourced
> > its core under the Apache License v2.0 and signed a CCLA with the Apache
> > Software Foundation. We staged it in a separate branch of the GitHub
> > repository for review[1]. Engineers from other teams, such as eBay, also
> > reviewed the codebase and put forward many new ideas. We think that, based
> > on this codebase, Kylin will gain not only a flexible metadata design and a
> > faster computing engine, but also richer user scenarios.
> >
> > The new codebase has the following features compared with the latest
> > release (Kylin 4.0.3):
> >
> > - More flexible and enhanced data model
> >     * Allow adding new dimensions and measures to the existing data model
> >     * The model adapts to table schema changes while retaining
> >       existing indexes on a best-effort basis
> >     * Support last-mile data transformation using Computed Column
> >     * Support raw query (non-aggregation query) using Table Index
> >     * Support slowly changing dimension tables (SCD2)
> > - Simplified metadata design
> >     * Merge DataModel and CubeDesc into new DataModel
> >     * Add DataFlow for more generic data sequences, e.g. streaming-like
> >       data flows
> >     * New metadata AuditLog for better cache synchronization
> > - More flexible index management
> >     * Add IndexPlan to support flexible index management
> >     * Add IndexEntity to support different index types
> >     * Add LayoutEntity to support different storage layouts of the same
> >       Index
> > - Toward a native and vectorized query engine
> >     * Experiment: integrate with a native execution engine, leveraging
> >       Gluten
> >     * Support async query
> >     * Enhance cost-based index optimizer
> > - More
> >     * Build engine refactoring and performance optimization
> >     * New web UI based on Vue.js, a brand new front-end framework, to
> >       replace AngularJS
> >     * Smooth modeling process on one canvas
> >
> >
> >
> >
> > ## Proposal ##
> > So, I'd like to propose adopting the new codebase from Kyligence as
> > Kylin's future code base, e.g., Kylin 5. If accepted, we will request IP
> > clearance from the Apache Incubator as the next step.
> >
> >
> >
> >
> >
> > ## Reference ##
> > [1] https://github.com/apache/kylin/tree/kylin5
> > [2] https://lists.apache.org/thread/4fkhyw1fyf0jg5cb18v7vxyqbn6vm3zv
> >
> >
> > --
> >
> > Best wishes to you !
> > From :Xiaoxiang Yu
