Background

As we discussed on the mailing list[2] last year, Kylin 4.0 has achieved its 
goals of a new storage format (columnar files) and a new query engine (Spark 
based), and has gained some adoption in the community. However, because of 
design decisions inherited from early versions, Kylin 4.0 still keeps some 
limitations, such as a cap of 63 dimensions and a cube structure that cannot 
be modified once built. We think the only way to remove those limitations is 
a complete redesign, especially of the metadata.


The good news is that Kyligence started doing that years ago, and its 
commercial version has been verified by many customers in terms of 
functionality, performance and stability. Last year, Kyligence open sourced its 
core under the Apache License v2.0 and signed a CCLA with the Apache Software 
Foundation. We staged it in a separate branch of the GitHub repository for 
review[1]. Engineers from other teams, such as eBay, also reviewed the codebase 
and put forward many new ideas. We think that, based on this codebase, Kylin 
will gain not only a flexible metadata design and a faster computing engine, 
but also richer user scenarios.


The new codebase has the following features compared with the latest release 
(Kylin 4.0.3):
- More flexible and enhanced data model
  - Allow adding new dimensions and measures to the existing data model
  - The model adapts to table schema changes while retaining the existing 
    index on a best-effort basis
  - Support last-mile data transformation using Computed Column
  - Support raw query (non-aggregation query) using Table Index
  - Support changing dimension table (SCD2)
- Simplified metadata design
  - Merge DataModel and CubeDesc into new DataModel
  - Add DataFlow for more generic data sequence, e.g. streaming-like data flow
  - New metadata AuditLog for better cache synchronization
- More flexible index management
  - Add IndexPlan to support flexible index management
  - Add IndexEntity to support different index types
  - Add LayoutEntity to support different storage layouts of the same Index
- Toward a native and vectorized query engine
  - Experiment: Integrate with a native execution engine, leveraging Gluten
  - Support async query
  - Enhance cost-based index optimizer
- More
  - Build engine refactoring and performance optimization
  - New web UI based on Vue.js, a brand new front-end framework, to replace 
    AngularJS
  - Smooth modeling process on one canvas
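
To make the index management items above a little more concrete, here is a rough Python sketch of how IndexPlan, IndexEntity and LayoutEntity could relate to each other. It is not the code in the kylin5 branch. Only the three class names come from the list above; every field name, the `covers` and `pick_index` helpers, and the "fewest dimensions wins" rule (a toy stand-in for a real cost-based optimizer) are assumptions made up for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class LayoutEntity:
    """One storage layout of an index (hypothetical fields)."""
    layout_id: int
    column_order: List[str]  # assumed: physical column order of this layout


@dataclass
class IndexEntity:
    """One index of a model; it may have several layouts (hypothetical fields)."""
    index_id: int
    kind: str  # assumed values: "agg" (aggregate index) or "table" (table index)
    dimensions: List[str]
    measures: List[str] = field(default_factory=list)
    layouts: List[LayoutEntity] = field(default_factory=list)

    def covers(self, dims: Set[str], measures: Set[str]) -> bool:
        # An index can answer a query if it contains every requested column.
        return dims <= set(self.dimensions) and measures <= set(self.measures)


@dataclass
class IndexPlan:
    """All indexes defined on one data model (hypothetical fields)."""
    model_name: str
    indexes: List[IndexEntity] = field(default_factory=list)

    def add_index(self, index: IndexEntity) -> None:
        # Adding an index leaves the existing ones untouched.
        self.indexes.append(index)

    def pick_index(self, dims: Set[str], measures: Set[str]) -> Optional[IndexEntity]:
        # Toy selection rule: among the covering indexes, take the one with
        # the fewest dimensions. A real optimizer would use a cost model.
        candidates = [i for i in self.indexes if i.covers(dims, measures)]
        return min(candidates, key=lambda i: len(i.dimensions), default=None)


plan = IndexPlan(model_name="sales_model")
plan.add_index(IndexEntity(1, "agg", ["country"], ["sum_price"],
                           [LayoutEntity(1, ["country", "sum_price"])]))
plan.add_index(IndexEntity(2, "agg", ["country", "city"], ["sum_price"],
                           [LayoutEntity(2, ["country", "city", "sum_price"]),
                            LayoutEntity(3, ["city", "country", "sum_price"])]))

match = plan.pick_index({"country"}, {"sum_price"})
```

In this sketch the second index is added after the first without changing it, and one index carries two layouts that differ only in column order, which is the kind of flexibility the list above describes.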

Proposal

So, I'd like to propose adopting the new codebase from Kyligence as Kylin's 
future codebase, e.g. as Kylin 5. If accepted, we will request an IP clearance 
from the Apache Incubator as the next step.

Reference

[1] https://github.com/apache/kylin/tree/kylin5
[2] https://lists.apache.org/thread/4fkhyw1fyf0jg5cb18v7vxyqbn6vm3zv
[3] https://kylin.apache.org/5.0/blog/introduction_of_metastore_cn

--

Best wishes to you!
From: Xiaoxiang Yu
