Thank you Xiaoxiang,

1. Regarding my question about near real-time data: this scenario is not about querying the cube (index). I am asking about queries against the Hive table only: is it possible to query the non-cube data instantly if the data is already in Hive?

Best regards

On Mon, Nov 13, 2023 at 4:23 PM Xiaoxiang Yu <x...@apache.org> wrote:

> 1. Querying them instantly is not possible. You need to trigger a build job
> and wait for it to complete, which takes about 5-30 minutes in most cases.
> So the delay caused by Kylin is 5-30 minutes.
>
> 2. DS/AI can send SQL queries using Python and get the result (if kylinpy
> works well), just as you do in the Kylin Insight window.
>
> --
> *Best wishes to you!*
> *From: Xiaoxiang Yu*
>
> At 2023-11-13 17:09:59, "Nam Đỗ Duy via user" <user@kylin.apache.org>
> wrote:
>
> Thank you Xiaoxiang for answering my previous question
>
> 1. For previous question 1, if I can ingest data into a Hive table in near
> real time, can that near real-time data be queried in the Kylin Insight
> window by SQL query almost instantly? If not, how can I reflect near
> real-time data in the Kylin Insight window as well as in a PowerBI report
> which connects to Kylin via mez?
>
> 2. For previous question 2, if the DS/AI team cannot access the Kylin
> Parquet files via Java/Python/Scala, can they:
>
> 2.1) access the Hive star schema tables?
> 2.2) access the Kylin cube via API?
> 2.3) access computed fields of the Kylin cube via API?
> 2.4) access the Kylin model's measures via API?
>
> Thank you very much
>
> On Mon, Nov 13, 2023 at 3:53 PM Xiaoxiang Yu <x...@apache.org> wrote:
>
>> Hi,
>> Question 1:
>> You are almost right.
>> If the Cube is not ready, Kylin will use SparkSQL to execute the query
>> directly on the original tables.
>>
>> Question 2:
>> It is possible but very hard.
>> The index data are saved in Parquet format, so it is possible to read them
>> with Spark, but the column names are encoded, so you cannot tell which
>> columns are useful to you. The mapping from the Parquet files' columns to
>> the Model's dimensions or measures is stored in Kylin's metastore, so
>> knowledge of the Kylin source code is required to make good use of
>> model/index files when reading them directly.
>>
>> If we have a Python library (like
>> https://github.com/Kyligence/kylinpy/tree/master) which provides the
>> ability to send SQL to Kylin, will it be helpful to your data science team?
>> Following is an example.
>>
>> ```
>> >>> import sqlalchemy as sa
>> >>> import pandas as pd
>> >>> kylin_engine = sa.create_engine('kylin://ADMIN:KYLIN@sandbox/learn_kylin?timeout=60&is_debug=1')
>> >>> sql = 'select * from kylin_sales limit 10'
>> >>> pd.read_sql(sql, kylin_engine)
>> ```
>>
>> --
>> *Best wishes to you!*
>> *From: Xiaoxiang Yu*
>>
>> At 2023-11-13 16:02:20, "Nam Đỗ Duy via user" <user@kylin.apache.org>
>> wrote:
>>
>> Hi Xiaoxiang,
>>
>> Basically, you can imagine a scenario in which three teams will be using
>> Kylin's Cube:
>>
>> a) The data analyst team (DA), who use PowerBI (via ODBC or mez) and
>> Superset to access the Kylin Cube.
>> b) The data science team (DS), who use PySpark and SparkML and currently
>> access HDFS and Parquet directly as raw files.
>> c) The AI team, who use various interfaces like Java, Python, and Scala to
>> access HDFS and Parquet directly as raw files.
>>
>> I have two questions:
>>
>> 1) For team a) DA: when using the ODBC or mez connector, if the Cube is not
>> ready, then I guess PowerBI is accessing the Hive Parquet files, isn't it?
>> 2) For the DS/AI teams: they are accessing the raw HDFS/Parquet files, so
>> how can Hive/Kylin provide more benefits to these teams? For this question,
>> I am thinking of OLAP speed or computed metrics etc., but I am not sure, so
>> please advise.
>>
>> Thank you very much
>>
>> On Mon, Nov 13, 2023 at 2:40 PM Xiaoxiang Yu <x...@apache.org> wrote:
>>
>>> Do you have any specific business scenario? It looks like there is no
>>> such real use case at the moment.
>>>
>>> --
>>> *Best wishes to you!*
>>> *From: Xiaoxiang Yu*
>>>
>>> At 2023-11-13 11:36:35, "Nam Đỗ Duy via user" <user@kylin.apache.org>
>>> wrote:
>>>
>>> Dear Sir/Madam
>>>
>>> I am persuading my company to use Kylin as our OLAP platform, so please
>>> kindly share with me (inbox me if you hesitate to share publicly) your
>>> real use cases to help me answer our boss's question:
>>>
>>> 1. Which companies are using Kylin now?
>>> 2. How do you use Kylin's capabilities in your AI/ML projects?
>>>
>>> Thank you very much for your valuable time and support
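
A minimal sketch of the "send SQL from Python" route discussed in this thread, for cases where kylinpy is not available. It assumes Kylin's REST query endpoint (`POST /kylin/api/query` with HTTP Basic authentication), which the thread itself does not mention. The host `sandbox`, project `learn_kylin`, and `ADMIN`/`KYLIN` credentials are placeholders reused from the kylinpy example above, port `7070` is an assumed default, and both helper names are hypothetical. Newer Kylin versions may require additional request headers, so the REST API documentation for the deployed version should be checked first.

```python
import base64
import json
import urllib.request


def build_kylin_query_request(base_url, project, sql, user, password, limit=10):
    """Build (but do not send) an HTTP request for Kylin's SQL query endpoint.

    Assumption: the server exposes POST /kylin/api/query and accepts
    HTTP Basic authentication.
    """
    payload = {"sql": sql, "project": project, "limit": limit}
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/kylin/api/query",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def run_kylin_query(request, timeout=60):
    """Send a prepared request and return the decoded JSON response."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Placeholder host, port, project, and credentials (assumptions, see above).
request = build_kylin_query_request(
    "http://sandbox:7070",
    "learn_kylin",
    "select * from kylin_sales limit 10",
    "ADMIN",
    "KYLIN",
)
```

The block only builds the request and sends nothing. Calling `run_kylin_query(request)` against a reachable Kylin instance would return the parsed JSON response, which a DS/AI team could then load into pandas or Spark.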