Re: Kylin real usecase on AI/ML (data science) project

2023-11-13 Thread Nam Đỗ Duy via user
Hi Xiaoxiang, Basically, you can imagine a scenario in which 3 teams will be using Kylin's Cube: a) Data analyst team (DA), who use PowerBI (via ODBC or mez) and Superset to access the Kylin Cube. b) Data science team (DS), who use PySpark and SparkML, currently assessing HDFS

Re: Kylin real usecase on AI/ML (data science) project

2023-11-13 Thread Xiaoxiang Yu
Hi, Question 1: You are almost right. If the Cube is not ready, Kylin will use SparkSQL to execute the query directly on the original tables. Question 2: It is possible but very hard. The index data are saved in Parquet format, so Spark can read them, but the column names are encoded so y

Re: Kylin real usecase on AI/ML (data science) project

2023-11-13 Thread Nam Đỗ Duy via user
Thank you, Xiaoxiang, for answering my previous question 1. Following up on that question: if I can ingest data into a Hive table in near real time, can that data be queried with SQL in the Kylin Insight window almost instantly? If not, how can I reflect near-real-time data in (Kylin insight

Re: Kylin real usecase on AI/ML (data science) project

2023-11-13 Thread Xiaoxiang Yu
1. Querying them instantly is not possible. You need to trigger a build job and wait for it to complete, which takes about 5-30 minutes in most cases. So the delay caused by Kylin is 5-30 minutes. 2. DS/AI can send SQL queries using Python and get the result (if kylinpy works well), just like you do in K
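As a rough illustration of point 2, here is a minimal sketch of sending a SQL query to Kylin from Python. It uses only the standard library and Kylin's REST query endpoint (`/kylin/api/query`) instead of kylinpy. The host, project name, credentials, and the helper names `build_kylin_query_request` and `run_query` are assumptions made for the example, and the layout of the JSON response depends on the Kylin version, so check it against your own deployment.

```python
import base64
import json
import urllib.request


def build_kylin_query_request(base_url, project, sql, user, password, limit=1000):
    """Build (but do not send) a POST request for Kylin's SQL query endpoint.

    All arguments are placeholders for your own deployment.
    """
    payload = {"sql": sql, "project": project, "limit": limit}
    # Kylin's REST API accepts HTTP Basic authentication.
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/kylin/api/query",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": "Basic " + token,
            "Content-Type": "application/json",
        },
        method="POST",
    )


def run_query(request, timeout=60):
    """Send the request and return the decoded JSON response as a dict."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With a reachable server, something like `run_query(build_kylin_query_request("http://localhost:7070", "learn_kylin", "SELECT COUNT(*) FROM kylin_sales", "ADMIN", "KYLIN"))` would return the parsed response. Building the request separately from sending it keeps the part that can be checked offline apart from the network call.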

Re: Kylin real usecase on AI/ML (data science) project

2023-11-13 Thread Nam Đỗ Duy via user
Thank you Xiaoxiang, 1. Regarding my question about near-real-time data: this scenario is not about querying the cube (index). I am referring to queries against the Hive table only: is it possible to instantly query the non-cube data if the data is already in Hive? Best regards On Mon, Nov 13, 2023

Re: Kylin real usecase on AI/ML (data science) project

2023-11-13 Thread Xiaoxiang Yu
Yes, you are right. -- Best wishes to you! From: Xiaoxiang Yu