Re: [DISCUSS] Flink's supported APIs and Hive query syntax

Jingsong Li Tue, 08 Mar 2022 01:15:36 -0800

Thanks all for your discussions.

I'll share my opinion here:


1. Hive SQL and Hive-like SQL are the absolute mainstay of current
Batch ETL in China. Hive+Spark (HiveSQL-like)+Databricks also occupies
a large market worldwide.

- Unlike OLAP SQL (such as presto, which is ansi-sql rather than hive
sql), Batch ETL is run periodically, which means that a large number
of Batch Pipelines have already been built, and if they need to be
migrated to a new system, it will be extremely costly to migrate the
SQLs.

2. Our current Hive dialect is immature and we need to put more effort
to decouple it from the flink planner.

Best,
Jingsong

On Tue, Mar 8, 2022 at 4:27 PM Zou Dan <zoud...@163.com> wrote:
>
> Hi Martijn,
> Thanks for bringing this up.
> Hive SQL (using in Hive & Spark) plays an important role in batch processing, 
> it has almost become de facto standard in batch processing. In our company, 
> there are hundreds of thousands of spark jobs each day.
> IMO, if we want to promote Flink batch, Hive syntax compatibility is a 
> crucial point of it.
> Thanks to this feature, we have migrated 800+ Spark jobs to Flink smoothly.
>
> So, I quite agree with putting more effort into Hive syntax compatibility.
>
> Best,
> Dan Zou
>
> 2022年3月7日 19:23，Martijn Visser <martijnvis...@apache.org> 写道：
>
> query
>
>

Re: [DISCUSS] Flink's supported APIs and Hive query syntax

Reply via email to