Hi everyone, It’s been over six months since Iceberg views were introduced (in version 1.5.0), and while we’ve seen some benefits—such as versioning and cross-engine recognition—there’s still a critical gap in terms of true interoperability. Although views created by one engine are recognized by another, they currently cannot be queried across engines as one engine cannot understand the SQL dialect of another engine.
To address this, I propose a two-step approach: *Step 1: A temporary solution to manually add SQL dialects.*It is already possible to do this using some java code but most of the view users interact via SQL, there’s currently no SQL-native method to manually add dialects from other engines into Iceberg views. I suggest adding a call procedure to enable this. I've recently worked on this and have submitted the following PRs for review: - Spark-3.5: Refactor BaseProcedure to support views [#11326 <https://github.com/apache/iceberg/pull/11326>] - Spark-3.5: Procedure to add view dialect [#11327 <https://github.com/apache/iceberg/pull/11327>] With these changes, Iceberg views could be queried across different engines using their respective dialects. Please review and provide feedback. *Step 2: A long-term solution for full view interoperability.*For a sustainable solution, we need to consider both open-source and commercial engines (e.g., Snowflake, Dremio, BigQuery). A framework that can automatically translate SQL dialects from one engine to another seems ideal. I’ve explored Coral*[1]*, which works well with open-source engines like Hive, Trino, and Spark, but doesn’t fully cover commercial engines. In contrast, SQLGlot*[2]* seems to be a better fit, as it supports 23 SQL dialects and would provide a more robust solution for interoperability. But SQLGlot is a python project. So, we need REST or Py4J to work with SQLGlot. Additionally, I looked into Substrait*[3]* as a potential intermediate representation for views, but as per my understanding, integration with commercial engines isn’t quite ready yet. This technology has broader applications beyond views, including table planning, and may be worth revisiting in the future. Given this, I believe that incorporating a translation layer with SQLGlot is the right direction. If no one is currently working on view interoperability, I’d be happy to explore this further and take the lead. I welcome any thoughts or feedback from those involved in related projects. Looking forward to hearing your opinions. [1] https://github.com/linkedin/coral [2] https://sqlglot.com/sqlglot.html [3] https://substrait.io/ - Ajantha