Hi Gustavo,

I'm not too familiar with the Airflow user base/use cases, but we had to consider similar things when deciding what to do with `CREATE EXTERNAL TABLE ice_table PARTITIONED BY ...` Hive queries. See: https://github.com/apache/iceberg/pull/1917
The decision there was that even though the user issued a command to create a partitioned Hive table, we created an unpartitioned Hive table, where the backing Iceberg table was using identity partitions for the originally requested columns.

Hope this helps a bit.

Thanks,
Peter

> On Mar 2, 2021, at 03:38, Gustavo Torres Torres <gustavo.tor...@airbnb.com.INVALID> wrote:
>
> Hey folks,
>
> Lately I've been thinking about integration between Airflow & Iceberg for a smooth transition from Hive-based tables to Iceberg ones and would like to hear about your experience. Specifically about Iceberg partition sensors in Airflow.
>
> From the way I see it, there are two ways to go about this (at least for Hive-based catalogs):
>
> 1. Modify our Hive Metastore API so that partition APIs are handled directly by the Iceberg API. This has the advantage of being mostly transparent to users, but has the downside of being confusing, since Iceberg creates tables in the Hive catalog as external non-partitioned tables.
> 2. Create a separate sensor that makes it clear that we are sensing over an Iceberg table. This is probably the most straightforward approach, but if we do this we would probably need to do the same for any tool that uses the metastore to get partition information.
>
> Would love to hear what your experiences have been.
> Thanks
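
For Gustavo's second option, a minimal sketch of what a dedicated Iceberg partition sensor could look like in Airflow is below. This is only an illustration of the idea, not an existing provider or the approach either project has settled on: the IcebergPartitionSensor name, the use of the pyiceberg library, and the catalog wiring are all assumptions.

from airflow.sensors.base import BaseSensorOperator
from pyiceberg.catalog import load_catalog


class IcebergPartitionSensor(BaseSensorOperator):
    """Succeeds once the Iceberg table has at least one data file matching
    the given row filter, e.g. "ds = '2021-03-02'" (hypothetical sensor)."""

    def __init__(self, table_identifier, row_filter, catalog_name="default", **kwargs):
        super().__init__(**kwargs)
        self.table_identifier = table_identifier
        self.row_filter = row_filter
        self.catalog_name = catalog_name

    def poke(self, context):
        # load_catalog() picks up connection details (e.g. the Hive Metastore
        # URI) from pyiceberg's own configuration; mapping this to an Airflow
        # connection is left out of the sketch.
        catalog = load_catalog(self.catalog_name)
        table = catalog.load_table(self.table_identifier)
        # Plan the scan with the predicate: since the table uses identity
        # partitions on the filtered column, Iceberg prunes down to that
        # "partition", and an empty file list means no data has landed yet.
        tasks = table.scan(row_filter=self.row_filter).plan_files()
        return len(list(tasks)) > 0

A DAG could then wait on something like row_filter="ds = '{{ ds }}'" before starting downstream tasks, without ever asking the metastore for partition information.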