Hi all,

My name is Rahil Chertara, and I’m part of the Iceberg team at Amazon EMR and Athena. I’m reaching out to share a proposal for a new Scan API for the RESTCatalog. Table scan planning is currently done within client engines such as Apache Spark. By moving scan planning to the RESTCatalog, we can integrate Iceberg table scans with external services, which brings several benefits.
For example, we can leverage caching and indexes on the server side to improve planning performance. Furthermore, moving this scan logic to the RESTCatalog makes it easier for non-JVM engines to integrate. The details are in the proposal linked below. Feel free to comment and add your suggestions.

Detailed proposal: https://docs.google.com/document/d/1FdjCnFZM1fNtgyb9-v9fU4FwOX4An-pqEwSaJe8RgUg/edit#heading=h.cftjlkb2wh4h
GitHub POC: https://github.com/apache/iceberg/pull/9252

Regards,
Rahil Chertara
Amazon EMR & Athena
rcher...@amazon.com