Hi all,

A while back, someone raised on this list (https://lists.apache.org/thread/35zzzh2jgorhx7q2xksp7rwxnt6gl2zx) that once Polaris is bootstrapped, simple operational questions ("how many tables in this namespace?", "how many snapshots?", "any small files?") force you to switch to Spark/Trino/pyiceberg and write a script.

It gave me an idea for a standalone SQL shell (ANTLR grammar + REST catalog client, ships as a shadow jar).

Code: https://github.com/bbejeck/polaris/tree/add-sql-module/extensions/sql-engine

End-to-end demo (docker-compose + MinIO, runs locally): https://github.com/bbejeck/polaris/tree/add-sql-module/extensions/sql-engine/demo

It adds Iceberg-aware statements like SHOW TABLES, DESCRIBE STATS, SHOW TABLE LOCATION/POLICIES, DIAGNOSE TABLE, and EXPLAIN so you can poke at namespaces, snapshots, small-file diagnostics, etc., without firing up Spark or Trino. The demo above spins up Polaris + MinIO, seeds three Iceberg tables, and lets you try all of the statements above in a few minutes.

A few things worth stating upfront, because the "yet another SQL dialect" worry is real:

- Not trying to compete with Spark/Trino/Doris SQL: the SELECT support is "peek at a table from the shell", not "run analytics".
- Dialect baseline I'd propose: small SQL-92 read-only subset plus a handful of named Polaris extension statements (DESCRIBE STATS, DIAGNOSE TABLE, etc.). Easy to maintain, intentionally orthogonal to the engines.

If anyone thinks this could be useful, I'd love to discuss the next steps. I'm new to the Polaris community, but I know in Apache Kafka, something like this would require a KIP, so I'm also willing to do a more formal design doc.

Thanks,
Bill Bejeck
