Is this similar to Iceberg's hidden partitioning <https://iceberg.apache.org/docs/latest/partitioning/#icebergs-hidden-partitioning>? Check out the details in the spec: https://iceberg.apache.org/spec/#partition-transforms
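
To make the planning-time idea concrete, here is a rough sketch in plain Python. It is not Iceberg's or Spark's API: `DataFile`, `plan_files`, and the example URLs are made-up names for illustration. It assumes each file (or partition) carries a min/max timestamp in table metadata, similar in spirit to the per-file bounds Iceberg keeps in its manifests. Overlapping time ranges are fine, and the storage URL never has to appear as a user-visible partition value.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class DataFile:
    # Hypothetical per-file metadata; the URL stays internal to the planner.
    url: str
    min_ts: int  # inclusive lower bound of timestamps in the file
    max_ts: int  # inclusive upper bound of timestamps in the file


def plan_files(files: List[DataFile], start_ts: int, end_ts: int) -> List[DataFile]:
    """Keep files whose [min_ts, max_ts] overlaps the query range [start_ts, end_ts)."""
    return [f for f in files if f.max_ts >= start_ts and f.min_ts < end_ts]


files = [
    DataFile("s3://bucket/a", 0, 99),
    DataFile("s3://bucket/b", 50, 149),   # overlaps the range of file a
    DataFile("s3://bucket/c", 200, 299),
]

# A predicate like `ts >= 120 AND ts < 210` is applied at planning time,
# so only the matching files are ever handed to the scan.
selected = plan_files(files, 120, 210)
print([f.url for f in selected])
```

In this sketch nothing is enumerated for the user up front; the time-range predicate decides which files are read, which is where a highly selective filter pays off.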

On Fri, Mar 24, 2023 at 2:52 PM Alex Cruise <a...@cluonflux.com> wrote:

> On Fri, Mar 24, 2023 at 1:46 PM John Zhuge <jzh...@apache.org> wrote:
>
>> Have you checked out SparkCatalog
>> <https://github.com/apache/iceberg/blob/master/spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/SparkCatalog.java>
>> in Apache Iceberg project? More docs at
>> https://iceberg.apache.org/docs/latest/spark-configuration/#catalogs
>>
>
> No, I hadn't seen that one yet, thanks!
>
> Another question: our partitions have no useful uniqueness criteria other
> than a storage URL which should never be exposed to user-space. Our
> "primary" index is a timestamp, and multiple partitions within a table can
> have overlapping time ranges. We support an additional shard key but it's
> optional. Is there something like partition discovery in DataSourceV2 where
> I should list all the (potentially many thousands) of partitions for a
> table, or can I leave them unpopulated until query planning time, when time
> range predicates often have extremely high selectivity?
>
> Thanks!
>
> -0xe1a
>
>>

--
John Zhuge