On Fri, Mar 24, 2023 at 1:46 PM John Zhuge <jzh...@apache.org> wrote:

> Have you checked out SparkCatalog
> <https://github.com/apache/iceberg/blob/master/spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/SparkCatalog.java>
>  in
> Apache Iceberg project? More docs at
> https://iceberg.apache.org/docs/latest/spark-configuration/#catalogs
>

No, I hadn't seen that one yet, thanks!

Another question: our partitions have no useful uniqueness criterion other
than a storage URL, which should never be exposed to user space. Our
"primary" index is a timestamp, and multiple partitions within a table can
have overlapping time ranges. We support an additional shard key, but it's
optional. Is there something like partition discovery in DataSourceV2 where
I should list all of the (potentially many thousands of) partitions for a
table, or can I leave them unpopulated until query planning time, when time
range predicates often have extremely high selectivity?
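
To make the second option concrete, here is a rough sketch of what resolving
partitions at planning time could look like. It is plain Python rather than
any Spark API, and every name in it (PartitionMeta, plan_partitions, the
internal:// URLs) is hypothetical and only there for illustration. It assumes
each partition records an inclusive [min_ts, max_ts] range and that a
partition without a shard key cannot be pruned by shard.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class PartitionMeta:
    """Hypothetical per-partition metadata kept inside the connector."""
    storage_url: str                 # internal only, never shown to users
    min_ts: int                      # inclusive lower bound of the time range
    max_ts: int                      # inclusive upper bound of the time range
    shard_key: Optional[str] = None  # optional, as described above


def plan_partitions(
    all_partitions: Iterable[PartitionMeta],
    query_start: int,
    query_end: int,
    shard_key: Optional[str] = None,
) -> List[PartitionMeta]:
    """Return the partitions a query over [query_start, query_end] must read.

    Time ranges may overlap between partitions, so each one is tested
    independently with an interval-overlap check.
    """
    selected = []
    for p in all_partitions:
        overlaps = p.min_ts <= query_end and p.max_ts >= query_start
        # A partition with no shard key may hold any shard, so keep it.
        shard_ok = (
            shard_key is None
            or p.shard_key is None
            or p.shard_key == shard_key
        )
        if overlaps and shard_ok:
            selected.append(p)
    return selected


# Made-up partitions with overlapping time ranges.
parts = [
    PartitionMeta("internal://p0", 0, 100),
    PartitionMeta("internal://p1", 50, 150, "x"),
    PartitionMeta("internal://p2", 200, 300, "y"),
]
needed = plan_partitions(parts, 120, 180)
```

With a selective time predicate, only the surviving partitions would ever
need to be turned into something Spark sees; the full listing stays internal.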

Thanks!

-0xe1a
