Is this similar to Iceberg's hidden partitioning
<https://iceberg.apache.org/docs/latest/partitioning/#icebergs-hidden-partitioning>?
Check out the details in the spec:
https://iceberg.apache.org/spec/#partition-transforms
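
To make the comparison concrete, here is a rough sketch of the idea in plain Python. It is not the DataSourceV2 or Iceberg API. PartitionMeta, prune, and the internal:// URLs are hypothetical names made up for illustration; day_transform only mimics the `day` transform described in the spec (days from 1970-01-01). The assumption is that each partition records a min/max timestamp, so overlapping partitions are simply all kept when they intersect the query range, and the storage URL never reaches the user.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def ts(*args):
    """Convenience constructor for UTC timestamps."""
    return datetime(*args, tzinfo=timezone.utc)


def day_transform(t):
    """Whole days since 1970-01-01 UTC (floored), like the spec's `day` transform."""
    return (t - EPOCH).days


@dataclass(frozen=True)
class PartitionMeta:
    # Hypothetical per-partition metadata; storage_url stays internal.
    storage_url: str
    min_ts: datetime
    max_ts: datetime


def prune(partitions, query_start, query_end):
    """Keep partitions whose day range intersects the query's day range.

    Both ranges are treated as inclusive, so the result is a safe superset:
    a kept partition may still contain no matching rows.
    """
    lo, hi = day_transform(query_start), day_transform(query_end)
    return [
        p for p in partitions
        if day_transform(p.min_ts) <= hi and day_transform(p.max_ts) >= lo
    ]


# Partitions may overlap in time; that is fine for range pruning.
parts = [
    PartitionMeta("internal://p1", ts(2023, 3, 1), ts(2023, 3, 10)),
    PartitionMeta("internal://p2", ts(2023, 3, 8), ts(2023, 3, 24)),
    PartitionMeta("internal://p3", ts(2023, 3, 23), ts(2023, 3, 24)),
]

# Predicate: timestamp within 2023-03-24 (UTC).
selected = prune(parts, ts(2023, 3, 24), ts(2023, 3, 24, 23))
```

Here the user only ever filters on the timestamp column; the mapping from predicate to partitions happens at planning time, which seems close to what you describe.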

On Fri, Mar 24, 2023 at 2:52 PM Alex Cruise <a...@cluonflux.com> wrote:

> On Fri, Mar 24, 2023 at 1:46 PM John Zhuge <jzh...@apache.org> wrote:
>
>> Have you checked out SparkCatalog
>> <https://github.com/apache/iceberg/blob/master/spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/SparkCatalog.java>
>> in the Apache Iceberg project? More docs at
>> https://iceberg.apache.org/docs/latest/spark-configuration/#catalogs
>>
>
> No, I hadn't seen that one yet, thanks!
>
> Another question: our partitions have no useful uniqueness criteria other
> than a storage URL, which should never be exposed to user-space. Our
> "primary" index is a timestamp, and multiple partitions within a table can
> have overlapping time ranges. We support an additional shard key, but it's
> optional. Is there something like partition discovery in DataSourceV2 where
> I should list all of the (potentially many thousands of) partitions for a
> table, or can I leave them unpopulated until query planning time, when time
> range predicates often have extremely high selectivity?
>
> Thanks!
>
> -0xe1a
>
>>

-- 
John Zhuge
