Sounds reasonable to me.

On Wed, Jan 31, 2024 at 7:56 AM <russell.spit...@gmail.com> wrote:
> Sounds like a reasonable thing to add? Maybe we could check cardinality to
> pick out the default order as well?
>
> Sent from my iPhone
>
> On Jan 30, 2024, at 3:50 PM, Jack Ye <yezhao...@gmail.com> wrote:
>
> Hi everyone,
>
> Today, the rewrite manifest procedure always orders the data files based
> on their *data_file.partition* value. Specifically, it sorts data files
> that have the same partition value, and then does a repartition by range
> based on the target number of manifest files (ref
> <https://github.com/apache/iceberg/blob/main/spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/actions/RewriteManifestsSparkAction.java#L257-L258>
> ).
>
> I notice that this approach does not always yield the best performance
> for scan planning, because the resulting manifest entry order basically
> follows the default order of the partition columns.
>
> For example, consider a table partitioned by columns a and b. By default
> the rewrite procedure will organize manifest entries based on column a
> and then b. If most of my queries use b as the predicate, rewriting
> manifests by sorting first on column b and then on a will yield a much
> shorter scan planning time, because all manifest entries with similar b
> values are close together, and the manifest list can already be used to
> prune many files without opening the manifest files.
>
> This happens a lot in cases where b is an event-time timestamp column
> that is not the first partition column, but is actually the column most
> frequently used as a predicate in queries.
>
> Translated to code, this means we can benefit from something like:
>
> SparkActions.rewriteManifests(table)
>     .sort("b", "a")
>     .commit()
>
> Any thoughts?
>
> Best,
> Jack Ye
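
For concreteness, here is a minimal end-to-end sketch of how the proposed
call might look. The sort("b", "a") method is the hypothetical API under
discussion in this thread; the surrounding calls (SparkActions.get(),
rewriteManifests(table), execute(), Spark3Util.loadIcebergTable) exist in
Iceberg today, and "db.events" is just a placeholder name for a table
partitioned by a and b:

// Sketch only: sort("b", "a") is the API proposed in this thread and does
// not exist yet; everything else uses the current SparkActions API.
import org.apache.iceberg.Table;
import org.apache.iceberg.actions.RewriteManifests;
import org.apache.iceberg.spark.Spark3Util;
import org.apache.iceberg.spark.actions.SparkActions;
import org.apache.spark.sql.SparkSession;

public class SortedManifestRewrite {
  public static void main(String[] args) throws Exception {
    SparkSession spark = SparkSession.builder()
        .appName("sorted-manifest-rewrite")
        .getOrCreate();

    // "db.events" is a placeholder for a table partitioned by a and b.
    Table table = Spark3Util.loadIcebergTable(spark, "db.events");

    RewriteManifests.Result result =
        SparkActions.get()
            .rewriteManifests(table)
            // Hypothetical method from this proposal: cluster manifest
            // entries by b first, so the manifest-list value ranges on b
            // stay tight and planning can prune without opening manifests.
            .sort("b", "a")
            .execute();

    System.out.println("Rewritten manifests: " + result.rewrittenManifests());
    System.out.println("Added manifests: " + result.addedManifests());
  }
}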