Sounds like a reasonable thing to add. Maybe we could also check column cardinality to pick the default sort order?
Sent from my iPhone

On Jan 30, 2024, at 3:50 PM, Jack Ye <yezhao...@gmail.com> wrote:


Hi everyone,

Today, the rewrite manifests procedure always orders data files by their data_file.partition value. Specifically, it sorts the entries so that data files with the same partition value are grouped together, and then repartitions by range based on the target number of manifest files (ref).
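For reference, invoking the existing action looks roughly like this (a minimal sketch; loading the table from a catalog is elided):

import org.apache.iceberg.Table;
import org.apache.iceberg.spark.actions.SparkActions;

// Current behavior: the rewritten manifests cluster entries by the
// partition columns in their declared order.
void rewriteManifests(Table table) {
  SparkActions.get()
      .rewriteManifests(table)
      .execute();
}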

I've noticed that this approach does not always yield the best scan planning performance, because the resulting manifest entry order simply follows the declared order of the partition columns.

For example, consider a table partitioned by columns a and b. By default, the rewrite procedure organizes manifest entries by column a and then b. If most of my queries use b as the predicate, rewriting manifests by sorting first on column b and then on a yields a much shorter scan planning time: all manifest entries with similar b values sit close together, so the manifest list alone can prune many files without even opening the manifest files.
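To make that concrete, here is a minimal sketch of such a layout using the Iceberg Java API (the schema and field ids are made up for illustration):

import org.apache.iceberg.PartitionSpec;
import org.apache.iceberg.Schema;
import org.apache.iceberg.types.Types;

// A table partitioned by a first and then by b (an event time),
// even though most queries filter only on b.
Schema schema = new Schema(
    Types.NestedField.required(1, "a", Types.IntegerType.get()),
    Types.NestedField.required(2, "b", Types.TimestampType.withZone()),
    Types.NestedField.optional(3, "payload", Types.StringType.get()));
PartitionSpec spec = PartitionSpec.builderFor(schema)
    .identity("a")
    .day("b")
    .build();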

This happens a lot when b is an event time timestamp column: it is not the first partition column, but it is actually the column queried most frequently.

Translated to code, this means we can benefit from something like:

SparkActions.get()
    .rewriteManifests(table)
    .sort("b", "a")   // proposed API: cluster entries by b first, then a
    .execute();
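
Under the hood, one way the new sort could work is to order the manifest-entry dataset by the requested partition fields before range-partitioning into the target number of manifests. A rough sketch (loadManifestEntries and writeManifests are hypothetical helpers, not the action's real internals):

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;

int targetNumManifests = 8;  // example target manifest count

// Hypothetical helper: read all live manifest entries as rows.
Dataset<Row> entries = loadManifestEntries(table);

// Range-partition by b first, then a, so entries with similar
// b values land in the same output manifest.
Dataset<Row> clustered = entries.repartitionByRange(
    targetNumManifests,
    entries.col("data_file.partition.b"),
    entries.col("data_file.partition.a"));

// Hypothetical helper: write one manifest per Spark partition
// and commit the replacement.
writeManifests(table, clustered);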

Any thoughts?

Best,
Jack Ye
