gene-bordegaray commented on code in PR #22207:
URL: https://github.com/apache/datafusion/pull/22207#discussion_r3261514072
##########
datafusion/physical-expr/src/partitioning.rs:
##########
@@ -133,13 +136,225 @@ impl Display for Partitioning {
.join(", ");
write!(f, "Hash([{phy_exprs_str}], {size})")
}
+ Partitioning::Range(range) => write!(f, "{range}"),
Partitioning::UnknownPartitioning(size) => {
write!(f, "UnknownPartitioning({size})")
}
}
}
}
+/// Physical range partitioning.
+///
+/// [`RangePartitioning`] describes range bounds over one or more physical
+/// expressions. Each [`RangePartition`] represents one output partition and
must
+/// contain exactly one [`RangeInterval`] for each partition expression.
+///
+/// The source declaring this partitioning is responsible for ensuring that,
for
+/// every emitted row, the row belongs to exactly one partition and is emitted
by
+/// that partition. The declared ranges do not need to cover values that the
plan
+/// cannot emit.
+///
+/// Each lower and upper bound explicitly records whether it is inclusive.
+/// Unbounded sides are represented with `None`, bound values should be
non-null
+/// until null routing semantics are defined.
+///
+/// For example, a scan can declare date and city range partitions as:
+///
+/// ```text
+/// exprs = [date, city]
+///
+/// partition 0:
+/// date in [2021-01-01, 2022-01-01)
+/// city in [Allston, Boston)
+///
+/// partition 1:
+/// date in [2021-01-01, 2022-01-01)
+/// city in [Boston, NYC)
+/// ```
Review Comment:
This is supposed to be multi-dimensional. For example we are partitoned on
independent `id`, `time` keys thus this would accurately represent our layout.
I do see what you are saying about the join needing a single key, which will
work for our case as well. But maybe this can start as single dimension with
compound keys and extend if the use case arises to avoid complexity?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]