[
https://issues.apache.org/jira/browse/CALCITE-6372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17856355#comment-17856355
]
Mihai Budiu commented on CALCITE-6372:
--------------------------------------
Here is some relevant discussion from the DEV list:
I would regard this as two separate but related things: a new SQL syntax for
joins, and a new relational operator. It is definitely worth keeping them
separate; the operator will not map 1-1 to the syntax, may require its
input to be sorted, and of course we would want queries to be able to use the
operator even if they didn’t use the syntax.
The relational operator can have physical implementations in various calling
conventions. Or even flags extending existing algorithms (e.g. add a
‘keepAtMostOneOnLeft’ flag to EnumerableMergeJoin).
Regarding whether to represent the operator as a subclass of Join or just a
subclass of BiRel: I recommend making it a subclass of Join, but we have to
take care that rewrite rules and metadata rules designed to apply to regular
joins do not accidentally apply to these joins. We’ve already done that with
semi-join, so it shouldn’t be too hard to follow those breadcrumbs.
I recently read “The Complete Story of Joins (in HyPer)” [1], which contains some
other interesting and useful join variants: dependent join and mark join. We
should consider adding these as relational operators, in the same way that we
add asof-join.
Julian
[1]
http://btw2017.informatik.uni-stuttgart.de/slidesandpapers/F1-10-37/paper_web.pdf
> Support ASOF joins
> ------------------
>
> Key: CALCITE-6372
> URL: https://issues.apache.org/jira/browse/CALCITE-6372
> Project: Calcite
> Issue Type: New Feature
> Components: core
> Affects Versions: 1.36.0
> Reporter: Mihai Budiu
> Priority: Minor
>
> It seems that this new kind of JOIN, named ASOF, is very useful for processing
> time-series data. Here is some example documentation from Snowflake:
> https://docs.snowflake.com/en/sql-reference/constructs/asof-join
> The semantics are similar to a traditional join, but each record from the
> left side appears at most once in the result, paired with the last matching
> record (in time) from the right side, where "time" is any value that can be
> compared for inequality. This can be expressed in SQL, but it looks very
> cumbersome, using a JOIN, a GROUP BY, and then an aggregation to keep the
> last value.
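
To make the semantics described above concrete, here is a minimal Python
sketch. It is not Calcite code. The function name asof_join, the
(key, time, payload) row layout, the use of "<=" as the match condition, and
the choice to drop left rows that have no match are all assumptions made for
illustration; real dialects differ on the last two points.

```python
from bisect import bisect_right
from collections import defaultdict


def asof_join(left, right):
    """Illustrative ASOF join over rows shaped (key, time, payload).

    Assumptions (not taken from the issue): the match condition is
    right.time <= left.time, and unmatched left rows are dropped.
    """
    # Group right-side rows by key, then sort each group by time.
    rows_by_key = defaultdict(list)
    for key, ts, payload in right:
        rows_by_key[key].append((ts, payload))
    times_by_key = {}
    for key, rows in rows_by_key.items():
        rows.sort(key=lambda r: r[0])
        times_by_key[key] = [r[0] for r in rows]

    result = []
    for key, ts, payload in left:
        times = times_by_key.get(key, [])
        # Index just past the last right row whose time is <= ts.
        i = bisect_right(times, ts)
        if i > 0:
            r_ts, r_payload = rows_by_key[key][i - 1]
            # Each left row contributes at most one output row.
            result.append((key, ts, payload, r_ts, r_payload))
    return result


trades = [("A", 10, "t1"), ("A", 5, "t2"), ("B", 7, "t3"), ("A", 1, "t4")]
quotes = [("A", 3, 100), ("A", 8, 101), ("A", 12, 102), ("B", 9, 200)]
matched = asof_join(trades, quotes)
```

Sorting the right side once per key and binary-searching it mirrors the
remark above that the operator may require sorted input; a merge-based
variant over two sorted inputs would be another option.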