Hi Vladimir,

I'm certain my design has room for improvement and would love any
suggestions. Here is the use case.

I'm working on Dask-SQL [1]. We wrap Calcite with a Python layer and use
Calcite to parse and validate SQL and to generate relational algebra. We
then convert the generated relational algebra into Dask Python (and
therefore DataFrame) API calls. Leaving out a lot of detail, this is, in a
nutshell, the order of what happens:

1.) Parse the SQL Python str into a SqlNode
2.) Generate a RelNode from the SqlNode
3.) Convert each RexNode into a Python Pandas/cuDF DataFrame; this is the
step where I want to get the original SQL identifier
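
To make the goal of step 3 concrete, here is a minimal Python sketch of the
lookup I have in mind. It assumes that the field names of the node's input
are available as an ordered list and that the reference prints as "$" followed
by its index. The names resolve_input_ref and employee_fields are hypothetical
and are not part of Calcite or Dask-SQL.

```python
def resolve_input_ref(ref, field_names):
    """Map an ordinal reference such as "$2" to a column name.

    `field_names` is assumed to be the ordered list of column names of the
    input that the reference points into.
    """
    if not ref.startswith("$"):
        raise ValueError(f"not an input reference: {ref!r}")
    index = int(ref[1:])
    return field_names[index]


# Hypothetical field list for the `employees` table, for illustration only.
employee_fields = ["id", "lname", "fname"]

# "$2" is the third field of the input, which is `fname` in this list.
name = resolve_input_ref("$2", employee_fields)
```

The lookup is only meaningful relative to a specific input: the same ordinal
can refer to a different column once a projection or join has reordered the
fields.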

For step 3, large performance gains can be achieved by using "predicate
pushdown" in the IO readers, for example by reading only certain columns
from a Parquet or ORC file. The readers expect those predicates in DNF
(disjunctive normal form), expressed with the original column names, so
that they can be passed down into the implementation libraries. The problem
is that those libraries already exist as CUDA C/C++ implementations and
cannot be modified.
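
As an illustration of the DNF shape, here is a small pure-Python sketch. It
assumes the common list-of-lists convention, where the outer list is an OR of
conjunctions and each inner list is an AND of (column, op, value) tuples. The
matches helper, the OPS table, and the sample rows are hypothetical and only
show the semantics; the real readers apply the filters while reading the file.

```python
import operator

# Comparison operators supported by this sketch (an assumed subset).
OPS = {
    "==": operator.eq,
    "!=": operator.ne,
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
}


def matches(row, dnf_filters):
    """Return True if `row` (a dict) satisfies the DNF filters.

    Outer list: OR of conjunctions. Inner list: AND of (column, op, value).
    """
    return any(
        all(OPS[op](row[col], val) for col, op, val in conjunction)
        for conjunction in dnf_filters
    )


# WHERE fname = 'adam' OR (id > 10 AND fname != 'bob')
filters = [
    [("fname", "==", "adam")],
    [("id", ">", 10), ("fname", "!=", "bob")],
]

# Made-up rows, for illustration only.
rows = [
    {"id": 1, "fname": "adam"},
    {"id": 20, "fname": "carol"},
    {"id": 30, "fname": "bob"},
    {"id": 2, "fname": "dave"},
]

selected = [r["id"] for r in rows if matches(r, filters)]
```

Every tuple names a column by its string name, which is why an ordinal such
as $11 is not enough on its own.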

Does that make sense? If there is a more intelligent way to extract the
conditional predicates from the SQL query, even if it isn't at the Rex
level, I would love to hear suggestions.

[1] - https://github.com/dask-contrib/dask-sql

On Tue, Dec 21, 2021 at 1:05 PM Vladimir Ozerov <ppoze...@gmail.com> wrote:

> Hi Jeremy,
>
> Could you please share the use case behind this requirement? In the general
> case, it is not possible to link RelNode's attributes to specific
> identifiers. For this reason, an attempt to extract such identifier from
> any "rel" except for the RelRoot might indicate a design issue.
>
> Regards,
> Vladimir.
>
> On Tue, Dec 21, 2021 at 20:34, Jeremy Dyer <jdy...@gmail.com> wrote:
>
> > Hello,
> >
> > Is it possible to get the original SQL identifier from an instance of
> > RexInputRef? For example given a simple query like
> >
> > SELECT id FROM employees WHERE fname = 'adam'
> >
> > Instead of the ordinal name generated by RexInputRef ($11, for example),
> > I would like to find the original SQL identifier (fname, for example).
> >
> > Thanks,
> > Jeremy Dyer
> >
>
