Re: [C++] output field names in Arrow Substrait

2022-04-23 Thread Jacques Nadeau
ost cases, assuming the > (Substrait-consumer-side) reproduced column-names are natural/standardized. > For example, in the cast_table example, the user would not need to specify > "value1" and "value2" as output field-names for cast_table because these > natural names

Re: [C++] output field names in Arrow Substrait

2022-04-20 Thread Weston Pace
ample, in the cast_table example, the user would not need to specify > "value1" and "value2" as output field-names for cast_table because these > natural names would be the default. > > > Yaron. > ____ > From: Jeroen van Stra

Re: [C++] output field names in Arrow Substrait

2022-04-20 Thread Yaron Gvili
From: Jeroen van Straten Sent: Wednesday, April 20, 2022 11:54 AM To: dev@arrow.apache.org Subject: Re: [C++] output field names in Arrow Substrait > > The names at the relation root are a different story, since they specify > > the column names for the produced table, but t

Re: [C++] output field names in Arrow Substrait

2022-04-20 Thread Jeroen van Straten
>>> key > >>> > > that > >>> > > > exists in all tables. This is a use case where the key should > (not > >>> > must) > >>> > > be > >>> > > > specified by name for convenience. > >>> > > > In b

Re: [C++] output field names in Arrow Substrait

2022-04-20 Thread Li Jin
ying >>> > > > a string-name as a parameter to a relation that would later get >>> passed >>> > to >>> > > > an Arrow execution node in an options instance and used to >>> dynamically >>> > > set >>> > > > up field-names. This strin

Re: [C++] output field names in Arrow Substrait

2022-04-20 Thread Li Jin
> > > > in the Substrait plan are output field-names of intermediate >> relations; >> > > > this in itself is not a problem, because these field-names can (in >> > > > principle) be recomputed to allow them to be matched to the >> dynamically >> > > >

Re: [C++] output field names in Arrow Substrait

2022-04-20 Thread Li Jin
ts Substrait module) ends up with non-natural field-names like > > > > "FieldPath(1)" that fail to be matched. > > > > > > > > That's why this is not a Substrait specific problem, but one that is > > > > related to Ibis, Ibis-Substrait, and Arr

Re: [C++] output field names in Arrow Substrait

2022-04-20 Thread Phillip Cloud
roblem, but one that is > > > related to Ibis, Ibis-Substrait, and Arrow together. We think it can be > > > resolved either by changing Substrait to assist Arrow by specifying the > > > field-names of intermediate relations in the plan, so that they are >

Re: [C++] output field names in Arrow Substrait

2022-04-20 Thread Jeroen van Straten
schemata before options and node instances are created. We > > might come up with other solutions in this discussion, of course. Either > > way, for this second use case, the field-names of intermediate relations > > must be natural; they cannot be left as something like "Fi

Re: [C++] output field names in Arrow Substrait

2022-04-20 Thread Phillip Cloud
Either > way, for this second use case, the field-names of intermediate relations > must be natural; they cannot be left as something like "FieldPath(1)". > > > Yaron. > ____ > From: Weston Pace > Sent: Tuesday, April 19, 2022 7:12 PM

Re: [C++] output field names in Arrow Substrait

2022-04-20 Thread Yaron Gvili
field-names of intermediate relations must be natural; they cannot be left as something like "FieldPath(1)". Yaron. From: Weston Pace Sent: Tuesday, April 19, 2022 7:12 PM To: dev@arrow.apache.org Cc: Li Jin Subject: Re: [C++] output field names in Arrow

Re: [C++] output field names in Arrow Substrait

2022-04-19 Thread Weston Pace
user writing an Ibis expression and the remaining steps - > Substrait compilation, serialization to protobuf, deserialization to an > Arrow plan, and execution of the plan - are done systematically. > > > Yaron. > > From: Weston Pace > Sent: Tuesday, April 19, 2022

Re: [C++] output field names in Arrow Substrait

2022-04-19 Thread Yaron Gvili
___ From: Weston Pace Sent: Tuesday, April 19, 2022 1:01 PM To: dev@arrow.apache.org Cc: Li Jin Subject: Re: [C++] output field names in Arrow Substrait Hi Yaron, I think you might have forgotten the links for [1][2][3] so I'm not entirely sure of the context. Are you going from Sub

Re: [C++] output field names in Arrow Substrait

2022-04-19 Thread Weston Pace
Hi Yaron, I think you might have forgotten the links for [1][2][3] so I'm not entirely sure of the context. Are you going from Substrait to an Arrow execution plan? Or are you going from an Arrow execution plan to Substrait? For Substrait -> Arrow most of our execution nodes should take in a Fie

[C++] output field names in Arrow Substrait

2022-04-19 Thread Yaron Gvili
Hi, We ran into an issue due to the fact that, for intermediate relations, Substrait neither automatically computes output field names nor allows one to explicitly name output fields [1]. This leads to trouble when one needs to refer to these output fields by name [2]. We run into this trouble