Hi Andrew,

Yes This aligned also with my debugging. In My Kenn's reply you can see a
sql test which I wrote in Calcite. Somehow Calcite does not have this issue
with the 1.28 version.

!use post
!set outputformat mysql

#Test aliases with with clause
WITH tempTable(id, v) AS (select "hr"."emps"."empid" as id,
"hr"."emps"."name" as v from "hr"."emps")
SELECT tempTable.id as id, tempTable.v as "value" FROM tempTable WHERE
tempTable.v <> '11' ;
+-----+-----------+
| ID  | value     |
+-----+-----------+
| 100 | Bill      |
| 110 | Theodore  |
| 150 | Sebastian |
| 200 | Eric      |
+-----+-----------+
(4 rows)

!ok


On Wed, Jan 25, 2023 at 6:08 PM Andrew Pilloud <apill...@google.com> wrote:

> Yes, that worked.
>
> The issue does not occur if I disable all of the following planner rules:
> CoreRules.FILTER_CALC_MERGE, CoreRules.PROJECT_CALC_MERGE,
> LogicalCalcMergeRule.INSTANCE (which wraps CoreRules.CALC_MERGE),
> and BeamCalcMergeRule.INSTANCE (which wraps CoreRules.CALC_MERGE).
>
> All the rules share a common call to RexProgramBuilder.mergePrograms, so I
> suspect the problem lies there. I spent some time looking but wasn't able
> to find it by code inspection, it looks like this code path is doing the
> right thing with names. I'll spend some time tomorrow trying to reproduce
> this on pure Calcite.
>
> Andrew
>
>
> On Tue, Jan 24, 2023 at 8:24 PM Talat Uyarer <tuya...@paloaltonetworks.com>
> wrote:
>
>> Hi Andrew,
>>
>> Thanks for writing a test for this use case. Without Where clause it
>> works as expected on our test cases also too. Please add where clause on
>> second select. With the below query it does not return column names. I
>> tested on my local also.
>>
>> WITH tempTable (id, v) AS (SELECT f_int as id, f_string as v FROM
>> PCOLLECTION) SELECT id AS fout_int, v AS fout_string FROM tempTable WHERE
>> id > 1
>>
>> Thanks
>>
>> On Tue, Jan 24, 2023 at 5:28 PM Andrew Pilloud <apill...@google.com>
>> wrote:
>>
>>> +d...@beam.apache.org <d...@beam.apache.org>
>>>
>>> I tried reproducing this but was not successful, the output schema was
>>> as expected. I added the following to BeamSqlMultipleSchemasTest.java at
>>> head. (I did discover that  PAssert.that(result).containsInAnyOrder(output)
>>> doesn't validate column names however.)
>>>
>>>   @Test
>>>   public void testSelectAs() {
>>>     PCollection<Row> input = pipeline.apply(create(row(1, "strstr")));
>>>
>>>     PCollection<Row> result =
>>>         input.apply(SqlTransform.query("WITH tempTable (id, v) AS
>>> (SELECT f_int as id, f_string as v FROM PCOLLECTION) SELECT id AS fout_int,
>>> v AS fout_string FROM tempTable"));
>>>
>>>     Schema output_schema =
>>>
>>> Schema.builder().addInt32Field("fout_int").addStringField("fout_string").build();
>>>     assertThat(result.getSchema(), equalTo(output_schema));
>>>
>>>     Row output = Row.withSchema(output_schema).addValues(1,
>>> "strstr").build();
>>>     PAssert.that(result).containsInAnyOrder(output);
>>>     pipeline.run();
>>>   }
>>>
>>> On Tue, Jan 24, 2023 at 8:13 AM Talat Uyarer <
>>> tuya...@paloaltonetworks.com> wrote:
>>>
>>>> Hi Kenn,
>>>>
>>>> Thank you for replying back to my email.
>>>>
>>>> I was under the same impression about Calcite. But I wrote a test on
>>>> Calcite 1.28 too. It is working without issue that I see on BEAM
>>>>
>>>> Here is my test case. If you want you can also run on Calcite. Please
>>>> put under core/src/test/resources/sql as text file. and Run CoreQuidemTest
>>>> class.
>>>>
>>>> !use post
>>>> !set outputformat mysql
>>>>
>>>> #Test aliases with with clause
>>>> WITH tempTable(id, v) AS (select "hr"."emps"."empid" as id, 
>>>> "hr"."emps"."name" as v from "hr"."emps")
>>>> SELECT tempTable.id as id, tempTable.v as "value" FROM tempTable WHERE 
>>>> tempTable.v <> '11' ;
>>>> +-----+-----------+
>>>> | ID  | value     |
>>>> +-----+-----------+
>>>> | 100 | Bill      |
>>>> | 110 | Theodore  |
>>>> | 150 | Sebastian |
>>>> | 200 | Eric      |
>>>> +-----+-----------+
>>>> (4 rows)
>>>>
>>>> !ok
>>>>
>>>>
>>>> On Mon, Jan 23, 2023 at 10:16 AM Kenneth Knowles <k...@apache.org>
>>>> wrote:
>>>>
>>>>> Looking at the code that turns a logical CalcRel into a BeamCalcRel I
>>>>> do not see any obvious cause for this:
>>>>> https://github.com/apache/beam/blob/b3aa2e89489898f8c760294ba4dba2310ac53e70/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rule/BeamCalcRule.java#L69
>>>>> <https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_apache_beam_blob_b3aa2e89489898f8c760294ba4dba2310ac53e70_sdks_java_extensions_sql_src_main_java_org_apache_beam_sdk_extensions_sql_impl_rule_BeamCalcRule.java-23L69&d=DwMFaQ&c=V9IgWpI5PvzTw83UyHGVSoW3Uc1MFWe5J8PTfkrzVSo&r=BkW1L6EF7ergAVYDXCo-3Vwkpy6qjsWAz7_GD7pAR8g&m=KXc2qSceL6qFbFnQ_2qUOHr9mKuc6zYY8rJTNZC8p_wTcNs4M6mHQoCuoc4JfeaA&s=KjzplEf29oFB6uivvdjixpQiArWtfV-1SXpALL-ugEM&e=>
>>>>>
>>>>> I don't like to guess that upstream libraries have the bug, but in
>>>>> this case I wonder if the alias is lost in the Calcite optimizer rule for
>>>>> merging the projects and filters into a Calc.
>>>>>
>>>>> Kenn
>>>>>
>>>>> On Mon, Jan 23, 2023 at 10:13 AM Kenneth Knowles <k...@apache.org>
>>>>> wrote:
>>>>>
>>>>>> I am not sure I understand the question, but I do see an issue.
>>>>>>
>>>>>> Context: "CalcRel" is an optimized relational operation that is
>>>>>> somewhat like ParDo, with a small snippet of a single-assignment DSL
>>>>>> embedded in it. Calcite will choose to merge all the projects and filters
>>>>>> into the node, and then generates Java bytecode to directly execute the 
>>>>>> DSL.
>>>>>>
>>>>>> Problem: it looks like the CalcRel has output columns with aliases
>>>>>> "id" and "v" where it should have output columns with aliases "id" and
>>>>>> "value".
>>>>>>
>>>>>> Kenn
>>>>>>
>>>>>> On Thu, Jan 19, 2023 at 6:01 PM Ahmet Altay <al...@google.com> wrote:
>>>>>>
>>>>>>> Adding: @Andrew Pilloud <apill...@google.com> @Kenneth Knowles
>>>>>>> <k...@google.com>
>>>>>>>
>>>>>>> On Thu, Jan 12, 2023 at 12:31 PM Talat Uyarer via user <
>>>>>>> user@beam.apache.org> wrote:
>>>>>>>
>>>>>>>> Hi All,
>>>>>>>>
>>>>>>>> I am using Beam 2.43 with Calcite SQL with Java.
>>>>>>>>
>>>>>>>> I have a query with a WITH clause and some aliasing. Looks like
>>>>>>>> Beam Query optimizer after optimizing my query, it drops Select 
>>>>>>>> statement's
>>>>>>>> aliases. Can you help me to identify where the problem is ?
>>>>>>>>
>>>>>>>> This is my query
>>>>>>>> INFO: SQL:
>>>>>>>> WITH `tempTable` (`id`, `v`) AS (SELECT
>>>>>>>> `PCOLLECTION`.`f_nestedRow`.`f_nestedInt` AS `id`,
>>>>>>>> `PCOLLECTION`.`f_nestedRow`.`f_nestedString` AS `v`
>>>>>>>> FROM `beam`.`PCOLLECTION` AS `PCOLLECTION`) (SELECT
>>>>>>>> `tempTable`.`id` AS `id`, `tempTable`.`v` AS `value`
>>>>>>>> FROM `tempTable` AS `tempTable`
>>>>>>>> WHERE `tempTable`.`v` <> '11')
>>>>>>>>
>>>>>>>> This is Calcite Plan look at LogicalProject(id=[$0], value=[$1]) in
>>>>>>>> SQL plan.
>>>>>>>>
>>>>>>>> Jan 12, 2023 12:19:08 PM
>>>>>>>> org.apache.beam.sdk.extensions.sql.impl.CalciteQueryPlanner 
>>>>>>>> convertToBeamRel
>>>>>>>> INFO: SQLPlan>
>>>>>>>> LogicalProject(id=[$0], value=[$1])
>>>>>>>>   LogicalFilter(condition=[<>($1, '11')])
>>>>>>>>     LogicalProject(id=[$1.f_nestedInt], v=[$1.f_nestedString])
>>>>>>>>       BeamIOSourceRel(table=[[beam, PCOLLECTION]])
>>>>>>>>
>>>>>>>> But Beam Plan does not have a LogicalProject(id=[$0], value=[$1])
>>>>>>>> or similar.
>>>>>>>>
>>>>>>>> Jan 12, 2023 12:19:08 PM
>>>>>>>> org.apache.beam.sdk.extensions.sql.impl.CalciteQueryPlanner 
>>>>>>>> convertToBeamRel
>>>>>>>> INFO: BEAMPlan>
>>>>>>>> BeamCalcRel(expr#0..1=[{inputs}], expr#2=[$t1.f_nestedInt],
>>>>>>>> expr#3=[$t1.f_nestedString], expr#4=['11':VARCHAR], expr#5=[<>($t3, 
>>>>>>>> $t4)],
>>>>>>>> id=[$t2], v=[$t3], $condition=[$t5])
>>>>>>>>   BeamIOSourceRel(table=[[beam, PCOLLECTION]])
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks
>>>>>>>>
>>>>>>>

Reply via email to